`_regmap_update_bits()` checks if the current register value differs
from the new value, and only writes to the register if they differ. When
testing hardware drivers, it might be desirable to always force a
register write, for example when writing to a `regmap_field`. This
enables and simplifies testing and verification of the hardware
interaction. For example, when using a hardware mock/simulation model,
one can then more easily verify that the driver makes the correct
expected register writes during certain events.
Add a bool variable `force_write_field` and a corresponding debugfs
entry to enable this. Since this feature could interfere with driver
operation, guard it with a macro.
Signed-off-by: Waqar Hameed <waqar.hameed@axis.com>
Link: https://lore.kernel.org/r/pnd1qifa7sj.fsf@axis.com
Signed-off-by: Mark Brown <broonie@kernel.org>
regcache_maple_sync() tries to sync all cached values no matter
whether it's writable or not. OTOH, regache_sync_val() does care the
wrtability and returns -EIO for a read-only register. This results in
an error message like:
snd_hda_codec_realtek hdaudioC0D0: Unable to sync register 0x2f0009. -5
and the sync loop is aborted incompletely.
This patch adds the writable register check to regcache_sync_val() for
addressing the bug above.
Note that, although we may add the check in the caller side
(regcache_maple_sync()), here we put in regcache_sync_val(), so that a
similar case like this can be avoided in future.
Fixes: f033c26de5 ("regmap: Add maple tree based register cache")
Link: https://lore.kernel.org/r/877cs7g6f1.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20230613112240.3361-1-tiwai@suse.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Mark Brown <broonie@kernel.org>:
Our existing coverage only deals with buses that provide single register
read and write operations, extend it to cover raw buses using a similar
approach with a RAM backed register map that the tests can inspect to
check operations. This coverage could be more complete but provides a
good start.
The only user of regcache_set_val() ignores the return value so we may as
well not bother checking if the value we are trying to set is the same as
the value already stored.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230609-regcache-set-val-no-ret-v1-1-9a6932760cf8@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
For register maps where we can write multiple values in a single bus
operation it is generally much faster to do so. Improve the performance of
maple tree cache syncs on such devices by identifying blocks of adjacent
registers that need to be written out and combining them into a single
operation.
Combining writes does mean that we need to allocate a scratch buffer and
format the data into it but it is expected that for most cases where caches
are in use the cost of I/O will be much greater than the cost of doing the
allocation and format.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230609-regcache-maple-sync-raw-v1-1-8ddeb4e2b9ab@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmSGPiIeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGxWcH/AoUClfYYbYZYaoI
spQiFQiZEdniE3W/36GKtfJok4ur1Aogb/q99KvfQxFGkH+aLbTgXxJlTqrlmfWk
Kj4uVUecQl2OHOuEM6AVFyK6S4xdTjWS7TKMGIVe4LDQCAsFidgcYbcVrAhQ+s1s
eAOhlfrKMzVgRQ0KsJkn60SoSbVP6RyRfu+QQu24GfNtD8SzN8y0adB1PYXGVWmr
WW8+MqfZ1WIkn+NWW5bn3AEz8AYjQC66EvcWhlAWmyyBuUjIoNhpIgPHp6Vm7s+a
oPvw/VXwDX8NzpN0XYI+eVpsgClWtJ+HUcCSeuExTxrWu7Bewp6MMLeCjvi/lPpe
zKRDgnw=
=hx4k
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmSHIsUACgkQJNaLcl1U
h9AbDQf+OnspdI5uW+X7ObBdB7KvPAXzDd7rMsXOTdp13bciSmaOzl74f0jPNWYU
GVo3a84eIU0YsdBP7M9+42vyUDQctgoBa7yyxZ+hW8gZQbXwgOZGmvWMncankm9n
hTlUbuLiowPzzjLERMDS0T2iJF0aFG7eB8Xghk/Su38HOCJI6fBx7sQlvt+aFK6J
/WzUuAqHng+6T7FHqxerGz/XIfvnRdS8qC9FZ7ySurmopn2AGY4YlifSGmSlPuFt
MM+nGNTOYn8scBAPRQlVCOwv7XOg6XqEvP5UNe8KpKN1rHfgP/B0X3ZbyE58bpPI
lqZzTXmY/dWD27FUc+3779EGLZ+L8w==
=PyzM
-----END PGP SIGNATURE-----
regmap: Merge up v6.4-rc6
The fix for maple tree RCU locking on sync is a dependency for the
block sync code for the maple tree.
Simple tests that cover basic raw I/O, plus basic coverage of cache sync
since the caches generate bulk I/O with raw register maps. This could be
more comprehensive but it is good for testing generic code.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230610-regcache-raw-kunit-v1-2-583112cd28ac@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
UEFI Specification version 2.9 introduces the concept of memory
acceptance. Some Virtual Machine platforms, such as Intel TDX or AMD
SEV-SNP, require memory to be accepted before it can be used by the
guest. Accepting happens via a protocol specific to the Virtual Machine
platform.
There are several ways the kernel can deal with unaccepted memory:
1. Accept all the memory during boot. It is easy to implement and it
doesn't have runtime cost once the system is booted. The downside is
very long boot time.
Accept can be parallelized to multiple CPUs to keep it manageable
(i.e. via DEFERRED_STRUCT_PAGE_INIT), but it tends to saturate
memory bandwidth and does not scale beyond the point.
2. Accept a block of memory on the first use. It requires more
infrastructure and changes in page allocator to make it work, but
it provides good boot time.
On-demand memory accept means latency spikes every time kernel steps
onto a new memory block. The spikes will go away once workload data
set size gets stabilized or all memory gets accepted.
3. Accept all memory in background. Introduce a thread (or multiple)
that gets memory accepted proactively. It will minimize time the
system experience latency spikes on memory allocation while keeping
low boot time.
This approach cannot function on its own. It is an extension of #2:
background memory acceptance requires functional scheduler, but the
page allocator may need to tap into unaccepted memory before that.
The downside of the approach is that these threads also steal CPU
cycles and memory bandwidth from the user's workload and may hurt
user experience.
Implement #1 and #2 for now. #2 is the default. Some workloads may want
to use #1 with accept_memory=eager in kernel command line. #3 can be
implemented later based on user's demands.
Support of unaccepted memory requires a few changes in core-mm code:
- memblock accepts memory on allocation. It serves early boot memory
allocations and doesn't limit them to pre-accepted pool of memory.
- page allocator accepts memory on the first allocation of the page.
When kernel runs out of accepted memory, it accepts memory until the
high watermark is reached. It helps to minimize fragmentation.
EFI code will provide two helpers if the platform supports unaccepted
memory:
- accept_memory() makes a range of physical addresses accepted.
- range_contains_unaccepted_memory() checks anything within the range
of physical addresses requires acceptance.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mike Rapoport <rppt@linux.ibm.com> # memblock
Link: https://lore.kernel.org/r/20230606142637.5171-2-kirill.shutemov@linux.intel.com
bool is the most sensible return value for a yes/no return. Also
add __init as this funtion is only called from the early boot code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230531125535.676098-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
resolve a number of reported issues. Included in here are:
- iio driver fixes
- fpga driver fixes
- test_firmware bugfixes
- fastrpc driver tiny bugfixes
- MAINTAINERS file updates for some subsystems
All of these have been in linux-next this past week with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZHxDNg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl1ywCg0uz+E/GYKx5cP9chPFmbbaFwxH4AnRpn/kIH
xz6nbAqSf7CBbtxmED11
=4J1c
-----END PGP SIGNATURE-----
Merge tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a bunch of tiny char/misc/other driver fixes for 6.4-rc5 that
resolve a number of reported issues. Included in here are:
- iio driver fixes
- fpga driver fixes
- test_firmware bugfixes
- fastrpc driver tiny bugfixes
- MAINTAINERS file updates for some subsystems
All of these have been in linux-next this past week with no reported
issues"
* tag 'char-misc-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (34 commits)
test_firmware: fix the memory leak of the allocated firmware buffer
test_firmware: fix a memory leak with reqs buffer
test_firmware: prevent race conditions by a correct implementation of locking
firmware_loader: Fix a NULL vs IS_ERR() check
MAINTAINERS: Vaibhav Gupta is the new ipack maintainer
dt-bindings: fpga: replace Ivan Bornyakov maintainership
MAINTAINERS: update Microchip MPF FPGA reviewers
misc: fastrpc: reject new invocations during device removal
misc: fastrpc: return -EPIPE to invocations on device removal
misc: fastrpc: Reassign memory ownership only for remote heap
misc: fastrpc: Pass proper scm arguments for secure map request
iio: imu: inv_icm42600: fix timestamp reset
iio: adc: ad_sigma_delta: Fix IRQ issue by setting IRQ_DISABLE_UNLAZY flag
dt-bindings: iio: adc: renesas,rcar-gyroadc: Fix adi,ad7476 compatible value
iio: dac: mcp4725: Fix i2c_master_send() return value handling
iio: accel: kx022a fix irq getting
iio: bu27034: Ensure reset is written
iio: dac: build ad5758 driver when AD5758 is selected
iio: addac: ad74413: fix resistance input processing
iio: light: vcnl4035: fixed chip ID check
...
Here are 2 small driver core cacheinfo fixes for 6.4-rc5 that resolve a
number of reported issues with that file. These changes have been in
linux-next this past week with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZHxChg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykrLACeJBLCDThdooct8G/7MzfpJhFcjSYAn1/EhJDA
GxgOmZrsB1HcO3Bo587a
=Cucq
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are two small driver core cacheinfo fixes for 6.4-rc5 that
resolve a number of reported issues with that file. These changes have
been in linux-next this past week with no reported problems"
* tag 'driver-core-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers: base: cacheinfo: Update cpu_map_populated during CPU Hotplug
drivers: base: cacheinfo: Fix shared_cpu_map changes in event of CPU hotplug
The current behaviour around cache_only is slightly inconsistent,
most paths will only check cache_only if cache_bypass is false,
and will return -EBUSY if a read attempts to go to the hardware
whilst cache_only is true. However, a couple of paths will not check
cache_only at all. The most notable of these being regmap_raw_read
which will check cache_only in the case it processes the transaction
one register at a time, but not in the case it handles them as a
block. In the typical case a device has been put into cache_only
whilst powered down this can cause physical reads to happen whilst the
device is unavailable.
Add a check in regmap_raw_read and move the check in regmap_noinc_read,
adding a check for cache_bypass, such that all paths are covered and
consistent.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230601101036.1499612-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Typically handle_post_irq is going to be used to manage some
additional chip specific hardware operations required on each IRQ,
these are very likely to want the chip to be resumed. For example the
current in tree user max77620 uses this to toggle a global mask bit,
which would obviously want the device resumed. It is worth noting this
device does not specify the runtime_pm flag in regmap_irq_chip, so
there is no actual issue.
Move the callback to before the pm_runtime_put, so it will be called
whilst the device is still resumed.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230601101036.1499612-1-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Until commit 5c2712387d ("cacheinfo: Fix LLC is not exported through
sysfs"), cacheinfo called populate_cache_leaves() for CPU coming online
which let the arch specific functions handle (at least on x86)
populating the shared_cpu_map. However, with the changes in the
aforementioned commit, populate_cache_leaves() is not called when a CPU
comes online as a result of hotplug since last_level_cache_is_valid()
returns true as the cacheinfo data is not discarded. The CPU coming
online is not present in shared_cpu_map, however, it will not be added
since the cpu_cacheinfo->cpu_map_populated flag is set (it is set in
populate_cache_leaves() when cacheinfo is first populated for x86)
This can lead to inconsistencies in the shared_cpu_map when an offlined
CPU comes online again. Example below depicts the inconsistency in the
shared_cpu_list in cacheinfo when CPU8 is offlined and onlined again on
a 3rd Generation EPYC processor:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# echo 1 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8
# cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
136
# cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
9-15,136-143
Clear the flag when the CPU is removed from shared_cpu_map when
cache_shared_cpu_map_remove() is called during CPU hotplug. This will
allow cache_shared_cpu_map_setup() to add the CPU coming back online in
the shared_cpu_map. Set the flag again when the shared_cpu_map is setup.
Following are results of performing the same test as described above with
the changes:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# echo 1 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# cat /sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list
8,136
# cat /sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list
8-15,136-143
Fixes: 5c2712387d ("cacheinfo: Fix LLC is not exported through sysfs")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-3-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
While building the shared_cpu_map, check if the cache level and cache
type matches. On certain systems that build the cache topology based on
the instance ID, there are cases where the same ID may repeat across
multiple cache levels, leading inaccurate topology.
In event of CPU offlining, the cache_shared_cpu_map_remove() does not
consider if IDs at same level are being compared. As a result, when same
IDs repeat across different cache levels, the CPU going offline is not
removed from all the shared_cpu_map.
Below is the output of cache topology of CPU8 and it's SMT sibling after
CPU8 is offlined on a dual socket 3rd Generation AMD EPYC processor
(2 x 64C/128T) running kernel release v6.3:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143
CPU8 is removed from index0 (L1i) but remains in the shared_cpu_list of
index1 (L1d) and index2 (L2). Since L1i, L1d, and L2 are shared by the
SMT siblings, and they have the same cache instance ID, CPU 2 is only
removed from the first index with matching ID which is index1 (L1i) in
this case. With this fix, the results are as expected when performing
the same experiment on the same system:
# for i in /sys/devices/system/cpu/cpu8/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu8/cache/index0/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index1/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index2/shared_cpu_list: 8,136
/sys/devices/system/cpu/cpu8/cache/index3/shared_cpu_list: 8-15,136-143
# echo 0 > /sys/devices/system/cpu/cpu8/online
# for i in /sys/devices/system/cpu/cpu136/cache/index*/shared_cpu_list; do echo -n "$i: "; cat $i; done
/sys/devices/system/cpu/cpu136/cache/index0/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index1/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index2/shared_cpu_list: 136
/sys/devices/system/cpu/cpu136/cache/index3/shared_cpu_list: 9-15,136-143
When rebuilding topology, the same problem appears as
cache_shared_cpu_map_setup() implements a similar logic. Consider the
same 3rd Generation EPYC processor: CPUs in Core 1, that share the L1
and L2 caches, have L1 and L2 instance ID as 1. For all the CPUs on
the second chiplet, the L3 ID is also 1 leading to grouping on CPUs from
Core 1 (1, 17) and the entire second chiplet (8-15, 24-31) as CPUs
sharing one cache domain. This went undetected since x86 processors
depended on arch specific populate_cache_leaves() method to repopulate
the shared_cpus_map when CPU came back online until kernel release
v6.3-rc5.
Fixes: 198102c910 ("cacheinfo: Fix shared_cpu_map to handle shared caches at different levels")
Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230508084115.1157-2-kprateek.nayak@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Usage of 'attr' and 'name' in the context of a sysfs attribute
definition are confusing because those read as being related to:
struct attribute .name
Rename 'name' to 'property' in preparation for renaming 'struct
node_hmem_attr' to a more generic name that can be used in more contexts
('struct access_coordinate'), and not be confused with 'struct
attribute'.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168332213518.2189163.18377767521423011290.stgit@djiang5-mobl3
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The isa_dev->dev.platform_data is initialized with incoming
parameter isa_driver. After it isa_dev->dev.platform_data is
checked for NULL, but incoming parameter isa_driver is not
NULL since it is dereferenced many times before this check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20230517125025.434005-1-VEfanov@ispras.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The most important fix here is for missing dropping of the RCU read lock
when syncing maple tree register caches, the physical devices I have
that use the code don't do any syncing so I'd only ever tested this with
virtual devices and missed the fact that we need to drop the lock in
order to write to buses that need to sleep. Otherwise there's a fix for
an edge case when splitting up large batch writes which has been lurking
for a long time, a check to make sure nobody writes new drivers with a
bug that was found in several SoundWire drivers and a tweak to the way
the new kunit tests are enabled to ensure they don't cause regmap to be
enabled when it wouldn't otherwise be.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmR15dQACgkQJNaLcl1U
h9BGKgf+NFD98ZNPsvOoFqAdG33yTDWISTf8FtIakM074vhz0xrBTFolqf/wmBGF
8VEabgfpmLSyuGEizjK/YLDK4f143XBgivyXUH04vGeAQUx4MfBaf0DSxBlQLh4K
K3LebH5QNQqdv9Qx7xMy7Y06r3KiQUxUvM8AINukspY0pmY8LtxyGLvv8T3kSpjz
WU28umwPrNsr36TcSuowncvZmMJ7Y1kSEnNjer1cdVlJVn4cFzkl/Sashmy8Rg3g
5Qmbg6Kuc6mLetoshtsculYntoLUJeCt6oX01Iq3cxVqkNi+mpOgwMOgjFFlwtrc
fela1GXUFEKe31FKyQPE0DRKv697eg==
=iAaD
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"The most important fix here is for missing dropping of the RCU read
lock when syncing maple tree register caches, the physical devices I
have that use the code don't do any syncing so I'd only ever tested
this with virtual devices and missed the fact that we need to drop the
lock in order to write to buses that need to sleep.
Otherwise there's a fix for an edge case when splitting up large batch
writes which has been lurking for a long time, a check to make sure
nobody writes new drivers with a bug that was found in several
SoundWire drivers and a tweak to the way the new kunit tests are
enabled to ensure they don't cause regmap to be enabled when it
wouldn't otherwise be"
* tag 'regmap-fix-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: maple: Drop the RCU read lock while syncing registers
regmap: sdw: check for invalid multi-register writes config
regmap: Account for register length when chunking
regmap: REGMAP_KUNIT should not select REGMAP
Move the pm_suspend_target_state definition for CONFIG_SUSPEND
unset from the wakeup code into the headers so as to allow it
to still be used elsewhere when CONFIG_SUSPEND is not set.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
[ rjw: Changelog and subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, while calculating residency and latency values, right
operands may overflow if resulting values are big enough.
To prevent this, albeit unlikely case, play it safe and convert
right operands to left ones' type s64.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: 30f604283e ("PM / Domains: Allow domain power states to be read from DT")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently we use the normal single register write function to load the
default values into the cache, resulting in a large number of reallocations
when there are blocks of registers as we extend the memory region we are
using to store the values. Instead scan through the list of defaults for
blocks of adjacent registers and do a single allocation and insert for each
such block. No functional change.
We do not take advantage of the maple tree preallocation, this is purely at
the regcache level. It is not clear to me yet if the maple tree level would
help much here or if we'd have more overhead from overallocating and then
freeing maple tree data.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230523-regcache-maple-load-defaults-v1-1-0c04336f005d@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Unfortunately the maple tree requires us to explicitly lock it so we need
to take the RCU read lock while iterating. When syncing this means that we
end up trying to write out register values while holding the RCU read lock
which triggers lockdep issues since that is an atomic context but most
buses can't be used in atomic context. Pause the iteration and drop the
lock for each register we check to avoid this.
Reported-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Tested-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230523-regcache-maple-sync-lock-v1-1-530e4d68dfab@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
SoundWire code as it is only supports Bulk register writes and
it does not support multi-register writes.
Any drivers that set can_multi_write and use regmap_multi_reg_write() will
easily endup with programming the hardware incorrectly without any errors.
So, add this check in bus code to be able to validate the drivers config.
Fixes: 522272047d ("regmap: sdw: Remove 8-bit value size restriction")
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20230523154747.5429-1-srinivas.kandagatla@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
When class_dev_iter is initialized, the reference count for the subsys
private structure is incremented, but never decremented, causing a
memory leak over time. To resolve this, save off a pointer to the
internal structure into the class_dev_iter structure and then when the
iterator is finished, drop the reference count.
Reported-and-tested-by: syzbot+e7afd76ad060fa0d2605@syzkaller.appspotmail.com
Fixes: 7b884b7f24 ("driver core: class.c: convert to only use class_to_subsys")
Reported-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Tested-by: Mirsad Goran Todorovac <mirsad.todorovac@alu.unizg.hr>
Link: https://lore.kernel.org/r/2023051610-stove-condense-9a77@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, when regmap_raw_write() splits the data, it uses the
max_raw_write value defined for the bus. For any bus that includes
the target register address in the max_raw_write value, the chunked
transmission will always exceed the maximum transmission length.
To avoid this problem, subtract the length of the register and the
padding from the maximum transmission.
Signed-off-by: Jim Wylder <jwylder@google.com
Link: https://lore.kernel.org/r/20230517152444.3690870-2-jwylder@google.com
Signed-off-by: Mark Brown <broonie@kernel.org
Merge series from Aidan MacDonald <aidanmacdonald.0x0@gmail.com>:
This is a straightforward patch series, mostly just removing a bunch
of old features that were only used by a handful of drivers.
- 1/4 and 2/4 remove unused, deprecated functionality
- 3/4 makes the behavior of .handle_mask_sync() a bit more consistent
w.r.t. mask and unmask registers, to aid maintainability.
- 4/4 removes now-unused "inverted mask/unmask" compatibility code.
Regmap's stride is used for MMIO regmaps to check the correctness of
reg_width. However, it's acceptable to pass an empty config->reg_stride,
in that case the actual stride used is 1.
There are valid cases now to pass an empty stride, when using
down/upshifting of register address. In this case, the stride value
loses its sense, so ignore the reg_width when the stride isn't set.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com
Link: https://lore.kernel.org/r/20230511142735.316445-1-maxime.chevallier@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org
All users must now specify .mask_unmask_non_inverted = true to
ensure they are using the expected semantics: 1s disable IRQs
in the mask registers, and enable IRQs in the unmask registers.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com
Link: https://lore.kernel.org/r/20230511091342.26604-5-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org
If a .handle_mask_sync() callback is provided it supersedes all
inbuilt handling of mask registers, and judging by the commit
69af4bcaa0 ("regmap-irq: Add handle_mask_sync() callback") it
is intended to completely replace all default IRQ masking logic.
The implementation has two minor inconsistencies, which can be
fixed without breaking compatibility:
(1) mask_base must be set to enable .handle_mask_sync(), even
though mask_base is otherwise unused. This is easily fixed
because mask_base is already optional.
(2) Unmask registers aren't accounted for -- they are part of
the default IRQ masking logic and are just a bit-inverted
version of mask registers. It would be a bad idea to allow
them to be used at the same time as .handle_mask_sync(),
as the result would be confusing and unmaintainable, so
make sure this can't happen.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com
Link: https://lore.kernel.org/r/20230511091342.26604-4-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org
Enabling a (modular) test should not silently enable additional kernel
functionality, as that may increase the attack vector of a product.
Fix this by:
1. making REGMAP visible if CONFIG_KUNIT_ALL_TESTS is enabled,
2. making REGMAP_KUNIT depend on REGMAP instead of selecting it.
After this, one can safely enable CONFIG_KUNIT_ALL_TESTS=m to build
modules for all appropriate tests for ones system, without pulling in
extra unwanted functionality, while still allowing a tester to manually
enable REGMAP and its test suite on a system where REGMAP is not enabled
by default.
Fixes: 2238959b6a ("regmap: Add some basic kunit tests")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org
Link: https://lore.kernel.org/r/b0a5dbb17c1d5ea482e052e585ae83bb69c48806.1682516005.git.geert@linux-m68k.org
Signed-off-by: Mark Brown <broonie@kernel.org
Remove the map parameter from the struct regmap_irq_chip callback
handle_mask_sync() because it can be passed via the irq_drv_data
parameter instead. The gpio-104-dio-48e driver is the only consumer of
this callback and is thus updated accordingly.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org
Signed-off-by: William Breathitt Gray <william.gray@linaro.org
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com
Link: https://lore.kernel.org/r/1f44fb0fbcd3dccea3371215b00f1b9a956c1a12.1679323449.git.william.gray@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page().
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive rather
than exclusive, and has fixed a bunch of errors which were caused by its
unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZEr3zQAKCRDdBJ7gKXxA
jlLoAP0fpQBipwFxED0Us4SKQfupV6z4caXNJGPeay7Aj11/kQD/aMRC2uPfgr96
eMG3kwn2pqkB9ST2QpkaRbxA//eMbQY=
=J+Dj
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Nick Piggin's "shoot lazy tlbs" series, to improve the peformance of
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page()
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
rather than exclusive, and has fixed a bunch of errors which were
caused by its unintuitive meaning.
- Axel Rasmussen give userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removd vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also coverted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics
flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page recalim
accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
shmem: restrict noswap option to initial user namespace
mm/khugepaged: fix conflicting mods to collapse_file()
sparse: remove unnecessary 0 values from rc
mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
maple_tree: fix allocation in mas_sparse_area()
mm: do not increment pgfault stats when page fault handler retries
zsmalloc: allow only one active pool compaction context
selftests/mm: add new selftests for KSM
mm: add new KSM process and sysfs knobs
mm: add new api to enable ksm per process
mm: shrinkers: fix debugfs file permissions
mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
migrate_pages_batch: fix statistics for longterm pin retry
userfaultfd: use helper function range_in_vma()
lib/show_mem.c: use for_each_populated_zone() simplify code
mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
fs/buffer: convert create_page_buffers to folio_create_buffers
fs/buffer: add folio_create_empty_buffers helper
...
Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening in
the driver core in the quest to be able to move "struct bus" and "struct
class" into read-only memory, a task now complete with these changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules for
all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most of
them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5
LEGadNS38k5fs+73UaxV
=7K4B
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening
in the driver core in the quest to be able to move "struct bus" and
"struct class" into read-only memory, a task now complete with these
changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules
for all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most
of them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
device property: make device_property functions take const device *
driver core: update comments in device_rename()
driver core: Don't require dynamic_debug for initcall_debug probe timing
firmware_loader: rework crypto dependencies
firmware_loader: Strip off \n from customized path
zram: fix up permission for the hot_add sysfs file
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
tty: make tty_class a static const structure
driver core: class: remove struct class_interface * from callbacks
driver core: class: mark the struct class in struct class_interface constant
driver core: class: make class_register() take a const *
driver core: class: mark class_release() as taking a const *
driver core: remove incorrect comment for device_create*
MIPS: vpe-cmp: remove module owner pointer from struct class usage.
...
- First part of DT header detangling dropping cpu.h from of_device.h
and replacing some includes with forward declarations. A handful of
drivers needed some adjustment to their includes as a result.
- Refactor of_device.h to be used by bus drivers rather than various
device drivers. This moves non-bus related functions out of
of_device.h. The end goal is for of_platform.h and of_device.h to stop
including each other.
- Refactor open coded parsing of "ranges" in some bus drivers to use DT
address parsing functions
- Add some new address parsing functions of_property_read_reg(),
of_range_count(), and of_range_to_resource() in preparation to convert
more open coded parsing of DT addresses to use them.
- Treewide clean-ups to use of_property_read_bool() and
of_property_present() as appropriate. The ones here are the ones
that didn't get picked up elsewhere.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmRIOrkACgkQ+vtdtY28
YcN9WA//R+QrmSPExhfgio5y+aOJDWucqnAcyAusPctLcF7h7j0CdzpwaSRkdaH4
KiLjeyt6tKn8wt8w7m/+SmCsSYXPn81GH/Y5I2F40x6QMrY3cVOXUsulKQA+6ZjZ
PmW3bMcz0Dw9IhUK3R/WX96+9UdoytKg5qoTzNzPTKpvKA1yHa/ogl2FnHJS5W+8
Rxz+1oJ70VMIWGpBOc0acHuB2S0RHZ46kPKkPTBgFYEwtmJ8qobvV3r3uQapNaIP
2jnamPu0tAaQoSaJKKSulToziT+sd1sNB+9oyu/kP+t3PXzq4qwp2Gr4jzUYKs4A
ZF3DPhMR3YLLN41g/L3rtB0T/YIS287sZRuaLhCqldNpRerSDk4b0HRAksGk1XrI
HqYXjWPbRxqYiIUWkInfregSTYJfGPxeLfLKrawNO34/eEV4JrkSKy8d0AJn04EK
jTRqI3L7o23ZPxs29uH/3+KK90J3emPZkF7GWVJTEAMsM8jYZduGh7EpsttJLaz/
QnxbTBm9295ahIdCfo/OQhqjWnaNhpbTzf31pyrBZ/itXV7gQ0xjwqPwiyFwI+o/
F/r81xqdwQ3Ni8MKt2c7zLyVA95JHPe95KQ3GrDXR68aByJr4RuhKG8Y2Pj1VOb3
V+Hsu5uhwKrK7Yqe+rHDnJBO00OCO8nwbWhMy2xVxoTkSFCjDmo=
=89Zj
-----END PGP SIGNATURE-----
Merge tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull more devicetree updates from Rob Herring:
- First part of DT header detangling dropping cpu.h from of_device.h
and replacing some includes with forward declarations. A handful of
drivers needed some adjustment to their includes as a result.
- Refactor of_device.h to be used by bus drivers rather than various
device drivers. This moves non-bus related functions out of
of_device.h. The end goal is for of_platform.h and of_device.h to
stop including each other.
- Refactor open coded parsing of "ranges" in some bus drivers to use DT
address parsing functions
- Add some new address parsing functions of_property_read_reg(),
of_range_count(), and of_range_to_resource() in preparation to
convert more open coded parsing of DT addresses to use them.
- Treewide clean-ups to use of_property_read_bool() and
of_property_present() as appropriate. The ones here are the ones that
didn't get picked up elsewhere.
* tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (34 commits)
bus: tegra-gmi: Replace of_platform.h with explicit includes
hte: Use of_property_present() for testing DT property presence
w1: w1-gpio: Use of_property_read_bool() for boolean properties
virt: fsl: Use of_property_present() for testing DT property presence
soc: fsl: Use of_property_present() for testing DT property presence
sbus: display7seg: Use of_property_read_bool() for boolean properties
sparc: Use of_property_read_bool() for boolean properties
sparc: Use of_property_present() for testing DT property presence
bus: mvebu-mbus: Remove open coded "ranges" parsing
of/address: Add of_property_read_reg() helper
of/address: Add of_range_count() helper
of/address: Add support for 3 address cell bus
of/address: Add of_range_to_resource() helper
of: unittest: Add bus address range parsing tests
of: Drop cpu.h include from of_device.h
OPP: Adjust includes to remove of_device.h
irqchip: loongson-eiointc: Add explicit include for cpuhotplug.h
cpuidle: Adjust includes to remove of_device.h
cpufreq: sun50i: Add explicit include for cpu.h
cpufreq: Adjust includes to remove of_device.h
...
- Fix the frequency unit in cpufreq_verify_current_freq checks()
(Sanjay Chandrashekara).
- Make mode_state_machine in amd-pstate static (Tom Rix).
- Make the cpufreq core require drivers with target_index() to set
freq_table (Viresh Kumar).
- Fix typo in the ARM_BRCMSTB_AVS_CPUFREQ Kconfig entry (Jingyu Wang).
- Use of_property_read_bool() for boolean properties in the pmac32
cpufreq driver (Rob Herring).
- Make the cpufreq sysfs interface return proper error codes on
obviously invalid input (qinyu).
- Add guided autonomous mode support to the AMD P-state driver (Wyes
Karny).
- Make the Intel P-state driver enable HWP IO boost on all server
platforms (Srinivas Pandruvada).
- Add opp and bandwidth support to tegra194 cpufreq driver (Sumit
Gupta).
- Use of_property_present() for testing DT property presence (Rob
Herring).
- Remove MODULE_LICENSE in non-modules (Nick Alcock).
- Add SM7225 to cpufreq-dt-platdev blocklist (Luca Weiss).
- Optimizations and fixes for qcom-cpufreq-hw driver (Krzysztof
Kozlowski, Konrad Dybcio, and Bjorn Andersson).
- DT binding updates for qcom-cpufreq-hw driver (Konrad Dybcio and
Bartosz Golaszewski).
- Updates and fixes for mediatek driver (Jia-Wei Chang and
AngeloGioacchino Del Regno).
- Use of_property_present() for testing DT property presence in the
cpuidle code (Rob Herring).
- Drop unnecessary (void *) conversions from the PM core (Li zeming).
- Add sysfs files to represent time spent in a platform sleep state
during suspend-to-idle and make AMD and Intel PMC drivers use them
(Mario Limonciello).
- Use of_property_present() for testing DT property presence (Rob
Herring).
- Add set_required_opps() callback to the 'struct opp_table', to make
the code paths cleaner (Viresh Kumar).
- Update the pm-graph siute of utilities to v5.11 with the following
changes:
* New script which allows users to install the latest pm-graph
from the upstream github repo.
* Update all the dmesg suspend/resume PM print formats to be able to
process recent timelines using dmesg only.
* Add ethtool output to the log for the system's ethernet device if
ethtool exists.
* Make the tool more robustly handle events where mangled dmesg or
ftrace outputs do not include all the requisite data.
- Make the sleepgraph utility recognize "CPU killed" messages (Xueqin
Luo).
- Remove unneeded SRCU selection in Kconfig because it's always set
from devfreq core (Paul E. McKenney).
- Drop of_match_ptr() macro from exynos-bus.c because this driver is
always using the DT table for driver probe (Krzysztof Kozlowski).
- Use the preferred of_property_present() instead of the low-level
of_get_property() on exynos-bus.c (Rob Herring).
- Use devm_platform_get_and_ioream_resource() in exyno-ppmu.c (Yang Li).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmRGvX4SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxcwsQAK5wK1HWLZDap8nTGGAyvpX+bNJ3YM+l
TS1zSzWV97K6kq2bg4GTgDi6EXJJNgfP9sThOEIee5GrWAjrk9yaxjEyIcrUBjfl
oyFN8SEuYbMN5t9Bir3GRqkL+tWErUiVafplML6vTT8W8GlL2rbxPXM6ifmK9IJq
7r3Rt+tlMrookTzV+ykSGVmC5cpnlNGsvMlGGw91Z8rlICy7MI/ecg8O6Zsy25dR
Vchrg0M+jVxtaFU9/ikQaNHx0B3AF7fpi472CYYWgk1ABfIfNyQATeHsCkKan/ZV
i4+gfgIhIQnO1Ov/05aGYbBhxVpFGQIcLkG0vEmdbHsnC/WDuMCrr5wg1HCgCdpQ
+0eQem5bWxrzKp0g9tL07QG8LuiJTfjuA4DrRZNhudKFU9oglZfZeywRk+s6ta4v
rQFzz7qdlKpcM87pz/Bm8tSTc8UYNCDd7hLe+ZI940CMs/vQ4CfQJ2tlYaIl0AiO
q33Nz1iqhEycQ9OZDzBDyQtK+Xm6lsXUehIBtbqBsFsP3Ry+nxe/fz6UMs5tVNeM
BYaaNhhkiZMhXgJncMi2oR8/LRLYtOHjn1rdOGSMu9Rck5i5TVPsxqzUOzkhvuM9
eXAwts6SwFVYxtaPJs+i6yl8cdLOFORsntIBWFKuwsgH8BFx7pNFuZA33eMOA+Iw
UFey2fKDn3W5
=p/5G
-----END PGP SIGNATURE-----
Merge tag 'pm-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These update several cpufreq drivers and the cpufreq core, add sysfs
interface for exposing the time really spent in the platform low-power
state during suspend-to-idle, update devfreq (core and drivers) and
the pm-graph suite of tools and clean up code.
Specifics:
- Fix the frequency unit in cpufreq_verify_current_freq checks()
Sanjay Chandrashekara)
- Make mode_state_machine in amd-pstate static (Tom Rix)
- Make the cpufreq core require drivers with target_index() to set
freq_table (Viresh Kumar)
- Fix typo in the ARM_BRCMSTB_AVS_CPUFREQ Kconfig entry (Jingyu Wang)
- Use of_property_read_bool() for boolean properties in the pmac32
cpufreq driver (Rob Herring)
- Make the cpufreq sysfs interface return proper error codes on
obviously invalid input (qinyu)
- Add guided autonomous mode support to the AMD P-state driver (Wyes
Karny)
- Make the Intel P-state driver enable HWP IO boost on all server
platforms (Srinivas Pandruvada)
- Add opp and bandwidth support to tegra194 cpufreq driver (Sumit
Gupta)
- Use of_property_present() for testing DT property presence (Rob
Herring)
- Remove MODULE_LICENSE in non-modules (Nick Alcock)
- Add SM7225 to cpufreq-dt-platdev blocklist (Luca Weiss)
- Optimizations and fixes for qcom-cpufreq-hw driver (Krzysztof
Kozlowski, Konrad Dybcio, and Bjorn Andersson)
- DT binding updates for qcom-cpufreq-hw driver (Konrad Dybcio and
Bartosz Golaszewski)
- Updates and fixes for mediatek driver (Jia-Wei Chang and
AngeloGioacchino Del Regno)
- Use of_property_present() for testing DT property presence in the
cpuidle code (Rob Herring)
- Drop unnecessary (void *) conversions from the PM core (Li zeming)
- Add sysfs files to represent time spent in a platform sleep state
during suspend-to-idle and make AMD and Intel PMC drivers use them
Mario Limonciello)
- Use of_property_present() for testing DT property presence (Rob
Herring)
- Add set_required_opps() callback to the 'struct opp_table', to make
the code paths cleaner (Viresh Kumar)
- Update the pm-graph siute of utilities to v5.11 with the following
changes:
* New script which allows users to install the latest pm-graph
from the upstream github repo.
* Update all the dmesg suspend/resume PM print formats to be able
to process recent timelines using dmesg only.
* Add ethtool output to the log for the system's ethernet device
if ethtool exists.
* Make the tool more robustly handle events where mangled dmesg
or ftrace outputs do not include all the requisite data.
- Make the sleepgraph utility recognize "CPU killed" messages (Xueqin
Luo)
- Remove unneeded SRCU selection in Kconfig because it's always set
from devfreq core (Paul E. McKenney)
- Drop of_match_ptr() macro from exynos-bus.c because this driver is
always using the DT table for driver probe (Krzysztof Kozlowski)
- Use the preferred of_property_present() instead of the low-level
of_get_property() on exynos-bus.c (Rob Herring)
- Use devm_platform_get_and_ioream_resource() in exyno-ppmu.c (Yang
Li)"
* tag 'pm-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits)
platform/x86/intel/pmc: core: Report duration of time in HW sleep state
platform/x86/intel/pmc: core: Always capture counters on suspend
platform/x86/amd: pmc: Report duration of time in hw sleep state
PM: Add sysfs files to represent time spent in hardware sleep state
cpufreq: use correct unit when verify cur freq
cpufreq: tegra194: add OPP support and set bandwidth
cpufreq: amd-pstate: Make varaiable mode_state_machine static
PM: core: Remove unnecessary (void *) conversions
cpufreq: drivers with target_index() must set freq_table
PM / devfreq: exynos-ppmu: Use devm_platform_get_and_ioremap_resource()
OPP: Move required opps configuration to specialized callback
OPP: Handle all genpd cases together in _set_required_opps()
cpufreq: qcom-cpufreq-hw: Revert adding cpufreq qos
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCM2290
dt-bindings: cpufreq: cpufreq-qcom-hw: Sanitize data per compatible
dt-bindings: cpufreq: cpufreq-qcom-hw: Allow just 1 frequency domain
cpufreq: Add SM7225 to cpufreq-dt-platdev blocklist
cpufreq: qcom-cpufreq-hw: fix double IO unmap and resource release on exit
cpufreq: mediatek: Raise proc and sram max voltage for MT7622/7623
cpufreq: mediatek: raise proc/sram max voltage for MT8516
...
This is a much bigger change for regmap than is normal, the main things
being the addition of some KUnit coverage and a maple tree based
register cache which longer term is likely to replace the rbtree cache
except possibly for very small register maps. While it's complete
overkill for most applications the code for maple trees is there and
there are some larger, sparser devices where the data structure is a
better fit.
The maple tree support is still a work in progress but already useful,
there's some conversions of drivers ready to go after the merge window.
- Support for shifting register addresses up as well as down, there's a
use cases with memory mapped MDIO.
- Refactoring of the type configuration in regmap-irq to allow access
to driver data in the handler, needed by some GPIO devices.
- Some initial KUnit coverage, the bulk of the driver facing API is
covered but there's holes and things like the data marshalling for
bytestream buses are just not covered in the slightest.
- Removal of the compressed cache type, it had zero users and was
getting in the way of KUnit.
- Addition of a maple tree based register cache, there's more work to
do but it's already useful for some devices with a flatter data
structure than rbtree and getting to use all the optimisation work
Liam is doing.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmRGe7QACgkQJNaLcl1U
h9BNSwf9ElqCWZgVGjMoF4b5+1M0rD6jQXbWshwRYjnvVIBaMtTXuCIZmzdmV6pH
oQTiDaTCiZGQVne3Cobncf0Pk1FTtuE3IL39CB5X355cqrkMOOxw57NYufDhhQTk
UJNTORbHPZGLOmLigVeK5nVFyGUTcSt9s9Haqz1S85Ao3rnXMw9IrC5L1QAv73zQ
pNQTLOdZDyg+vdW5Yuc0sVY5PnmVLIF5abI6E1MXumCLyQ8wS2Px1fUYXnQHmtJ1
2pZFYoqjxxIfObzC1SIsZFh0lkDdPe7WBeSKsGORhWiiCY5nYba1CiZ5DFNWR8hf
hJkxbALhdk6Iksvz0X14nAP6ESv66A==
=1hr2
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This is a much bigger change for regmap than is normal, the main
things being the addition of some KUnit coverage and a maple tree
based register cache which longer term is likely to replace the rbtree
cache except possibly for very small register maps.
While it's complete overkill for most applications the code for maple
trees is there and there are some larger, sparser devices where the
data structure is a better fit.
The maple tree support is still a work in progress but already useful,
there's some conversions of drivers ready to go after the merge
window.
Summary:
- Support for shifting register addresses up as well as down, there's
a use cases with memory mapped MDIO.
- Refactoring of the type configuration in regmap-irq to allow access
to driver data in the handler, needed by some GPIO devices.
- Some initial KUnit coverage, the bulk of the driver facing API is
covered but there's holes and things like the data marshalling for
bytestream buses are just not covered in the slightest.
- Removal of the compressed cache type, it had zero users and was
getting in the way of KUnit.
- Addition of a maple tree based register cache, there's more work to
do but it's already useful for some devices with a flatter data
structure than rbtree and getting to use all the optimisation work
Liam is doing"
* tag 'regmap-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: allow upshifting register addresses before performing operations
regmap: Pass irq_drv_data as a parameter for set_type_config()
regmap: Use mas_walk() instead of mas_find()
regmap: Fix double unlock in the maple cache
regmap: Add maple tree based register cache
regmap: Factor out single value register syncing
regmap: Add some basic kunit tests
regmap: Add RAM backed register map
regmap: Removed compressed cache support
regmap: Support paging for buses with reg_read()/reg_write()
regmap: Clarify error for unknown cache types
regmap: Handle sparse caches in the default sync
regmap: add a helper to translate the register address
regmap: cache: Silence checkpatch warning
regmap: cache: Return error in cache sync operations for REGCACHE_NONE
regmap-irq: Place kernel doc of struct regmap_irq_chip in order
regmap-irq: Add no_status support
regmap: sdw: Remove 8-bit value size restriction
regmap: sdw: Update misleading comment
o MAINTAINERS files additions and changes.
o Fix hotplug warning in nohz code.
o Tick dependency changes by Zqiang.
o Lazy-RCU shrinker fixes by Zqiang.
o rcu-tasks stall reporting improvements by Neeraj.
o Initial changes for renaming of k[v]free_rcu() to its new k[v]free_rcu_mightsleep()
name for robustness.
o Documentation Updates:
o Significant changes to srcu_struct size.
o Deadlock detection for srcu_read_lock() vs synchronize_srcu() from Boqun.
o rcutorture and rcu-related tool, which are targeted for v6.4 from Boqun's tree.
o Other misc changes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmQuBnIACgkQqA4nf2o4
5hACVRAAoXu7/gfh5Pjw9O4E4pCdPJKsZZVYrcrVGrq6NAxRn6M1SgurAdC5grj2
96x0waoGaiO82V0H5iJMcKdAVu67x9R8WaQ1JoxN75Efn8h9W4TguB87TV1gk0xS
eZ18b/CyEaM5mNb80DFFF4FLohy5737p/kNTMqXQdUyR1BsDl16iRMgjiBiFhNUx
yPo8Y2kC2U2OTbldZgaE7s9bQO3xxEcifx93sGWsAex/gx54FYNisiwSlCOSgOE+
XkYo/OKk8Xvr82tLVX8XQVEPCMJ+rxea8T5zSs8/alvsPq7gA8wW3y6fsoa3vUU/
+Gd+W+Q/OsONIDtp8rQAY1qsD0ScDpaR8052RSH0zTa7pj8HsQgE5PjZ+cJW0SEi
cKN+Oe8+ETqKald+xZ6PDf58O212VLrru3RpQWrOQcJ7fmKmfT4REK0RcbLgg4qT
CBgOo6eg+ub4pxq2y11LZJBNTv1/S7xAEzFE0kArew64KB2gyVud0VJRZVAJnEfe
93QQVDFrwK2bhgWQZ6J6IbTvGeQW0L93IibuaU6jhZPR283VtUIIvM7vrOylN7Fq
4jsae0T7YGYfKUhgTpm7rCnm8A/D3Ni8MY0sKYYgDSyKmZUsnpI5wpx1xke4lwwV
ErrY46RCFa+k8wscc6iWfB4cGXyyFHyu+wtyg0KpFn5JAzcfz4A=
=Rgbj
-----END PGP SIGNATURE-----
Merge tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux
Pull RCU updates from Joel Fernandes:
- Updates and additions to MAINTAINERS files, with Boqun being added to
the RCU entry and Zqiang being added as an RCU reviewer.
I have also transitioned from reviewer to maintainer; however, Paul
will be taking over sending RCU pull-requests for the next merge
window.
- Resolution of hotplug warning in nohz code, achieved by fixing
cpu_is_hotpluggable() through interaction with the nohz subsystem.
Tick dependency modifications by Zqiang, focusing on fixing usage of
the TICK_DEP_BIT_RCU_EXP bitmask.
- Avoid needless calls to the rcu-lazy shrinker for CONFIG_RCU_LAZY=n
kernels, fixed by Zqiang.
- Improvements to rcu-tasks stall reporting by Neeraj.
- Initial renaming of k[v]free_rcu() to k[v]free_rcu_mightsleep() for
increased robustness, affecting several components like mac802154,
drbd, vmw_vmci, tracing, and more.
A report by Eric Dumazet showed that the API could be unknowingly
used in an atomic context, so we'd rather make sure they know what
they're asking for by being explicit:
https://lore.kernel.org/all/20221202052847.2623997-1-edumazet@google.com/
- Documentation updates, including corrections to spelling,
clarifications in comments, and improvements to the srcu_size_state
comments.
- Better srcu_struct cache locality for readers, by adjusting the size
of srcu_struct in support of SRCU usage by Christoph Hellwig.
- Teach lockdep to detect deadlocks between srcu_read_lock() vs
synchronize_srcu() contributed by Boqun.
Previously lockdep could not detect such deadlocks, now it can.
- Integration of rcutorture and rcu-related tools, targeted for v6.4
from Boqun's tree, featuring new SRCU deadlock scenarios, test_nmis
module parameter, and more
- Miscellaneous changes, various code cleanups and comment improvements
* tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux: (71 commits)
checkpatch: Error out if deprecated RCU API used
mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()
rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()
rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
rcu/trace: use strscpy() to instead of strncpy()
tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
...
device_property functions do not modify the device pointer passed to them.
The underlying of_device and fwnode_ functions actually already take
const * arguments. Mark the parameter constant to simplify conversion
from of_property to device_property functions, and to let the calling code
use const device pointers where possible.
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20230419164127.3773278-1-linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Document that some subsystems are still going to use device_rename for
the time being, so it is not a good idea to assume it's not used. Also
remove mentions of a plan to stop renaming net devices.
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Link: https://lore.kernel.org/r/20230406045435.19452-1-wedsonaf@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Don't require the use of dynamic debug (or modification of the kernel to
add a #define DEBUG to the top of this file) to get the printk message
about driver probe timing. This printk is only emitted when
initcall_debug is enabled on the kernel commandline, and it isn't
immediately obvious that you have to do something else to debug boot
timing issues related to driver probe. Add a comment too so it doesn't
get converted back to pr_debug().
Fixes: eb7fbc9fb1 ("driver core: Add missing '\n' in log messages")
Cc: stable <stable@kernel.org>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Brian Norris <briannorris@chromium.org>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The crypto dependencies for the firmwware loader are incomplete,
in particular a built-in FW_LOADER fails to link against a modular
crypto hash driver:
ld.lld: error: undefined symbol: crypto_alloc_shash
ld.lld: error: undefined symbol: crypto_shash_digest
ld.lld: error: undefined symbol: crypto_destroy_tfm
>>> referenced by main.c
>>> drivers/base/firmware_loader/main.o:(fw_log_firmware_info) in archive vmlinux.a
Rework this to use the usual 'select' from the driver module,
to respect the built-in vs module dependencies, and add a
more verbose crypto dependency to the debug option to prevent
configurations that lead to a link failure.
Fixes: 02fe26f253 ("firmware_loader: Add debug message with checksum for FW file")
Reviewed-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230414080329.76176-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Having helped an user recently figure out why the customized path being
specified was not taken into account landed on a subtle difference
between using:
echo "/xyz/firmware" > /sys/module/firmware_class/parameters/path
which inserts an additional newline which is passed as is down to
fw_get_filesystem_firmware() and ultimately kernel_read_file_from_path()
and fails.
Strip off \n from the customized firmware path such that users do not
run into these hard to debug situations.
Link: https://lore.kernel.org/all/20230402135423.3235-1-f.fainelli@gmail.com/
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230413191757.1949088-1-f.fainelli@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The cache information can be extracted from either a Device Tree(DT),
the PPTT ACPI table, or arch registers (clidr_el1 for arm64).
When the DT is used but no cache properties are advertised, the current
code doesn't correctly fallback to using arch information. The changes
fixes the same and also assuse the that L1 data/instruction caches
are private and L2/higher caches are shared when the cache information
is missing in DT/ACPI and is derived form clidr_el1/arch registers.
Currently the cacheinfo is built from the primary CPU prior to secondary
CPUs boot, if the DT/ACPI description contains cache information.
However, if not present, it still reverts to the old behavior, which
allocates the cacheinfo memory on each secondary CPUs which causes
RT kernels to triggers a "BUG: sleeping function called from invalid
context".
The changes here attempts to enable automatic detection for RT kernels
when no DT/ACPI cache information is available, by pre-allocating
cacheinfo memory on the primary CPU.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmQ9bmEACgkQAEG6vDF+
4pggDhAAo75vFky/U63PV9x4IMvVTkNGIK/DSihvG+s5GsujiSS78f4tSSpLAY1X
yWTEYsU+1q03cE8rv0Xmkw+cxLerjOGisvPZCdnKVljUF5TWez6OlwKw5V1QqWk9
faOmWBSb8fH4X0ys73e0SBsF3bzyJB/cORmbOL8OnCE7rGkyMQ0plhYVOBQ3CoV8
KMrw7rnPlc5Aoq/8LgTY+Gojqf82njacCIQPrn1TjS/V0SdobC8xm7ZoepZgG9Q1
MHzBtTmKzGrVxWawxcfTaE89t2rudAafa3noYr4xy+dI8ptSkTQNKCG9pvTxIcFo
KXKfBO6+hTWNREf7a4ginKdKcIKDD7Oo2oTnPBYr9O9/2r727c5Wa5FEGsXa4c14
0dN3vFbhwWIDc5sA7eBOCDYycJm6XyTGJ6iPR+fmaWKwgXhIM2WxakeO8aptuwhn
sV+XO7IJxSF79GrrAkSN6AtVj6NHvNpzwRx9mBwHkT3ajDGb04P23AOuqoNDt7kh
wcz217ng3hG4CDEbOk9JCQqtFhJeSuIWP5T3wofUfYeXy2opQhwzmiOL+NVROuGY
h/cwQCcXCEYUohJhln2PvbMS7w0bBNserpetzEFJfc5aLLdocaH3aQ/AFnCXmCdN
YPXP6bp0WdbmfcMskVY8vlb/TBlcOzu6+W0qB1dXTWuib7cwm3o=
=4hhJ
-----END PGP SIGNATURE-----
Merge tag 'cacheinfo-updates-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next
Sudeep writes:
cacheinfo and arch_topology updates for v6.4
The cache information can be extracted from either a Device Tree(DT),
the PPTT ACPI table, or arch registers (clidr_el1 for arm64).
When the DT is used but no cache properties are advertised, the current
code doesn't correctly fallback to using arch information. The changes
fixes the same and also assuse the that L1 data/instruction caches
are private and L2/higher caches are shared when the cache information
is missing in DT/ACPI and is derived form clidr_el1/arch registers.
Currently the cacheinfo is built from the primary CPU prior to secondary
CPUs boot, if the DT/ACPI description contains cache information.
However, if not present, it still reverts to the old behavior, which
allocates the cacheinfo memory on each secondary CPUs which causes
RT kernels to triggers a "BUG: sleeping function called from invalid
context".
The changes here attempts to enable automatic detection for RT kernels
when no DT/ACPI cache information is available, by pre-allocating
cacheinfo memory on the primary CPU.
* tag 'cacheinfo-updates-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
The cache information can be extracted from either a Device
Tree (DT), the PPTT ACPI table, or arch registers (clidr_el1
for arm64).
The clidr_el1 register is used only if DT/ACPI information is not
available. It does not states how caches are shared among CPUs.
Add a use_arch_cache_info field/function to identify when the
DT/ACPI doesn't provide cache information. Use this information
to assume L1 caches are privates and L2 and higher are shared among
all CPUs.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230414081453.244787-5-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
fetch_cache_info() tries to get the number of cache leaves/levels
for each CPU in order to pre-allocate memory for cacheinfo struct.
Allocating this memory later triggers a:
'BUG: sleeping function called from invalid context'
in PREEMPT_RT kernels.
If there is no cache related information available in DT or ACPI,
fetch_cache_info() fails and an error message is printed:
'Early cacheinfo failed, ret = ...'
Not having cache information should be a valid configuration.
Remove the error message if fetch_cache_info() fails with -ENOENT.
Suggested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-4-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
If a Device Tree (DT) is used, the presence of cache properties is
assumed. Not finding any is not considered. For arm64 platforms,
cache information can be fetched from the clidr_el1 register.
Checking whether cache information is available in the DT
allows to switch to using clidr_el1.
init_of_cache_level()
\-of_count_cache_leaves()
will assume there a 2 cache leaves (L1 data/instruction caches), which
can be different from clidr_el1 information.
cache_setup_of_node() tries to read cache properties in the DT.
If there are none, this is considered a success. Knowing no
information was available would allow to switch to using clidr_el1.
Fixes: de0df442ee ("cacheinfo: Check 'cache-unified' property to count cache leaves")
Reported-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/all/20230404-hatred-swimmer-6fecdf33b57a@spud/
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-3-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
If there is no ACPI/DT information, it is assumed that L1 caches
are private and L2 (and higher) caches are shared. A cache is
'shared' between two CPUs if it is accessible from these two
CPUs.
Each CPU owns a representation (i.e. has a dedicated cacheinfo struct)
of the caches it has access to. cache_leaves_are_shared() tries to
identify whether two representations are designating the same actual
cache.
In cache_leaves_are_shared(), if 'this_leaf' is a L2 cache (or higher)
and 'sib_leaf' is a L1 cache, the caches are detected as shared as
only this_leaf's cache level is checked.
This is leads to setting sib_leaf as being shared with another CPU,
which is incorrect as this is a L1 cache.
Check 'sib_leaf->level'. Also update the comment as the function is
called when populating 'shared_cpu_map'.
Fixes: f16d1becf9 ("cacheinfo: Use cache identifiers to check if the caches are shared if available")
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230414081453.244787-2-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Now that of_cpu_device_node_get() is defined in of.h, of_device.h is just
implicitly including other includes, and is no longer needed. Update the
includes to use of.h instead of of_device.h.
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230329-dt-cpu-header-cleanups-v1-10-581e2605fe47@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Recent work enables cacheinfo memory for secondary CPUs to be allocated
early, while still running on the primary CPU. That allows cacheinfo
memory to be allocated safely on RT kernels. To make that work, the
number of cache levels/leaves must be defined in the device tree or ACPI
tables. Further work adds a path for early detection of the number of
cache levels/leaves, which makes it possible to allocate the cacheinfo
memory early without requiring extra DT/ACPI information.
This patch addresses a specific issue with ACPI systems with no PPTT. In
that case, parse_acpi_topology() returns an error code, which in turn
makes init_cpu_topology() return early, before fetch_cache_info() is
called. In that case, the early cache level detection doesn't run.
The solution is to simply remove the "return" statement and let the code
flow fall through to calling fetch_cache_info().
Signed-off-by: Radu Rendec <rrendec@redhat.com>
Reported-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/all/dea94484-797f-3034-7b86-6d88801c0d91@arm.com/
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230412185759.755408-4-rrendec@redhat.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
This patch gives architecture specific code the ability to initialize
the cache level and allocate cacheinfo memory early, when cache level
initialization runs on the primary CPU for all possible CPUs.
This is part of a patch series that attempts to further the work in
commit 5944ce092b ("arch_topology: Build cacheinfo from primary CPU").
Previously, in the absence of any DT/ACPI cache info, architecture
specific cache detection and info allocation for secondary CPUs would
happen in non-preemptible context during early CPU initialization and
trigger a "BUG: sleeping function called from invalid context" splat on
an RT kernel.
More specifically, this patch adds the early_cache_level() function,
which is called by fetch_cache_info() as a fallback when the number of
cache leaves cannot be extracted from DT/ACPI. In the default generic
(weak) implementation, this new function returns -ENOENT, which
preserves the original behavior for architectures that do not implement
the function.
Since early detection can get the number of cache leaves wrong in some
cases*, additional logic is added to still call init_cache_level() later
on the secondary CPU, therefore giving the architecture specific code an
opportunity to go back and fix the initial guess. Again, the original
behavior is preserved for architectures that do not implement the new
function.
* For example, on arm64, CLIDR_EL1 detection works only when it runs on
the current CPU. In other words, a CPU cannot detect the cache depth
for any other CPU than itself.
Signed-off-by: Radu Rendec <rrendec@redhat.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20230412185759.755408-2-rrendec@redhat.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Similar to the existing reg_downshift mechanism, that is used to
translate register addresses on busses that have a smaller address
stride, it's also possible to want to upshift register addresses.
Such a case was encountered when network PHYs and PCS that usually sit
on a MDIO bus (16-bits register with a stride of 1) are integrated
directly as memory-mapped devices. Here, the same register layout
defined in 802.3 is used, but the register now have a larger stride.
Introduce a mechanism to also allow upshifting register addresses.
Re-purpose reg_downshift into a more generic, signed reg_shift, whose
sign indicates the direction of the shift. To avoid confusion, also
introduce macros to explicitly indicate if we want to downshift or
upshift.
For bisectability, change any use of reg_downshift to use reg_shift.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20230407152604.105467-1-maxime.chevallier@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Assignments from pointer variables of type (void *) do not require
explicit type casts, so remove such type cases from the code in
drivers/base/power/main.c where applicable.
Signed-off-by: Li zeming <zeming@nfschina.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Merge series from William Breathitt Gray <william.gray@linaro.org>:
The regmap API supports IO port accessors so we can take advantage of
regmap abstractions rather than handling access to the device registers
directly in the driver.
A patch to pass irq_drv_data as a parameter for struct regmap_irq_chip
set_type_config() is included. This is needed by the
idio_24_set_type_config() and ws16c48_set_type_config() callbacks in
order to update the type configuration on their respective devices.
For CONFIG_NO_HZ_FULL systems, the tick_do_timer_cpu cannot be offlined.
However, cpu_is_hotpluggable() still returns true for those CPUs. This causes
torture tests that do offlining to end up trying to offline this CPU causing
test failures. Such failure happens on all architectures.
Fix the repeated error messages thrown by this (even if the hotplug errors are
harmless) by asking the opinion of the nohz subsystem on whether the CPU can be
hotplugged.
[ Apply Frederic Weisbecker feedback on refactoring tick_nohz_cpu_down(). ]
For drivers/base/ portion:
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: rcu <rcu@vger.kernel.org>
Cc: stable@vger.kernel.org
Fixes: 2987557f52 ("driver-core/cpu: Expose hotpluggability to the rest of the kernel")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Liam recommends using mas_walk() instead of mas_find() for our use case so
let's do that, it avoids some minor overhead associated with being able to
restart the operation which we don't need since we do a simple search.
Suggested-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230403-regmap-maple-walk-fine-v2-1-c07371c8a867@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Doing the dance to drop the maple tree's internal spinlock means we need
multiple exit paths in our error handling.
Reported-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230403-regmap-maple-unlock-v1-1-89998991b16c@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The add_dev and remove_dev callbacks in struct class_interface currently
pass in a pointer back to the class_interface structure that is calling
them, but none of the callback implementations actually use this pointer
as it is pointless (the structure is known, the driver passed it in in
the first place if it is really needed again.)
So clean this up and just remove the pointer from the callbacks and fix
up all callback functions.
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Allen Hubbe <allenbh@gmail.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: John Stultz <jstultz@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Weiyang <wangweiyang2@huawei.com>
Cc: Yang Yingliang <yangyingliang@huawei.com>
Cc: Jakob Koschel <jakobkoschel@gmail.com>
Cc: Cai Xinchen <caixinchen1@huawei.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Link: https://lore.kernel.org/r/2023040250-pushover-platter-509c@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The struct class pointer in struct class_interface is never modified, so
mark it as const so that no one accidentally tries to modify it in the
future.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040249-handball-gruffly-5da7@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the class code is cleaned up to not modify the class pointer
registered with it, change class_register() to take a const * to allow
the structure to be placed into read-only memory.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040248-customary-release-4aec@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The struct class callback, class_release(), is only called in 2 places,
the pcmcia cardservices code, and in the class driver core code. Both
places it is safe to mark the structure as a const *, to allow us to
in the future mark all struct class usages as constant and move into
read-only memory.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040248-outrage-obsolete-5a9a@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The device_create() and device_create_with_groups() function comments
incorrectly state that they only work with a struct class that was
created using class_create(), but that is not true now and I am not sure
if it ever was. So just remove the comment as it's not needed now.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/2023040218-scouts-unplowed-24d2@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The current state of the art for sparse register maps is the
rbtree cache. This works well for most applications but isn't
always ideal for sparser register maps since the rbtree can get
deep, requiring a lot of walking. Fortunately the kernel has a
data structure intended to address this very problem, the maple
tree. Provide an initial implementation of a register cache
based on the maple tree to start taking advantage of it.
The entries stored in the maple tree are arrays of register
values, with the maple tree keys holding the register addresses.
We store data in host native format rather than device native
format as we do for rbtree, this will be a benefit for devices
where we don't marshal data within regmap and simplifies the code
but will result in additional CPU overhead when syncing the cache
on devices where we do marshal data in regmap.
This should work well for a lot of devices, though there's some
additional areas that could be looked at such as caching the
last accessed entry like we do for rbtree and trying to minimise
the maple tree level locking. We should also use bulk writes
rather than single register writes when resyncing the cache where
possible, even if we don't store in device native format.
Very small register maps may continue to to better with rbtree
longer term.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230325-regcache-maple-v3-2-23e271f93dc7@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
In order to support sparse caches that don't store data in raw format
factor out the parts of the raw block sync implementation that deal with
writing a single register via _regmap_write().
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230325-regcache-maple-v3-1-23e271f93dc7@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
We need the fixes in here for testing, as well as the driver core
changes for documentation updates to build on.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Syzbot found that we had forgotten to unregister the lock_class_key when
using it in commit dcfbb67e48 ("driver core: class: use lock_class_key
already present in struct subsys_private") so fix that up and correctly
release it when done.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reported-and-tested-by: <syzbot+41d665317c811d4d88aa@syzkaller.appspotmail.com>
Fixes: dcfbb67e48 ("driver core: class: use lock_class_key already present in struct subsys_private")
Link: https://lore.kernel.org/r/2023040126-blandness-duckling-bd55@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nothing outside of drivers/base/core.c uses sysfs_dev_char_kobj, so
make it static and document what it is used for so we remember it the
next time we touch it 15 years from now.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-7-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nothing outside of drivers/base/core.c uses sysfs_dev_block_kobj, so
make it static and document what it is used for so we remember it the
next time we touch it 15 years from now.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The dev_kobj field in struct class is now only written to, but never
read from, so it can be removed as it is useless.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a dev_t is set in a struct device, an symlink in /sys/dev/ is
created for it either under /sys/dev/block/ or /sys/dev/char/ depending
on the device type.
The logic to determine this would trigger off of the class of the
object, and the kobj_type set in that location. But it turns out that
this deep nesting isn't needed at all, as it's either a choice of block
or "everything else" which is a char device. So make the logic a lot
more simple and obvious, and remove the incorrect comments in the code
that tried to document something that was not happening at all (it is
impossible to set class->dev_kobj to NULL as the class core prevented
that from happening.
This removes the only place that class->dev_kobj was being used, so
after this, it can be removed entirely.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the last users of the subsystem private pointer in struct class
are gone, the pointer can be removed, as no one is using it. One step
closer to allowing struct class to be const and moved into read-only
memory.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-3-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some classes (i.e. gpio), want to know if they have been registered or
not, and poke around in the class's internal structures to try to figure
this out. Because this is not really a good idea, provide a function
for classes to call to try to figure this out.
Note, this is racy as the state of the class could change at any moment
in time after the call is made, but as usually a class only wants to
know if it has been registered yet or not, it should be fairly safe to
use, and is just as safe as the previous "poke at the class internals"
check was.
Move the gpiolib code to use this function as proof that it works
properly.
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: linux-gpio@vger.kernel.org
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are a number of places in core.c that need access to the private
subsystem structure of struct class, so move them to use
class_to_subsys() instead of accessing it directly.
This requires exporting class_to_subsys() out of class.c, but keeping it
local to the driver core.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230331093318.82288-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On the theory that it's better to make a start let's add some KUnit tests
for regmap. Currently this is a bit of a mess but it passes and hopefully
will at some point help catch problems. We provide very basic cover for
most of the core functionality that operates at the register level,
repeating each test for each cache type in order to exercise the caches.
There is no coverage of anything to do with the bulk operations at the bus
level or formatting for byte stream buses yet.
Each test creates it's own regmap since the cache structures are built
incrementally, meaning we gain coverage from the different access
patterns, and some of the tests cover different init scenarios.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regmap-kunit-v2-2-b208801dc2c8@kernel.org
Add a register map that is a simple array of memory, for use in
KUnit testing of the framework. This is not exposed in regmap.h
since I can't think of a non-test use case, it is purely for use
internally. To facilitate testing we track if registers have been
read or written to.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regmap-kunit-v2-1-b208801dc2c8@kernel.org
The compressed register cache support has assumptions that make it hard to
cover in testing, mainly that it requires raw registers defaults be
provided. Rather than either address these assumptions or leave it untested
by the forthcoming KUnit tests let's remove it, the use case is quite thin
and there are no current users.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regcache-lzo-v1-1-08c5d63e2a5e@kernel.org
Several SoC drivers use the same of-based mechanism to populate the machine
name. Therefore move this to the core and try to populate the machine name
in soc_device_register if it's not set yet.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://lore.kernel.org/r/6dbdf458-9f46-613e-de58-b4a56a6cdd9f@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After entering 6.3-rc1 the LLC cacheinfo is not exported on our ACPI
based arm64 server. This is because the LLC cacheinfo is partly reset
when secondary CPUs boot up. On arm64 the primary cpu will allocate
and setup cacheinfo:
init_cpu_topology()
for_each_possible_cpu()
fetch_cache_info() // Allocate cacheinfo and init levels
detect_cache_attributes()
cache_shared_cpu_map_setup()
if (!last_level_cache_is_valid()) // not valid, setup LLC
cache_setup_properties() // setup LLC
On secondary CPU boot up:
detect_cache_attributes()
populate_cache_leaves()
get_cache_type() // Get cache type from clidr_el1,
// for LLC type=CACHE_TYPE_NOCACHE
cache_shared_cpu_map_setup()
if (!last_level_cache_is_valid()) // Valid and won't go to this branch,
// leave LLC's type=CACHE_TYPE_NOCACHE
The last_level_cache_is_valid() use cacheinfo->{attributes, fw_token} to
test it's valid or not, but populate_cache_leaves() will only reset
LLC's type, so we won't try to re-setup LLC's type and leave it
CACHE_TYPE_NOCACHE and won't export it through sysfs.
This patch tries to fix this by not re-populating the cache leaves if
the LLC is valid.
Fixes: 5944ce092b ("arch_topology: Build cacheinfo from primary CPU")
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20230328114915.33340-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that class_to_subsys() can be used to get access to the internal
class private pointer, convert the remaining few places in class.c that
were accessing the pointer directly to use class_to_subsys() instead.
By doing this, the need for class_get() and class_put() goes away as no
one actually tries to increment the class structures anymore, only the
internal dynamic one.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230325194234.46588-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Much like what was done in commit 273afac615 ("driver core: bus:
implement bus_get/put() without the private pointer"), it is time to
move the driver core away from using the internal private pointer in
struct class in order to enable it to be always a constant and be placed
in read-only memory in the future.
First step in doing this is to create a helper function that turns a
'struct class' into 'struct subsys_private' called class_to_subsys().
class_to_subsys() walks the list of registered busses in the system and
finds the matching one based on the pointer to the class itself. As
this is a short list, and this function is not on any fast path, it
should not be noticable.
Implement class_get() and class_put() using this new helper function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230325194234.46588-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct class should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
class to be moved to read-only memory.
While we are touching all class sysfs callbacks also mark the attribute
as constant as it can not be modified. The bonding code still uses this
structure so it can not be removed from the function callbacks.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steve French <sfrench@samba.org>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: linux-cifs@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: netdev@vger.kernel.org
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230325084537.3622280-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add a build time equivalent of fw_devlink.sync_state=timeout so that
board specific kernels could enable it and not have to deal with setting
or cluttering the kernel commandline.
Cc: Doug Anderson <dianders@chromium.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20230317205134.964098-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The class_unregister() and class_destroy() function should be taking a
const * to struct class, not just a *, so fix that up.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230325084526.3622123-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The structure sysfs_dev_char_kobj is local only to the driver core code,
so move it out of the global class.h file and into the internal base.h
file as no one else should be touching this symbol.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230327160319.513974-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit dcfbb67e48 ("driver core: class: use lock_class_key already
present in struct subsys_private") we removed the key parameter to the
function class_create() but forgot to remove it from the kerneldoc,
which causes a build warning. Fix that up by removing the key parameter
from the documentation as it is now gone.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: dcfbb67e48 ("driver core: class: use lock_class_key already present in struct subsys_private")
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230327081828.1087364-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The error message printed when we fail to locate the cache type the map
requested says it can't find a compress type rather than a cache type,
fix that. Since the compressed type is the only one currently compiled
conditionally it's likely to be the missing type but that might not always
be true and is still unclear.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20230324-regcache-unknown-v1-1-80deecbf196b@kernel.org
In commit 37e98d9bed ("driver core: bus: move lock_class_key into
dynamic structure"), the lock_key variable moved out of struct bus_type
and into struct subsys_private, yet the documentation for it did not
move. Fix that up and place the documentation comment in the correct
location.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Fixes: 37e98d9bed ("driver core: bus: move lock_class_key into dynamic structure")
Link: https://lore.kernel.org/r/20230324090814.386654-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge series from Maxime Chevallier <maxime.chevallier@bootlin.com>:
This introduces a helper that factors out register rewriting, it
will be the basis for further work that will need cross tree
merges so is on a branch.
Register addresses passed to regmap operations can be offset with
regmap.reg_base and downshifted with regmap.reg_downshift.
Add a helper to apply both these operations and return the translated
address, that we can then use to perform the actual register operation
ont the underlying bus.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://lore.kernel.org/r/20230324093644.464704-2-maxime.chevallier@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/base/physical_location.h as
they are not needed.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230324122711.2664537-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/base/base.h as they are not
needed.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230324122711.2664537-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit 37e98d9bed ("driver core: bus: move lock_class_key into
dynamic structure"), we moved the lock_class_key into the internal
structure shared by busses and classes, but only used it for buses.
Move the class code to use this structure as it is already present and
being allocated, instead of the statically allocated on-the-stack
variable that class_create() was using as part of a macro wrapper around
the core function call.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230324100132.1633647-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fwnode parameter is not altered in the following APIs:
- fwnode_get_next_parent_dev()
- fwnode_is_ancestor_of()
- fwnode_graph_get_endpoint_count()
so constify them.
Reported-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20230324112720.71315-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fwnode_get_phy_mode() does not modify the fwnode argument, merely
using it to obtain the phy-mode property value. Therefore, it can
be made const.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/E1pfdh9-00EQ8t-HB@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's funny to think about getting a reference count of a constant
structure pointer, but this locks into place the private data
"underneath" the struct bus_type() which is important to not go away
while we are working with the bus structure for some callbacks.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-27-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bus_rescan_devices() function was missed in the previous change of
the bus_for_each* constant pointer changes, so fix it up now to take a
const * to struct bus_type.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-25-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bus_register() is now safe to take a constant * to bus_type, so make
that change and mark the subsys_private bus_type * constant as well.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-24-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct bus_type should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
bus_type to be moved to read-only memory.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hu Haowen <src.res@email.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd
Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl
Reviewed-by: Alex Shi <alexs@kernel.org>
Acked-by: Iwona Winiarska <iwona.winiarska@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi
Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that all accesses of dev_root is through the bus_get_dev_root()
call, move the pointer out of struct bus_type and into the private
dynamic structure, subsys_private.
With this change, there is no modifiable portions of struct bus_type so
it can be marked as a constant structure and moved to read-only memory.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313182918.1312597-22-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The functions device_create() and device_create_with_groups() do not
modify the struct class passed into it, so enforce this by changing the
function parameters to be struct const class.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-12-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The pointer to a struct class in a struct device should never be used to
change anything in that class. So mark it as constant to enforce this
requirement.
This requires a few minor changes to some internal driver core functions
to enforce the const * being used here now.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-11-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The class_create_file*() and class_remove_file*() functions do not
modify the struct class at all, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-8-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The class_find_device*() functions do not modify the struct class or the
struct device passed into it, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-7-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
class_for_each_device() does not modify the struct class or the struct
device passed into it, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
class_dev_iter_init() does not modify the struct class or the struct
device passed into it, so mark them as const * to enforce that.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The module pointer in class_create() never actually did anything, and it
shouldn't have been requred to be set as a parameter even if it did
something. So just remove it and fix up all callers of the function in
the kernel tree at the same time.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to manually set the owner of a struct class, as the
registering function does it automatically, so remove all of the
explicit settings from various drivers that did so as it is unneeded.
This allows us to remove this pointer entirely from this structure going
forward.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's no need to manually have to set the module owner of a class, the
compiler should do it automatically for you, so add a module * to the
__class_register() function and allow it to set the module owner
automatically.
This will let us move the module pointer out of struct class eventually,
as it should not be embedded in there if we wish for it to be a
read-only structure eventually.
And, funny story, this module pointer isn't even being used for
anything, so while we will keep it around for now, it's not like it even
matters.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
checkpatch.pl warned:
WARNING: ENOSYS means 'invalid syscall nr' and nothing else
Align the return value to regcache_drop_region().
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/20230313071812.13577-2-alexander.stein@ew.tq-group.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no sense in doing a cache sync on REGCACHE_NONE regmaps.
Instead of panicking the kernel due to missing cache_ops, return an error
to client driver.
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Link: https://lore.kernel.org/r/20230313071812.13577-1-alexander.stein@ew.tq-group.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Some of the functions do not provide Return: section on absence of which
kernel-doc complains. Besides that several functions return the fwnode
handle with incremented reference count. Add a respective note to make sure
that the caller decrements it when it's not needed anymore.
While at it, unify the style of the Return: sections.
Reported-by: Daniel Kaehn <kaehndan@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230217133344.79278-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the file is written to and sync_state() hasn't been called for the
device yet, then call sync_state() for the device independent of the
state of its consumers.
This is useful for supplier devices that have one or more consumers that
don't have a driver but the consumers are in a state that don't use the
resources supplied by the supplier device.
This gives finer grained control than using the
fw_devlink.sync_state=timeout kernel commandline parameter.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When all devices that could probe have finished probing (based on
deferred_probe_timeout configuration or late_initcall() when
!CONFIG_MODULES), this parameter controls what to do with devices that
haven't yet received their sync_state() calls.
fw_devlink.sync_state=strict is the default and the driver core will
continue waiting on all consumers of a device to probe successfully
before sync_state() is called for the device. This is the default
behavior since calling sync_state() on a device when all its consumers
haven't probed could make some systems unusable/unstable. When this
option is selected, we also print the list of devices that haven't had
sync_state() called on them by the time all devices the could probe have
finished probing.
fw_devlink.sync_state=timeout will cause the driver core to give up
waiting on consumers and call sync_state() on any devices that haven't
yet received their sync_state() calls. This option is provided for
systems that won't become unusable/unstable as they might be able to
save power (depends on state of hardware before kernel starts) if all
devices get their sync_state().
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230304005355.746421-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In removing the CONFIG_SYSFS_DEPRECATED and CONFIG_SYSFS_DEPRECATED_V2
config options, I messed up in the __class_register() function and got
the logic incorrect. Fix this all up by just removing the special case
of a block device class logic in this function, as that is what is
intended.
In testing, this solves the boot problem on my systems, hopefully on
others as well.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 721da5cee9 ("driver core: remove CONFIG_SYSFS_DEPRECATED and CONFIG_SYSFS_DEPRECATED_V2")
Link: https://lore.kernel.org/r/20230307075102.3537-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge series from William Breathitt Gray <william.gray@linaro.org>:
There are devices which have interrupt support with mask and ack
registers but no status register. Add a flag which lets us support
them, we just assume that all the interrupts fired.
CONFIG_SYSFS_DEPRECATED was added in commit 88a22c985e
("CONFIG_SYSFS_DEPRECATED") in 2006 to allow systems with older versions
of some tools (i.e. Fedora 3's version of udev) to boot properly. Four
years later, in 2010, the option was attempted to be removed as most of
userspace should have been fixed up properly by then, but some kernel
developers clung to those old systems and refused to update, so we added
CONFIG_SYSFS_DEPRECATED_V2 in commit e52eec13cd ("SYSFS: Allow boot
time switching between deprecated and modern sysfs layout") to allow
them to continue to boot properly, and we allowed a boot time parameter
to be used to switch back to the old format if needed.
Over time, the logic that was covered under these config options was
slowly removed from individual driver subsystems successfully, removed,
and the only thing that is now left in the kernel are some changes in
the block layer's representation in sysfs where real directories are
used instead of symlinks like normal.
Because the original changes were done to userspace tools in 2006, and
all distros that use those tools are long end-of-life, and older
non-udev-based systems do not care about the block layer's sysfs
representation, it is time to finally remove this old logic and the
config entries from the kernel.
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: linux-block@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Acked-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/20230223073326.2073220-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices lack status registers, yet expect to handle interrupts.
Introduce a no_status flag to indicate such a configuration, where
rather than read a status register to verify, all interrupts received
are assumed to be active.
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/bd501b4b5ff88da24d467f75e8c71b4e0e6f21e2.1677515341.git.william.gray@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Some SoundWire devices have larger width device specific register
maps, in addition to the standard SoundWire 8-bit map. Update the
helpers to allow accessing arbitrarily sized register values and remove
the explicit 8-bit restriction from regmap_sdw_config_check.
Signed-off-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230112171840.2098463-3-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
In the regmap config reg_bits represents the number of address bits not
the number of value bits. Correct the misleading comment which looks a
lot like it suggests the register value itself is 32-bits wide.
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20230112171840.2098463-2-ckeepax@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
- Prevent possible NULL pointer derefences in irq_data_get_affinity_mask()
and irq_domain_create_hierarchy().
- Take the per device MSI lock before invoking code which relies
on it being hold.
- Make sure that MSI descriptors are unreferenced before freeing
them. This was overlooked when the platform MSI code was converted to
use core infrastructure and results in a fals positive warning.
- Remove dead code in the MSI subsystem.
- Clarify the documentation for pci_msix_free_irq().
- More kobj_type constification.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmQEVToTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYodc9EACg5HBOGsh5OV8pwnuEDqfThK/dOZ5k
LJ8xQjGAx29JeNWu4gkkHaSDEGjZhwLiZlB6qUeH+LPQ1NgmAlLL3T2NxEOOWa6y
z/xQv+1Ceu6XxazpCSFRWR/6w4Nyup92jhlsUIkmmsWkVvKH/pV6Uo+3ta0WagWg
heb3vqts6J0AOJaMepF8azYGbwAPSIElNLI1UtiEuQYEKU55N8jLK20VJTL6lzJ2
FyRg/0ghNWDAaBdnv4cZCQ/MzoG5UkoU3f2cqhdSce5mqnq2fKRfgBjzllNgaRgA
zxOxIR88QaKTMHIr+WKD1dyWxDQlotFbBOkmVW39XAa13rn42s4GIeW88VCjJGww
RAm52SbC48cCIyNQlh4A6Vhb4vjPx2DndWbWnnVj5fWlUevdAPRxSlm4BjfxFxe+
LbuZCRRL1jjlC0fXmhVXTTxeE1/K7jarAZwRV7Nxhr3g0gT+Zv1jyaaW9rWuHq5U
3pS+xBl89LA/VYp9tv6jDfJlocmRwgrFbGX4UlfikqtObdTFqcH0FtmqisE61fZS
n0194BMWNDfPSibSpDohf/CDPoHZ6pNxeuqkVDiisUJHPpIYOt8+lH+8//DgBL7a
oi9zS0JazPIn2VM6NB4f/WXOYmS9GZq5+loiYEWb52AYtodKUKmOoWG0SUyy6XFr
E7yJzemsUwrJVg==
=jWot
-----END PGP SIGNATURE-----
Merge tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"A set of updates for the interrupt susbsystem:
- Prevent possible NULL pointer derefences in
irq_data_get_affinity_mask() and irq_domain_create_hierarchy()
- Take the per device MSI lock before invoking code which relies on
it being hold
- Make sure that MSI descriptors are unreferenced before freeing
them. This was overlooked when the platform MSI code was converted
to use core infrastructure and results in a fals positive warning
- Remove dead code in the MSI subsystem
- Clarify the documentation for pci_msix_free_irq()
- More kobj_type constification"
* tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq/msi, platform-msi: Ensure that MSI descriptors are unreferenced
genirq/msi: Drop dead domain name assignment
irqdomain: Add missing NULL pointer check in irq_domain_create_hierarchy()
genirq/irqdesc: Make kobj_type structures constant
PCI/MSI: Clarify usage of pci_msix_free_irq()
genirq/msi: Take the per-device MSI lock before validating the control structure
genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()
Here is another small set of driver core patches for 6.3-rc1
They resolve some reported problems with the previous driver core
patches that are in your tree.
They solve a problem with the bus_type cleanup as reported and fixced by
Geert, and 2 fw_devlink changes to make debugging problems easier.
There is one known outstanding problem with the fw_deflink changes in
your tree that is still being worked on, and it looks like a clk core
change will be submitted soon for that, probably after 6.3-rc1.
All 3 of these have been in linux-next with no reported problems (only
reports that they fixed problems.)
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZAB4LQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymOqgCfWIGanNU4fg2DOB4eyJu2JBQmq00AoJeHmJhR
/pAjBOMYYJYVCqQCR6ik
=1UhM
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.3-rc1_2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here is another small set of driver core patches.
They resolve some reported problems with the previous driver core
patches that are in your tree.
They solve a problem with the bus_type cleanup as reported and fixed
by Geert, and two fw_devlink changes to make debugging problems
easier.
There is one known outstanding problem with the fw_deflink changes in
your tree that is still being worked on, and it looks like a clk core
change will be submitted soon for that, probably after 6.3-rc1.
All three of these have been in linux-next with no reported problems
(only reports that they fixed problems)"
* tag 'driver-core-6.3-rc1_2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: fw_devlink: Print full path and name of fwnode
driver core: fw_devlink: Avoid spurious error message
driver core: bus: Handle early calls to bus_to_subsys()
Miquel reported a warning in the MSI core which is triggered when
interrupts are freed via platform_msi_device_domain_free().
This code got reworked to use core functions for freeing the MSI
descriptors, but nothing took care to clear the msi_desc->irq entry, which
then triggers the warning in msi_free_msi_desc() which uses desc->irq to
validate that the descriptor has been torn down. The same issue exists in
msi_domain_populate_irqs().
Up to the point that msi_free_msi_descs() grew a warning for this case,
this went un-noticed.
Provide the counterpart of msi_domain_populate_irqs() and invoke it in
platform_msi_device_domain_free() before freeing the interrupts and MSI
descriptors and also in the error path of msi_domain_populate_irqs().
Fixes: 2f2940d168 ("genirq/msi: Remove filter from msi_free_descs_free_range()")
Reported-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87mt4wkwnv.ffs@tglx
case with the CLK_OPS_PARENT_ENABLE flag combined with clk_core_is_enabled()
where it hangs the system. We'll simply assume the clk is disabled if the
parent is disabled and the flag is set. Trying to turn on the parent to check
the enable state of the clk runs into system hangs at boot. We let this bake in
-next for a couple weeks to make sure there aren't any more issues because the
last attempt to fix this ran into hangs and had to be reverted.
Note: There were some more patches to the core framework around sync_state and
disabling unused clks, but I asked for that to be reverted from the qcom PR
because it isn't ready and we're still discussing the best solution on the
list.
Outside of the core clk framework, we have the usual collection of clk driver
updates and support for new SoCs (which seems to never stop). The dirstat is
dominated by Qualcomm because they added support for quite a few SoCs this time
around and also migrated quite a few of their drivers to clk_parent_data. The
other big diff is in the Mediatek clk drivers that saw a significant rework
this cycle to similarly modernize the code, and we'll see that work continue in
the next cycle as well. Nothing really jumps out as scary here, except that the
significant churn in parent data descriptions can have typos that go unnoticed.
More details below.
Core:
- Honor CLK_OPS_PARENT_ENABLE in clk_core_is_enabled()
New Drivers:
- Add a new clk-gpr-mux clock type and use it on i.MX6Q to add ENET ref
clocks
- Support for Mediatek MT7891 SoC clks
- Support for many Qualcomm clk controllers:
- QDU1000/QRU1000 global clock controller
- SA8775P global clock controller
- SM8550 TCSR and display clock controller
- SM6350 clock controller
- MSM8996 CBF and APCS clock controllers
Updates:
- Various cleanups and improvements to Mediatek clk drivers to reduce
code size and modernize the drivers
- Support for Versa 5P49V60 clks
- Disable R-Car H3 ES1.*, as it was only available to an internal
development group and needed a lot of quirks and workarounds
- Add PWM, Compare-Match Timer (TIM), USB, SDHI, and eMMC clocks and
resets on Renesas RZ/V2M
- Add display clocks on Renesas R-Car V4H
- Add Camera Receiving Unit (CRU) clocks and resets on Renesas RZ/G2L
- Free the imx_uart_clocks even if imx_register_uart_clocks returns early
- Get the stdout clocks count from device tree on i.MX
- Drop the clock count argument from imx_register_uart_clocks()
- Keep the uart clocks on i.MX93 for when earlycon is used
- Fix SPDX comment in i.MX6SLL clocks bindings header
- Drop some unnecessary spaces from i.MX8ULP clocks bindings header
- Add imx_obtain_fixed_of_clock() for allowing to add a clock that is
not configured via devicetree
- Fix the ENET1 gate configuration for i.MX6UL according to the
reference manual
- Add ENET refclock mux support for i.MX6UL
- Add support for USB host/device configuration on Renesas RZ/N1
- Add PLL2 programming support, and CAN-FD clocks on Renesas R-Car V4H
- Add D1 CAN bus gates and resets for Allwinner
- Mark D1 CPUX clock as critical on Allwinner
- Reuse D1 driver for Allwinner R528/T113
- Cleanup sunxi-ng Kconfig
- Fix sunxi-ng kernel-doc issues
- Model Allwinner H3/H5 DRAM clock as fixed clock
- Use .determine_rate() instead of .round_rate() for the dualdiv, mpll,
sclk-div and cpu-dyn-div amlogic clock drivers
- DDR clocks were marked as critical in the proper clock driver for each
AT91 SoC such that drivers/memory/atmel-sdramc.c to be deleted
in the next releases as it only does clock enablement
- Patch to avoid compiling dt-compat.o for all AT91 SoCs as only some of
them may use it
- Support synchronous power_off requests in the qcom GDSC driver for proper
GPU power collapse
- Drop test clocks from various Qualcomm clk drivers
- Update parent references to use clk_parent_data/clk_hw in various Qualcomm clk drivers
- Fixes for the Qualcomm MSM8996 CPU clock controller
- Transition Qualcomm MSM8974 GCC off the externally defined sleep_clk
- Add GDSCs in the global clock controller for Qualcomm QCS404
- The SDCC core clocks on Qualcomm SM6115 are moved to floor_ops
- Programming of clk_dis_wait for GPU CX GDSC on Qualcomm SC7180 and SDM845 are
moved to use the recently introduced properties in the GDSC struct
- Qualcomm's RPMh clock driver gains SM8550 and SA8775P clocks, and the IPA clock
is added on a variety of platforms
- De-duplicate identical clks in Qualcomm SMD RPM clk driver
- Add a few missing clocks across msm8998, msm8992, msm8916, qcs404 to
Qualcomm SDM RPM clk driver
- Various Qualcomm clk drivers use devm_pm_runtime_enable() to simplify
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmP5L68RHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSXLxRAAx5C2PBxGnQS5Dqy7yGFBKhoyM6MnD131
4wsWunTzw/fx3MWTRUBu1FYq7ZN38dmeUNeKVNO7QkfIzXe/Htxa5DJp8oJjeFkA
WBlJr/S9pnCKJ1+jGgnEJ3AL8rtssc2nasS8Gj66eu3Zs3dA1MlUz1M0wqiGeD/Y
2crg1nowHurxhsmdUM+6sBRZsCoUz1DxAynqOK25Ip08ygBGYRdkk3aCoyx1bICF
02RwSTQP9pGykgkO7BMkr1pA000mlcawXflzfbY0bA57GKvITBaXh3PhVWwsAnqk
utagT3G2/mQNBF+DVX4Xr5rRqYttDeATiS2D0B31x68Ovjw6kaA28QoY19oIjc1p
D7CabcPnrdK6JFimJL/uEnjIpVnMW2kbTAkTdgGFNKqYUC1O+Mm5ZscKo8RUaiQ7
8XAgJXnGxG7RlZDIMCj69xiZBR0I0wgyx6E7sqMK5j4/v9ezhUMemj4u/sNe3R1n
ih43L2vXjftAhl7jlGQb6eFQjPU/n8kf1rte5miJxX2vFBgWqiCfKHnVSe8KSNU+
gqhar55G9ACnYBjqApmySd/7IFBzmJSyWujXg+fwIu0ZI5ir+ZPr+rQ7RkAUqMij
QCPvJa6XlRIyl6j54E5GaSFO3ZDc3wAZEs3I2N4yhYTDUGv342G3YdwNU+b0ioHB
bDW5Gec9u2s=
=Fle/
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"We have one small patch to the clk core this time around. It fixes a
corner case with the CLK_OPS_PARENT_ENABLE flag combined with
clk_core_is_enabled() where it hangs the system. We'll simply assume
the clk is disabled if the parent is disabled and the flag is set.
Trying to turn on the parent to check the enable state of the clk runs
into system hangs at boot. We let this bake in -next for a couple
weeks to make sure there aren't any more issues because the last
attempt to fix this ran into hangs and had to be reverted.
Note: There were some more patches to the core framework around
sync_state and disabling unused clks, but I asked for that to be
reverted from the qcom PR because it isn't ready and we're still
discussing the best solution on the list.
Outside of the core clk framework, we have the usual collection of clk
driver updates and support for new SoCs (which seems to never stop).
The dirstat is dominated by Qualcomm because they added support for
quite a few SoCs this time around and also migrated quite a few of
their drivers to clk_parent_data. The other big diff is in the
Mediatek clk drivers that saw a significant rework this cycle to
similarly modernize the code, and we'll see that work continue in the
next cycle as well. Nothing really jumps out as scary here, except
that the significant churn in parent data descriptions can have typos
that go unnoticed. More details below.
Core:
- Honor CLK_OPS_PARENT_ENABLE in clk_core_is_enabled()
New Drivers:
- Add a new clk-gpr-mux clock type and use it on i.MX6Q to add ENET
ref clocks
- Support for Mediatek MT7891 SoC clks
- Support for many Qualcomm clk controllers:
- QDU1000/QRU1000 global clock controller
- SA8775P global clock controller
- SM8550 TCSR and display clock controller
- SM6350 clock controller
- MSM8996 CBF and APCS clock controllers
Updates:
- Various cleanups and improvements to Mediatek clk drivers to reduce
code size and modernize the drivers
- Support for Versa 5P49V60 clks
- Disable R-Car H3 ES1.*, as it was only available to an internal
development group and needed a lot of quirks and workarounds
- Add PWM, Compare-Match Timer (TIM), USB, SDHI, and eMMC clocks and
resets on Renesas RZ/V2M
- Add display clocks on Renesas R-Car V4H
- Add Camera Receiving Unit (CRU) clocks and resets on Renesas RZ/G2L
- Free the imx_uart_clocks even if imx_register_uart_clocks returns
early
- Get the stdout clocks count from device tree on i.MX
- Drop the clock count argument from imx_register_uart_clocks()
- Keep the uart clocks on i.MX93 for when earlycon is used
- Fix SPDX comment in i.MX6SLL clocks bindings header
- Drop some unnecessary spaces from i.MX8ULP clocks bindings header
- Add imx_obtain_fixed_of_clock() for allowing to add a clock that is
not configured via devicetree
- Fix the ENET1 gate configuration for i.MX6UL according to the
reference manual
- Add ENET refclock mux support for i.MX6UL
- Add support for USB host/device configuration on Renesas RZ/N1
- Add PLL2 programming support, and CAN-FD clocks on Renesas R-Car
V4H
- Add D1 CAN bus gates and resets for Allwinner
- Mark D1 CPUX clock as critical on Allwinner
- Reuse D1 driver for Allwinner R528/T113
- Cleanup sunxi-ng Kconfig
- Fix sunxi-ng kernel-doc issues
- Model Allwinner H3/H5 DRAM clock as fixed clock
- Use .determine_rate() instead of .round_rate() for the dualdiv,
mpll, sclk-div and cpu-dyn-div amlogic clock drivers
- DDR clocks were marked as critical in the proper clock driver for
each AT91 SoC such that drivers/memory/atmel-sdramc.c to be deleted
in the next releases as it only does clock enablement
- Patch to avoid compiling dt-compat.o for all AT91 SoCs as only some
of them may use it
- Support synchronous power_off requests in the qcom GDSC driver for
proper GPU power collapse
- Drop test clocks from various Qualcomm clk drivers
- Update parent references to use clk_parent_data/clk_hw in various
Qualcomm clk drivers
- Fixes for the Qualcomm MSM8996 CPU clock controller
- Transition Qualcomm MSM8974 GCC off the externally defined
sleep_clk
- Add GDSCs in the global clock controller for Qualcomm QCS404
- The SDCC core clocks on Qualcomm SM6115 are moved to floor_ops
- Programming of clk_dis_wait for GPU CX GDSC on Qualcomm SC7180 and
SDM845 are moved to use the recently introduced properties in the
GDSC struct
- Qualcomm's RPMh clock driver gains SM8550 and SA8775P clocks, and
the IPA clock is added on a variety of platforms
- De-duplicate identical clks in Qualcomm SMD RPM clk driver
- Add a few missing clocks across msm8998, msm8992, msm8916, qcs404
to Qualcomm SDM RPM clk driver
- Various Qualcomm clk drivers use devm_pm_runtime_enable() to
simplify"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (228 commits)
clk: qcom: apcs-msm8986: Include bitfield.h for FIELD_PREP
clk: qcom: Revert sync_state based clk_disable_unused
clk: imx: pll14xx: fix recalc_rate for negative kdiv
clk: rs9: Drop unused pin_xin field
MAINTAINERS: clk: imx: Add Peng Fan as reviewer
clk: sprd: Add dependency for SPRD_UMS512_CLK
clk: ralink: fix 'mt7621_gate_is_enabled()' function
clk: mediatek: clk-mtk: Remove unneeded semicolon
dt-bindings: clock: remove stih416 bindings
dt-bindings: clock: add loongson-2 clock
dt-bindings: clock: add loongson-2 clock include file
clk: imx: fix compile testing imxrt1050
clk: Honor CLK_OPS_PARENT_ENABLE in clk_core_is_enabled()
clk: imx: set imx_clk_gpr_mux_ops storage-class-specifier to static
clk: renesas: rcar-gen3: Disable R-Car H3 ES1.*
dt-bindings: clock: Merge qcom,gpucc-sm8350 into qcom,gpucc.yaml
clk: qcom: gpucc-sdm845: fix clk_dis_wait being programmed for CX GDSC
clk: qcom: gpucc-sc7180: fix clk_dis_wait being programmed for CX GDSC
dt-bindings: clock: qcom,sa8775p-gcc: add the power-domains property
clk: qcom: cpu-8996: add missing cputype include
...
Some of the log messages were printing just the fwnode name. While it's
short, it's not always uniquely identifiable in system. So print the
full path and name to make debugging easier.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230225065443.278284-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink can sometimes try to create a device link with the consumer
and supplier as the same device. These attempts will fail (correctly),
but are harmless. So, avoid printing an error for these cases. Also, add
more detail to the error message.
Fixes: 3fb16866b5 ("driver core: fw_devlink: Make cycle detection more robust")
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20230225064148.274376-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling soc_device_match() from early_initcall(), bus_kset is still
NULL, causing a crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
...
Call trace:
__lock_acquire+0x530/0x20f0
lock_acquire.part.0+0xc8/0x210
lock_acquire+0x64/0x80
_raw_spin_lock+0x4c/0x60
bus_to_subsys+0x24/0xac
bus_for_each_dev+0x30/0xcc
soc_device_match+0x4c/0xe0
r8a7795_sysc_init+0x18/0x60
rcar_sysc_pd_init+0xb0/0x33c
do_one_initcall+0x128/0x2bc
Before, bus_for_each_dev() handled this gracefully by checking that
the back-pointer to the private structure was valid.
Fix this by adding a NULL check for bus_kset to bus_to_subsys().
Fixes: 83b9148df2 ("driver core: bus: bus iterator cleanups")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/0a92979f6e790737544638e8a4c19b0564e660a2.1676983596.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work falls
into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be moved
into read-only memory (i.e. const) The recent work with Rust has
pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only making
things safer overall. This is the contuation of that work (started
last release with kobject changes) in moving struct bus_type to be
constant. We didn't quite make it for this release, but the
remaining patches will be finished up for the release after this
one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6
6oeFOjD3JDju3cQsfGgd
=Su6W
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work
falls into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be
moved into read-only memory (i.e. const) The recent work with Rust
has pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only
making things safer overall. This is the contuation of that work
(started last release with kobject changes) in moving struct
bus_type to be constant. We didn't quite make it for this release,
but the remaining patches will be finished up for the release after
this one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems"
[ Geert Uytterhoeven points out that that last sentence isn't true, and
that there's a pending report that has a fix that is queued up - Linus ]
* tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits)
debugfs: drop inline constant formatting for ERR_PTR(-ERROR)
OPP: fix error checking in opp_migrate_dentry()
debugfs: update comment of debugfs_rename()
i3c: fix device.h kernel-doc warnings
dma-mapping: no need to pass a bus_type into get_arch_dma_ops()
driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place
Revert "driver core: add error handling for devtmpfs_create_node()"
Revert "devtmpfs: add debug info to handle()"
Revert "devtmpfs: remove return value of devtmpfs_delete_node()"
driver core: cpu: don't hand-override the uevent bus_type callback.
devtmpfs: remove return value of devtmpfs_delete_node()
devtmpfs: add debug info to handle()
driver core: add error handling for devtmpfs_create_node()
driver core: bus: update my copyright notice
driver core: bus: add bus_get_dev_root() function
driver core: bus: constify bus_unregister()
driver core: bus: constify some internal functions
driver core: bus: constify bus_get_kset()
driver core: bus: constify bus_register/unregister_notifier()
driver core: remove private pointer from struct bus_type
...
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter". These filters provide users
with finer-grained control over DAMOS's actions. SeongJae has also done
some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series "mm:
support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with his
series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings. The previous BPF-based approach had
shortcomings. See "mm: In-kernel support for memory-deny-write-execute
(MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a per-node
basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage during
compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in ths
series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's series
"mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
"fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest of
the kernel in the series "Simplify the external interface for GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the series
"mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA
jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K
DmxHkn0LAitGgJRS/W9w81yrgig9tAQ=
=MlGs
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X
bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()")
which does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter".
These filters provide users with finer-grained control over DAMOS's
actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series
"mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with
his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings.
The previous BPF-based approach had shortcomings. See "mm: In-kernel
support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a
per-node basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage
during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in
ths series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier
functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's
series "mm, arch: add generic implementation of pfn_valid() for
FLATMEM" and "fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest
of the kernel in the series "Simplify the external interface for
GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the
series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
include/linux/migrate.h: remove unneeded externs
mm/memory_hotplug: cleanup return value handing in do_migrate_range()
mm/uffd: fix comment in handling pte markers
mm: change to return bool for isolate_movable_page()
mm: hugetlb: change to return bool for isolate_hugetlb()
mm: change to return bool for isolate_lru_page()
mm: change to return bool for folio_isolate_lru()
objtool: add UACCESS exceptions for __tsan_volatile_read/write
kmsan: disable ftrace in kmsan core code
kasan: mark addr_has_metadata __always_inline
mm: memcontrol: rename memcg_kmem_enabled()
sh: initialize max_mapnr
m68k/nommu: add missing definition of ARCH_PFN_OFFSET
mm: percpu: fix incorrect size in pcpu_obj_full_size()
maple_tree: reduce stack usage with gcc-9 and earlier
mm: page_alloc: call panic() when memoryless node allocation fails
mm: multi-gen LRU: avoid futile retries
migrate_pages: move THP/hugetlb migration support check to simplify code
migrate_pages: batch flushing TLB
migrate_pages: share more code between _unmap and _move
...
A quiet release for regmap, we've seen several cleanups, an update for a
change in the MDIO APIs and one small fix.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmPzatMACgkQJNaLcl1U
h9BJjwf/Z8lGWykBdFhQw2vvauWMjm4cBCpLONK70mXOVouApepVW8xQAXT8sQ86
ADyeSGJmB1P+b/TyO4q7gqF0qBKE5T9gOhjhmp++robCEHXzX3/h2YtobTWpFdzO
34UOAi0PBIGhnEwgK9HWb/rYMS3DBy6oKd6IxVK1TX6LXxjW3OoTuxd+lNjDE7nM
eGMnyMF1TcxAsUdlTMiyUgt0e/xxir4UOvmzYz7rkbVQf6AyrmOUyMt+iTuW7jQ6
DUUwpAfVilEN5K6Drc+4Lv+ydCw4PATXLdYX3cBQygeptvQ6PCxgulXvT7OcQMph
rCirnu9gcYcKLFcFI5LhxQOZ7VAcBQ==
=tjm7
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A quiet release for regmap: we've seen several cleanups, an update for
a change in the MDIO APIs and one small fix"
* tag 'regmap-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Remove unused mask_invert flag
regmap-irq: Remove unused type_invert flag
regmap: Reorder fields in 'struct regmap_bus' to save some memory
regmap: apply reg_base and reg_downshift for single register ops
Core
----
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used
to describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols
---------
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP
path manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF
---
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key
to better support decap on GRE tunnel devices not operating
in collect metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk
and bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols
by livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter
---------
- Remove the CLUSTERIP target. It has been marked as obsolete
for years, and we still have WARN splats wrt. races of
the out-of-band /proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to
the existing 'delete' commands, but do not return an error if
the referenced object (set, chain, rule...) did not exist.
Driver API
----------
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into multiple
files, drop some of the unnecessarily granular locks and factor out
common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless Extensions
for Wi-Fi 7 devices at all. Everyone should switch to using nl80211
interface instead.
- Improve the CAN bit timing configuration. Use extack to return error
messages directly to user space, update the SJW handling, including
the definition of a new default value that will benefit CAN-FD
controllers, by increasing their oscillator tolerance.
New hardware / drivers
----------------------
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers
-------
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- enetc: support XDP_REDIRECT for XDP non-linear buffers
- enetc: improve reconfig, avoid link flap and waiting for idle
- enetc: support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmP1VIYACgkQMUZtbf5S
IrvsChAApz0rNL/sPKxXTEfxZ1tN7D3sYxYKQPomxvl5BV+MvicrLddJy3KmzEFK
nnJNO3nuRNuH422JQ/ylZ4mGX1opa6+5QJb0UINImXUI7Fm8HHBIuPGkv7d5CheZ
7JexFqjPJXUy9nPyh1Rra+IA9AcRd2U7jeGEZR38wb99bHJQj5Bzdk20WArEB0el
n44aqg49LXH71bSeXRz77x5SjkwVtYiccQxLcnmTbjLU2xVraLvI2J+wAhHnVXWW
9lrU1+V4Ex2Xcd1xR0L0cHeK+meP1TrPRAeF+JDpVI3a/zJiE7cZjfHdG/jH5xWl
leZJqghVozrZQNtewWWO7XhUFhMDgFu3W/1vNLjSHPZEqaz1JpM67J1+ql6s63l4
LMWoXbcYZz+SL9ZRCoPkbGue/5fKSHv8/Jl9Sh58+eTS+c/zgN8uFGRNFXLX1+EP
n8uvt985PxMd6x1+dHumhOUzxnY4Sfi1vjitSunTsNFQ3Cmp4SO0IfBVJWfLUCuC
xz5hbJGJJbSpvUsO+HWyCg83E5OWghRE/Onpt2jsQSZCrO9HDg4FRTEf3WAMgaqc
edb5KfbRZPTJQM08gWdluXzSk1nw3FNP2tXW4XlgUrEbjb+fOk0V9dQg2gyYTxQ1
Nhvn8ZQPi6/GMMELHAIPGmmW1allyOGiAzGlQsv8EmL+OFM6WDI=
=xXhC
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
- Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
Karny, Arnd Bergmann, Bagas Sanjaya).
- Drop the custom cpufreq driver for loongson1 that is not necessary
any more and the corresponding cpufreq platform device (Keguang
Zhang).
- Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
entries (Paul E. McKenney).
- Enable thermal cooling for Tegra194 (Yi-Wei Wang).
- Register module device table and add missing compatibles for
cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss).
- Various dt binding updates for qcom-cpufreq-nvmem and opp-v2-kryo-cpu
(Christian Marangi).
- Make kobj_type structure in the cpufreq core constant (Thomas
Weißschuh).
- Make cpufreq_unregister_driver() return void (Uwe Kleine-König).
- Make the TEO cpuidle governor check CPU utilization in order to refine
idle state selection (Kajetan Puchalski).
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in that
driver with arch_cpu_idle() to allow MWAIT to be used (Li RongQing).
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann).
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh).
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki).
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski).
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald).
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald).
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap).
- Fix possible name leak in powercap_register_zone() (Yang Yingliang).
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui).
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada).
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui).
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman).
- Add missing 'cache-unified' property in the example for kryo OPP
bindings (Rob Herring).
- Fix error checking in opp_migrate_dentry() (Qi Zheng).
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio).
- Modify some power management utilities to use the canonical ftrace
path (Ross Zwisler).
- Correct spelling problems for Documentation/power/ as reported by
codespell (Randy Dunlap).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmPuJfMSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx/5kQAJNOVImLEPLerLP8xufw30//LuDU5Gi0
STsyDOMql/I2MpkeqeCcgrSbpy6NlEglOvg16gfpQ3qqTCLF9ypENxs9E5BGGvW0
aEdCzvaoqmvi9PCr/jmj0EPP70/U+rIX5m/k0QdjLh9x0aLoAEe3uRJTfR9QVqXf
I7JX0N9kjKi7YxpA5DlkHrS7J7GPPiWlesJ3p4wXuHMo3jf+6fgkoPFt8yRrGWeh
AHzGT2BLrsy7aAUjGZB65Qx9q3fnSXMmXOjmn0Xh2njQah+zRZDwrNzwoY2HTLL/
KQ6/Ww16USYRZtCS1fmGwAj9I+ddq6AOvhPCMn0vLXXmKVAMUrVVWnQS/0+vpm9y
suUMK9Tndkgxd1vjby2246ThJn27uDd/ERFan4ouQo2j22uICY+SDo3osj2hMXka
wq4zthXkY8KgjZ+MuXnZxPhcOvo8KRvfxAU0fy5efQnSkbtwY9UlMvjPBMBHm/RA
21/6kjQNtq5vMmI37oC8DH+oPrRQ7sUKuY7HNqwO9P3QNKWVmNe7cF5UtXXxME7Q
ULvP1d+u+TNNdHFLryPwCSzBO34wQEccdRZBjalZ8tBe6JiDWUFHC3giSURZSuzZ
GDvzVaNX6PkgToyv4inBTB8lTp6pAuUjaWNvNJzVvUXiEKHB0ihzg5vpJW5NdwlH
15Tn8cjH7pp0
=lZLx
-----END PGP SIGNATURE-----
Merge tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add EPP support to the AMD P-state cpufreq driver, add support
for new platforms to the Intel RAPL power capping driver, intel_idle
and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194,
drop the custom cpufreq driver for loongson1 that is not necessary any
more (and the corresponding cpufreq platform device), fix assorted
issues and clean up code.
Specifics:
- Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
Karny, Arnd Bergmann, Bagas Sanjaya)
- Drop the custom cpufreq driver for loongson1 that is not necessary
any more and the corresponding cpufreq platform device (Keguang
Zhang)
- Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
entries (Paul E. McKenney)
- Enable thermal cooling for Tegra194 (Yi-Wei Wang)
- Register module device table and add missing compatibles for
cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss)
- Various dt binding updates for qcom-cpufreq-nvmem and
opp-v2-kryo-cpu (Christian Marangi)
- Make kobj_type structure in the cpufreq core constant (Thomas
Weißschuh)
- Make cpufreq_unregister_driver() return void (Uwe Kleine-König)
- Make the TEO cpuidle governor check CPU utilization in order to
refine idle state selection (Kajetan Puchalski)
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in
that driver with arch_cpu_idle() to allow MWAIT to be used (Li
RongQing)
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy)
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann)
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh)
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki)
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski)
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald)
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald)
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap)
- Fix possible name leak in powercap_register_zone() (Yang Yingliang)
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui)
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada)
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui)
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman)
- Add missing 'cache-unified' property in the example for kryo OPP
bindings (Rob Herring)
- Fix error checking in opp_migrate_dentry() (Qi Zheng)
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio)
- Modify some power management utilities to use the canonical ftrace
path (Ross Zwisler)
- Correct spelling problems for Documentation/power/ as reported by
codespell (Randy Dunlap)"
* tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
Documentation: amd-pstate: disambiguate user space sections
cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum
dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional
dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables
PM: Add EXPORT macros for exporting PM functions
cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
MIPS: loongson32: Drop obsolete cpufreq platform device
powercap: intel_rapl: Fix handling for large time window
cpuidle: driver: Update microsecond values of state parameters as needed
cpuidle: sysfs: make kobj_type structures constant
cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
PM: EM: fix memory leak with using debugfs_lookup()
PM: domains: fix memory leak with using debugfs_lookup()
cpufreq: Make kobj_type structure constant
cpufreq: davinci: Fix clk use after free
cpufreq: amd-pstate: avoid uninitialized variable use
cpufreq: Make cpufreq_unregister_driver() return void
OPP: fix error checking in opp_migrate_dentry()
dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible
...
This pull request contains the following branches:
doc.2023.01.05a: Documentation updates.
fixes.2023.01.23a: Miscellaneous fixes, perhaps most notably:
o Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number
of callbacks.
o Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized.
o Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have doen this for mnay years.)
o Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled,
so this should not (yet) affect production use cases.)
kvfree.2023.01.03a: Cause kfree_rcu() and friends to take advantage of
polled grace periods, thus reducing memory footprint by almost
two orders of magnitude, admittedly on a microbenchmark.
This series also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs
where kfree_rcu(p), which can block, was typed instead of the
intended kfree_rcu(p, rh).
srcu.2023.01.03a: SRCU updates, perhaps most notably fixing a bug that
causes SRCU to fail when booted on a system with a non-zero boot
CPU. This surprising situation actually happens for kdump kernels
on the powerpc architecture. It also adds an srcu_down_read()
and srcu_up_read(), which act like srcu_read_lock() and
srcu_read_unlock(), but allow an SRCU read-side critical section
to be handed off from one task to another.
srcu-always.2023.02.02a: Cleans up the now-useless SRCU Kconfig option.
There are a few more commits that are not yet acked or pulled
into maintainer trees, and these will be in a pull request for
a later merge window.
tasks.2023.01.03a: RCU-tasks updates, perhaps most notably these fixes:
o A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang.
o A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can result
in a too-short grace period.
o A race between shrinking RCU tasks down to a single callback list
and queuing a new callback to some other CPU, but where that
queuing is delayed for more than an RCU grace period. This can
result in that callback being stranded on the non-boot CPU.
torture.2023.01.05a: Torture-test updates and fixes.
torturescript.2023.01.03a: Torture-test scripting updates and fixes.
stall.2023.01.09a: Provide additional RCU CPU stall-warning information
in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and
restore the full five-minute timeout limit for expedited RCU
CPU stall warnings.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmPq29UTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jAhVEACEAKJY1VJ9IUqz7CwzAYkzgRJfiygh
oDUXmlqtm6ew9pr2GdLUVCVsUSldzBc0K7Djb/G1niv4JPs+v7YwupIV33+UbStU
Qxt6ztTdxc4lKospLm1+2vF9ZdzVEmiP4wVCc4iDarv5FM3FpWSTNc8+L7qmlC+X
myjv+GqMTxkXZBvYJOgJGFjDwN8noTd7Fr3mCCVLFm3PXMDa7tcwD6HRP5AqD2N8
qC5M6LEqepKVGmz0mYMLlSN1GPaqIsEcexIFEazRsPEivPh/iafyQCQ/cqxwhXmV
vEt7u+dXGZT/oiDq9cJ+/XRDS2RyKIS6dUE14TiiHolDCn1ONESahfA/gXWKykC2
BaGPfjWXrWv/hwbeZ+8xEdkAvTIV92tGpXir9Fby1Z5PjP3balvrnn6hs5AnQBJb
NdhRPLzy/dCnEF+CweAYYm1qvTo8cd5nyiNwBZHn7rEAIu3Axrecag1rhFl3AJ07
cpVMQXZtkQVa2X8aIRTUC+ijX6yIqNaHlu0HqNXgIUTDzL4nv5cMjOMzpNQP9/dZ
FwAMZYNiOk9IlMiKJ8ZiVcxeiA8ouIBlkYM3k6vGrmiONZ7a/EV/mSHoJqI8bvqr
AxUIJ2Ayhg3bxPboL5oKgCiLql0A7ZVvz6quX6McitWGMgaSvel1fDzT3TnZd41e
4AFBFd/+VedUGg==
=bBYK
-----END PGP SIGNATURE-----
Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Documentation updates
- Miscellaneous fixes, perhaps most notably:
- Throttling callback invocation based on the number of callbacks
that are now ready to invoke instead of on the total number of
callbacks
- Several patches that suppress false-positive boot-time
diagnostics, for example, due to lockdep not yet being
initialized
- Make expedited RCU CPU stall warnings dump stacks of any tasks
that are blocking the stalled grace period. (Normal RCU CPU
stall warnings have done this for many years)
- Lazy-callback fixes to avoid delays during boot, suspend, and
resume. (Note that lazy callbacks must be explicitly enabled, so
this should not (yet) affect production use cases)
- Make kfree_rcu() and friends take advantage of polled grace periods,
thus reducing memory footprint by almost two orders of magnitude,
admittedly on a microbenchmark
This also begins the transition from kfree_rcu(p) to
kfree_rcu_mightsleep(p). This transition was motivated by bugs where
kfree_rcu(p), which can block, was typed instead of the intended
kfree_rcu(p, rh)
- SRCU updates, perhaps most notably fixing a bug that causes SRCU to
fail when booted on a system with a non-zero boot CPU. This
surprising situation actually happens for kdump kernels on the
powerpc architecture
This also adds an srcu_down_read() and srcu_up_read(), which act like
srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side
critical section to be handed off from one task to another
- Clean up the now-useless SRCU Kconfig option
There are a few more commits that are not yet acked or pulled into
maintainer trees, and these will be in a pull request for a later
merge window
- RCU-tasks updates, perhaps most notably these fixes:
- A strange interaction between PID-namespace unshare and the
RCU-tasks grace period that results in a low-probability but
very real hang
- A race between an RCU tasks rude grace period on a single-CPU
system and CPU-hotplug addition of the second CPU that can
result in a too-short grace period
- A race between shrinking RCU tasks down to a single callback
list and queuing a new callback to some other CPU, but where
that queuing is delayed for more than an RCU grace period. This
can result in that callback being stranded on the non-boot CPU
- Torture-test updates and fixes
- Torture-test scripting updates and fixes
- Provide additional RCU CPU stall-warning information in kernels built
with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute
timeout limit for expedited RCU CPU stall warnings
* tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep()
kernel/notifier: Remove CONFIG_SRCU
init: Remove "select SRCU"
fs/quota: Remove "select SRCU"
fs/notify: Remove "select SRCU"
fs/btrfs: Remove "select SRCU"
fs: Remove CONFIG_SRCU
drivers/pci/controller: Remove "select SRCU"
drivers/net: Remove "select SRCU"
drivers/md: Remove "select SRCU"
drivers/hwtracing/stm: Remove "select SRCU"
drivers/dax: Remove "select SRCU"
drivers/base: Remove CONFIG_SRCU
rcu: Disable laziness if lazy-tracking says so
rcu: Track laziness during boot and suspend
rcu: Remove redundant call to rcu_boost_kthread_setaffinity()
rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts
rcu: Align the output of RCU CPU stall warning messages
rcu: Add RCU stall diagnosis information
sched: Add helper nr_context_switches_cpu()
...
- Improve the scalability of the CFS bandwidth unthrottling logic
with large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with
the generic scheduler code. Add __cpuidle methods as noinstr to
objtool's noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS,
to query previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period,
to improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- ... Misc other cleanups, fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmPzbJwRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1iIvA//ZcEaB8Z6ChLRQjM+bsaudKJu3pdLQbPK
iYbP8Da+LsAfxbEfYuGV3m+jIp0LlBOtsI/EezxQrXV+V7FvNyAX9Y00eEu/zlj8
7Jn3LMy/DBYTwH7LwVdcU0MyIVI8ZPc6WNnkx0LOtGZn8n+qfHPSDzcP3CW+a5AV
UvllPYpYyEmsX0Eby7CF4Ue8mSmbViw/xR3rNr8ZSve0c25XzKabw8O9kE3jiHxP
d/zERJoAYeDyYUEuZqhfn5dTlB4an4IjNEkAfRE5SQ09RA8Gkxsa5Ar8gob9e9M1
eQsdd4/bdhnrkM8L5qDZczqmgCTZ2bukQrxkBXhRDhLgoFxwAn77b+2ZjmIW3Lae
AyGqRcDSg1q2oxaYm5ZiuO/t26aDOZu9vPHyHRDGt95EGbZlrp+GgeePyfCigJYz
UmPdZAAcHdSymnnnlcvdG37WVvaVkpgWZzd8LbtBi23QR+Zc4WQ2IlgnUS5WKNNf
VOBcAcP6E1IslDotZDQCc2dPFFQoQQEssVooyUc5oMytm7BsvxXLOeHG+Ncu/8uc
H+U8Qn8jnqTxJbC5hkWQIJlhVKCq2FJrHxxySYTKROfUNcDgCmxboFeAcXTCIU1K
T0S+sdoTS/CvtLklRkG0j6B8N4N98mOd9cFwUV3tX+/gMLMep3hCQs5L76JagvC5
skkQXoONNaM=
=l1nN
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Improve the scalability of the CFS bandwidth unthrottling logic with
large number of CPUs.
- Fix & rework various cpuidle routines, simplify interaction with the
generic scheduler code. Add __cpuidle methods as noinstr to objtool's
noinstr detection and fix boatloads of cpuidle bugs & quirks.
- Add new ABI: introduce MEMBARRIER_CMD_GET_REGISTRATIONS, to query
previously issued registrations.
- Limit scheduler slice duration to the sysctl_sched_latency period, to
improve scheduling granularity with a large number of SCHED_IDLE
tasks.
- Debuggability enhancement on sys_exit(): warn about disabled IRQs,
but also enable them to prevent a cascade of followup problems and
repeat warnings.
- Fix the rescheduling logic in prio_changed_dl().
- Micro-optimize cpufreq and sched-util methods.
- Micro-optimize ttwu_runnable()
- Micro-optimize the idle-scanning in update_numa_stats(),
select_idle_capacity() and steal_cookie_task().
- Update the RSEQ code & self-tests
- Constify various scheduler methods
- Remove unused methods
- Refine __init tags
- Documentation updates
- Misc other cleanups, fixes
* tag 'sched-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (110 commits)
sched/rt: pick_next_rt_entity(): check list_entry
sched/deadline: Add more reschedule cases to prio_changed_dl()
sched/fair: sanitize vruntime of entity being placed
sched/fair: Remove capacity inversion detection
sched/fair: unlink misfit task from cpu overutilized
objtool: mem*() are not uaccess safe
cpuidle: Fix poll_idle() noinstr annotation
sched/clock: Make local_clock() noinstr
sched/clock/x86: Mark sched_clock() noinstr
x86/pvclock: Improve atomic update of last_value in pvclock_clocksource_read()
x86/atomics: Always inline arch_atomic64*()
cpuidle: tracing, preempt: Squash _rcuidle tracing
cpuidle: tracing: Warn about !rcu_is_watching()
cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG
cpuidle: drivers: firmware: psci: Dont instrument suspend code
KVM: selftests: Fix build of rseq test
exit: Detect and fix irq disabled state in oops
cpuidle, arm64: Fix the ARM64 cpuidle logic
cpuidle: mvebu: Fix duplicate flags assignment
sched/fair: Limit sched slice duration
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY+5NlQAKCRCRxhvAZXjc
orOaAP9i2h3OJy95nO2Fpde0Bt2UT+oulKCCcGlvXJ8/+TQpyQD/ZQq47gFQ0EAz
Br5NxeyGeecAb0lHpFz+CpLGsxMrMwQ=
=+BG5
-----END PGP SIGNATURE-----
Merge tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping
Pull vfs idmapping updates from Christian Brauner:
- Last cycle we introduced the dedicated struct mnt_idmap type for
mount idmapping and the required infrastucture in 256c8aed2b ("fs:
introduce dedicated idmap type for mounts"). As promised in last
cycle's pull request message this converts everything to rely on
struct mnt_idmap.
Currently we still pass around the plain namespace that was attached
to a mount. This is in general pretty convenient but it makes it easy
to conflate namespaces that are relevant on the filesystem with
namespaces that are relevant on the mount level. Especially for
non-vfs developers without detailed knowledge in this area this was a
potential source for bugs.
This finishes the conversion. Instead of passing the plain namespace
around this updates all places that currently take a pointer to a
mnt_userns with a pointer to struct mnt_idmap.
Now that the conversion is done all helpers down to the really
low-level helpers only accept a struct mnt_idmap argument instead of
two namespace arguments.
Conflating mount and other idmappings will now cause the compiler to
complain loudly thus eliminating the possibility of any bugs. This
makes it impossible for filesystem developers to mix up mount and
filesystem idmappings as they are two distinct types and require
distinct helpers that cannot be used interchangeably.
Everything associated with struct mnt_idmap is moved into a single
separate file. With that change no code can poke around in struct
mnt_idmap. It can only be interacted with through dedicated helpers.
That means all filesystems are and all of the vfs is completely
oblivious to the actual implementation of idmappings.
We are now also able to extend struct mnt_idmap as we see fit. For
example, we can decouple it completely from namespaces for users that
don't require or don't want to use them at all. We can also extend
the concept of idmappings so we can cover filesystem specific
requirements.
In combination with the vfs{g,u}id_t work we finished in v6.2 this
makes this feature substantially more robust and thus difficult to
implement wrong by a given filesystem and also protects the vfs.
- Enable idmapped mounts for tmpfs and fulfill a longstanding request.
A long-standing request from users had been to make it possible to
create idmapped mounts for tmpfs. For example, to share the host's
tmpfs mount between multiple sandboxes. This is a prerequisite for
some advanced Kubernetes cases. Systemd also has a range of use-cases
to increase service isolation. And there are more users of this.
However, with all of the other work going on this was way down on the
priority list but luckily someone other than ourselves picked this
up.
As usual the patch is tiny as all the infrastructure work had been
done multiple kernel releases ago. In addition to all the tests that
we already have I requested that Rodrigo add a dedicated tmpfs
testsuite for idmapped mounts to xfstests. It is to be included into
xfstests during the v6.3 development cycle. This should add a slew of
additional tests.
* tag 'fs.idmapped.v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (26 commits)
shmem: support idmapped mounts for tmpfs
fs: move mnt_idmap
fs: port vfs{g,u}id helpers to mnt_idmap
fs: port fs{g,u}id helpers to mnt_idmap
fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap
fs: port i_{g,u}id_{needs_}update() to mnt_idmap
quota: port to mnt_idmap
fs: port privilege checking helpers to mnt_idmap
fs: port inode_owner_or_capable() to mnt_idmap
fs: port inode_init_owner() to mnt_idmap
fs: port acl to mnt_idmap
fs: port xattr to mnt_idmap
fs: port ->permission() to pass mnt_idmap
fs: port ->fileattr_set() to pass mnt_idmap
fs: port ->set_acl() to pass mnt_idmap
fs: port ->get_acl() to pass mnt_idmap
fs: port ->tmpfile() to pass mnt_idmap
fs: port ->rename() to pass mnt_idmap
fs: port ->mknod() to pass mnt_idmap
fs: port ->mkdir() to pass mnt_idmap
...
Merge updates of the powercap framework, generic PM domains, Energy
Model and operating performance points for 6.3-rc1:
- Fix possible name leak in powercap_register_zone() (Yang Yingliang).
- Add Meteor Lake and Emerald Rapids support to the intel_rapl power
capping driver (Zhang Rui).
- Modify the idle_inject power capping facility to support 100% idle
injection (Srinivas Pandruvada).
- Fix large time windows handling in the intel_rapl power capping
driver (Zhang Rui).
- Fix memory leaks with using debugfs_lookup() in the generic PM
domains and Energy Model code (Greg Kroah-Hartman).
- Add missing 'cache-unified' property in example for kryo OPP bindings
(Rob Herring).
- Fix error checking in opp_migrate_dentry() (Qi Zheng).
- Remove "select SRCU" (Paul E. McKenney).
- Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
Dybcio).
* powercap:
powercap: intel_rapl: Fix handling for large time window
powercap: idle_inject: Support 100% idle injection
powercap: intel_rapl: add support for Emerald Rapids
powercap: intel_rapl: add support for Meteor Lake
powercap: fix possible name leak in powercap_register_zone()
* pm-domains:
PM: domains: fix memory leak with using debugfs_lookup()
* pm-em:
PM: EM: fix memory leak with using debugfs_lookup()
* pm-opp:
OPP: fix error checking in opp_migrate_dentry()
dt-bindings: opp: v2-qcom-level: Let qcom,opp-fuse-level be a 2-long array
drivers/opp: Remove "select SRCU"
dt-bindings: opp: opp-v2-kryo-cpu: Add missing 'cache-unified' property in example
Merge cpuidle updates, PM core updates and changes related to system
sleep handling for 6.3-rc1:
- Make the TEO cpuidle governor check CPU utilization in order to refine
idle state selection (Kajetan Puchalski).
- Make Kconfig select the haltpoll cpuidle governor when the haltpoll
cpuidle driver is selected and replace a default_idle() call in that
driver with arch_cpu_idle() which allows MWAIT to be used (Li
RongQing).
- Add Emerald Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
avoid randconfig build failures (Arnd Bergmann).
- Make kobj_type structures used in the cpuidle sysfs interface
constant (Thomas Weißschuh).
- Make the cpuidle driver registration code update microsecond values
of idle state parameters in accordance with their nanosecond values
if they are provided (Rafael Wysocki).
- Make the PSCI cpuidle driver prevent topology CPUs from being
suspended on PREEMPT_RT (Krzysztof Kozlowski).
- Document that pm_runtime_force_suspend() cannot be used with
DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald).
- Add EXPORT macros for exporting PM functions from drivers (Richard
Fitzgerald).
- Drop "select SRCU" from system sleep Kconfig (Paul E. McKenney).
- Remove /** from non-kernel-doc comments in hibernation code (Randy
Dunlap).
* pm-cpuidle:
cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
cpuidle: driver: Update microsecond values of state parameters as needed
cpuidle: sysfs: make kobj_type structures constant
cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
intel_idle: add Emerald Rapids Xeon support
cpuidle-haltpoll: Replace default_idle() with arch_cpu_idle()
cpuidle-haltpoll: select haltpoll governor
cpuidle: teo: Introduce util-awareness
cpuidle: teo: Optionally skip polling states in teo_find_shallower_state()
* pm-core:
PM: Add EXPORT macros for exporting PM functions
PM: runtime: Document that force_suspend() is incompatible with SMART_SUSPEND
* pm-sleep:
PM: sleep: Remove "select SRCU"
PM: hibernate: swap: don't use /** for non-kernel-doc comments
For some reason, the drivers/base/class.c file still had the "old style"
of exports, at the end of the file. Move the exports to the proper
location, right after the function, to be correct.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230214144117.158956-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 31b4b6730f as it is
reported to cause boot regressions.
Link: https://lore.kernel.org/r/Y+rSXg14z1Myd8Px@dev-arch.thelio-3990X
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Longlong Xia <xialonglong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 90a9d5ff22 as it is
reported to cause boot regressions.
Link: https://lore.kernel.org/r/Y+rSXg14z1Myd8Px@dev-arch.thelio-3990X
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Longlong Xia <xialonglong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 9d3fe6aa6b as it is
reported to cause boot regressions.
Link: https://lore.kernel.org/r/Y+rSXg14z1Myd8Px@dev-arch.thelio-3990X
Reported-by: Nathan Chancellor <nathan@kernel.org>
Cc: Longlong Xia <xialonglong1@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of having to change the uevent bus_type callback by hand at
runtime, set it at build time based on the build configuration options,
making this much simpler to maintain and understand (and allow to make
the structure constant.)
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230210102408.1083177-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The only caller of device_del() does not check the return value. And
there's nothing we can do when cleaning things up on a remove path.
Let's make it a void function.
Signed-off-by: Longlong Xia <xialonglong1@huawei.com>
Link: https://lore.kernel.org/r/20230210095444.4067307-4-xialonglong1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Because handle() is the core function for processing devtmpfs requests,
Let's add some debug info in handle() to help users know why failed.
Signed-off-by: Longlong Xia <xialonglong1@huawei.com>
Link: https://lore.kernel.org/r/20230210095444.4067307-3-xialonglong1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's been some work done recently to the drivers/base/bus.c file so
update the copyright notice in it to make those who track those types of
things have an easier job.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230210091318.733561-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Instead of poking around in the struct bus_type directly for the
dev_root pointer, provide a function to return it properly reference
counted, if it is present in the bus. This will be needed to move the
pointer out of struct bus_type in the future.
Use the function in the driver core code at the same time it is
introduced to verify that it works properly.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230209093556.19132-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The functions add_probe_files() and remove_probe_files() should be
taking a const * to bus_type, not just a *, so fix that up. These
functions should really be removed entirely and an attribute group used
instead, but for now, make this change so that other const work can
continue.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-21-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The bus_register_notifier() and bus_unregister_notifier() functions
should be taking a const * to bus_type, not just a * so fix that up.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-19-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that the driver code has been refactored to not rely on the pointer
from a struct bus_type to the private structure it can be safely removed
from the structure entirely.
This will allow most bus_type structures to now be marked as const.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-18-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A local function to the driver core to determine if a bus really is
registered with the kernel or not. To be used only by the driver core
code, as part of the driver registration path as it's not really "safe"
because the bus could be unregistered instantly after being called.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-17-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function really is a bus function, not a driver one, so move it
from driver.c to bus.c so that we can clean up some internal bus logic
easier.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-15-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_sort_breadthfirst() function to use bus_to_subsys() and
not use the back-pointer to the private structure.
This also allows us to get rid of bus_get_device_klist() which was only
being used by this one internal function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-14-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_for_each_dev(), bus_find_device, and bus_for_each_drv()
functions to use bus_to_subsys() and not use the back-pointer to the
private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-13-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_add_driver() and bus_remove_driver() functions to use
bus_to_subsys() and not use the back-pointer to the private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-12-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_register_notifier() and bus_unregister_notifier() public
functions to use bus_to_subsys() and not use the back-pointer to the
private structure as well as the bus_notify() function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-11-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_get_kset() function function to use bus_to_subsys() and
not use the back-pointer to the private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-10-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the subsys_interface_register and subsys_interface_unregister()
functions to use bus_to_subsys() and not use the back-pointer to the
private structure.
This also requires changing the parameters on subsys_dev_iter_init() to
iterate over the list properly.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-9-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_register() and bus_unregister() functions to use
bus_to_subsys() and not use the back-pointer to the private structure.
Because bus_add_groups() and bus_remove_groups() were only called in one
place, remove those one-line-wrapper functions and call the real sysfs
group function where it is needed instead, saving another layer of
indirection.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-8-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the bus_add_device(), bus_probe_device(), and
bus_remove_device() functions to use bus_to_subsys() and not use the
back-pointer to the private structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-7-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert the drivers_autoprobe show/store and uevent sysfs callbacks to
use bus_to_subsys() and not use the back-pointer to the private
structure.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
bus_create_file() and bus_remove_file() can be made to take a constant
bus pointer, as it should not be modifying anything in the bus
structure. Make this change and move the functions to use the internal
subsys_get/put() logic as well, to prevent the use of the back-pointer
in struct bus_type.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
All of the bus find and iterator functions do not modify the struct
bus_type passed to them, so mark them as constant to enforce this rule.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the quest to make 'struct bus_type' constant and in read-only memory,
we need to stop using the private pointer to the subsys_private
structure. First step in doing this is to create a helper function that
turns a 'struct bus_type' into 'struct subsys_private' called
bus_to_subsys().
bus_to_subsys() walks the list of registered busses in the system and
finds the matching one based on the pointer to the bus_type itself. As
this is a short list, and this function is not on any fast path, it
should not be noticable.
Implement bus_get() and bus_put() using this new helper function.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-3-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We need to control the reference count of the subsys private structure
instead of directly manipulating the kset reference count of it, so wrap
that logic up in a subsys_get() and subsys_put() function to make it more
obvious as to what is happening.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230208111330.439504-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink could only detect a single and simple cycle because it relied
mainly on device link cycle detection code that only checked for cycles
between devices. The expectation was that the firmware wouldn't have
complicated cycles and multiple cycles between devices. That expectation
has been proven to be wrong.
For example, fw_devlink could handle:
+-+ +-+
|A+------> |B+
+-+ +++
^ |
| |
+----------+
But it couldn't handle even something as "simple" as:
+---------------------+
| |
v |
+-+ +-+ +++
|A+------> |B+------> |C|
+-+ +++ +-+
^ |
| |
+----------+
But firmware has even more complicated cycles like:
+---------------------+
| |
v |
+-+ +---+ +++
+--+A+------>| B +-----> |C|<--+
| +-+ ++--+ +++ |
| ^ | ^ | |
| | | | | |
| +---------+ +---------+ |
| |
+------------------------------+
And this is without including parent child dependencies or nodes in the
cycle that are just firmware nodes that'll never have a struct device
created for them.
The proper way to treat these devices it to not force any probe ordering
between them, while still enforce dependencies between node in the
cycles (A, B and C) and their consumers.
So this patch goes all out and just deals with all types of cycles. It
does this by:
1. Following dependencies across device links, parent-child and fwnode
links.
2. When it find cycles, it mark the device links and fwnode links as
such instead of just deleting them or making the indistinguishable
from proxy SYNC_STATE_ONLY device links.
This way, when new nodes get added, we can immediately find and mark any
new cycles whether the new node is a device or firmware node.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-9-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Consolidate the code that computes the flags to be used when creating a
device link from a fwnode link.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-8-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To improve detection and handling of dependency cycles, we need to be
able to mark fwnode links as being part of cycles. fwnode links marked
as being part of a cycle should not block their consumers from probing.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-7-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink uses DL_FLAG_SYNC_STATE_ONLY device link flag for two
purposes:
1. To allow a parent device to proxy its child device's dependency on a
supplier so that the supplier doesn't get its sync_state() callback
before the child device/consumer can be added and probed. In this
usage scenario, we need to ignore cycles for ensure correctness of
sync_state() callbacks.
2. When there are dependency cycles in firmware, we don't know which of
those dependencies are valid. So, we have to ignore them all wrt
probe ordering while still making sure the sync_state() callbacks
come correctly.
However, when detecting dependency cycles, there can be multiple
dependency cycles between two devices that we need to detect. For
example:
A -> B -> A and A -> C -> B -> A.
To detect multiple cycles correct, we need to be able to differentiate
DL_FLAG_SYNC_STATE_ONLY device links used for (1) vs (2) above.
To allow this differentiation, add a DL_FLAG_CYCLE that can be use to
mark use case (2). We can then use the DL_FLAG_CYCLE to decide which
DL_FLAG_SYNC_STATE_ONLY device links to follow when looking for
dependency cycles.
Fixes: 2de9d8e0d2 ("driver core: fw_devlink: Improve handling of cyclic dependencies")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-6-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_devlink shouldn't defer the probe of a device to wait on a supplier
that'll never have a struct device or will never be probed by a driver.
We currently check if a supplier falls into this category, but don't
check its ancestors. We need to check the ancestors too because if the
ancestor will never probe, then the supplier will never probe either.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a device X is bound successfully to a driver, if it has a child
firmware node Y that doesn't have a struct device created by then, we
delete fwnode links where the child firmware node Y is the supplier. We
did this to avoid blocking the consumers of the child firmware node Y
from deferring probe indefinitely.
While that a step in the right direction, it's better to make the
consumers of the child firmware node Y to be consumers of the device X
because device X is probably implementing whatever functionality is
represented by child firmware node Y. By doing this, we capture the
device dependencies more accurately and ensure better
probe/suspend/resume ordering.
Signed-off-by: Saravana Kannan <saravanak@google.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4
Link: https://lore.kernel.org/r/20230207014207.1678715-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.
Take advantage of this to constify the structure definitions to prevent
modification at runtime.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20230204-kobj_type-driver-core-v1-1-b9f809419f2c@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230202141621.2296458-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patch series "Introduce per NUMA node memory error statistics", v2.
Background
==========
In the RFC for Kernel Support of Memory Error Detection [1], one advantage
of software-based scanning over hardware patrol scrubber is the ability to
make statistics visible to system administrators. The statistics include
2 categories:
* Memory error statistics, for example, how many memory error are
encountered, how many of them are recovered by the kernel. Note these
memory errors are non-fatal to kernel: during the machine check
exception (MCE) handling kernel already classified MCE's severity to be
unnecessary to panic (but either action required or optional).
* Scanner statistics, for example how many times the scanner have fully
scanned a NUMA node, how many errors are first detected by the scanner.
The memory error statistics are useful to userspace and actually not
specific to scanner detected memory errors, and are the focus of this
patchset.
Motivation
==========
Memory error stats are important to userspace but insufficient in kernel
today. Datacenter administrators can better monitor a machine's memory
health with the visible stats. For example, while memory errors are
inevitable on servers with 10+ TB memory, starting server maintenance when
there are only 1~2 recovered memory errors could be overreacting; in cloud
production environment maintenance usually means live migrate all the
workload running on the server and this usually causes nontrivial
disruption to the customer. Providing insight into the scope of memory
errors on a system helps to determine the appropriate follow-up action.
In addition, the kernel's existing memory error stats need to be
standardized so that userspace can reliably count on their usefulness.
Today kernel provides following memory error info to userspace, but they
are not sufficient or have disadvantages:
* HardwareCorrupted in /proc/meminfo: number of bytes poisoned in total,
not per NUMA node stats though
* ras:memory_failure_event: only available after explicitly enabled
* /dev/mcelog provides many useful info about the MCEs, but doesn't
capture how memory_failure recovered memory MCEs
* kernel logs: userspace needs to process log text
Exposing memory error stats is also a good start for the in-kernel memory
error detector. Today the data source of memory error stats are either
direct memory error consumption, or hardware patrol scrubber detection
(either signaled as UCNA or SRAO). Once in-kernel memory scanner is
implemented, it will be the main source as it is usually configured to
scan memory DIMMs constantly and faster than hardware patrol scrubber.
How Implemented
===============
As Naoya pointed out [2], exposing memory error statistics to userspace is
useful independent of software or hardware scanner. Therefore we
implement the memory error statistics independent of the in-kernel memory
error detector. It exposes the following per NUMA node memory error
counters:
/sys/devices/system/node/node${X}/memory_failure/total
/sys/devices/system/node/node${X}/memory_failure/recovered
/sys/devices/system/node/node${X}/memory_failure/ignored
/sys/devices/system/node/node${X}/memory_failure/failed
/sys/devices/system/node/node${X}/memory_failure/delayed
These counters describe how many raw pages are poisoned and after the
attempted recoveries by the kernel, their resolutions: how many are
recovered, ignored, failed, or delayed respectively. This approach can be
easier to extend for future use cases than /proc/meminfo, trace event, and
log. The following math holds for the statistics:
* total = recovered + ignored + failed + delayed
These memory error stats are reset during machine boot.
The 1st commit introduces these sysfs entries. The 2nd commit populates
memory error stats every time memory_failure attempts memory error
recovery. The 3rd commit adds documentations for introduced stats.
[1] https://lore.kernel.org/linux-mm/7E670362-C29E-4626-B546-26530D54F937@gmail.com/T/#mc22959244f5388891c523882e61163c6e4d703af
[2] https://lore.kernel.org/linux-mm/7E670362-C29E-4626-B546-26530D54F937@gmail.com/T/#m52d8d7a333d8536bd7ce74253298858b1c0c0ac6
This patch (of 3):
Today kernel provides following memory error info to userspace, but each
has its own disadvantage
* HardwareCorrupted in /proc/meminfo: number of bytes poisoned in total,
not per NUMA node stats though
* ras:memory_failure_event: only available after explicitly enabled
* /dev/mcelog provides many useful info about the MCEs, but
doesn't capture how memory_failure recovered memory MCEs
* kernel logs: userspace needs to process log text
Exposes per NUMA node memory error stats as sysfs entries:
/sys/devices/system/node/node${X}/memory_failure/total
/sys/devices/system/node/node${X}/memory_failure/recovered
/sys/devices/system/node/node${X}/memory_failure/ignored
/sys/devices/system/node/node${X}/memory_failure/failed
/sys/devices/system/node/node${X}/memory_failure/delayed
These counters describe how many raw pages are poisoned and after the
attempted recoveries by the kernel, their resolutions: how many are
recovered, ignored, failed, or delayed respectively. The following math
holds for the statistics:
* total = recovered + ignored + failed + delayed
Link: https://lkml.kernel.org/r/20230120034622.2698268-1-jiaqiyan@google.com
Link: https://lkml.kernel.org/r/20230120034622.2698268-2-jiaqiyan@google.com
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Now that the SRCU Kconfig option is unconditionally selected, there is
no longer any point in conditional compilation based on CONFIG_SRCU.
Therefore, remove the #ifdef and throw away the #else clause.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Use the pr_fmt() macro to prefix all the output with "devtmpfs: ".
while at it, convert printk(<LEVEL>) to pr_<level>().
Signed-off-by: Longlong Xia <xialonglong1@huawei.com>
Link: https://lore.kernel.org/r/20230202033203.1239239-2-xialonglong1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the lock_class_key structure out of struct bus_type and into the
dynamic structure we create already for all bus_types registered with
the kernel. This saves on static space and removes one more writable
field in struct bus_type.
In the future, the same field can be moved out of the struct class logic
because it shares this same private structure.
Most everyone will never notice this change, as lockdep is not enabled
in real systems so no memory or logic changes are happening for them.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230201083349.4038660-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__platform_driver_probe() pokes around in some bus and driver private
lists and locks in a way that is not needed at all. The code only wants
to know if a device was bound to the driver that was registered, so walk
all devices on the bus to see if there was a match. If there is not a
match, return an error. This is the same logic as was originally
present, but just done in a simpler and more obvious way that is not a
layering violation.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230131082459.301603-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the reworking of the function __platform_driver_probe() over the
years, it turns out that the variable 'code' does not actually do
anything or mean anything anymore and can be removed to simplify the
logic when trying to read and understand what this function is actually
doing.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230131082459.301603-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
SDc/Y/c=
=zpaD
-----END PGP SIGNATURE-----
Merge tag 'v6.2-rc6' into sched/core, to pick up fixes
Pick up fixes before merging another batch of cpuidle updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
reg_base and reg_downshift currently don't have any effect if used with
a regmap_bus or regmap_config which only offers single register
operations (ie. reg_read, reg_write and optionally reg_update_bits).
Fix that and take them into account also for regmap_bus with only
reg_read and read_write operations by applying reg_base and
reg_downshift in _regmap_bus_reg_write, _regmap_bus_reg_read.
Also apply reg_base and reg_downshift in _regmap_update_bits, but only
in case the operation is carried out with a reg_update_bits call
defined in either regmap_bus or regmap_config.
Fixes: 0074f3f2b1 ("regmap: allow a defined reg_base to be added to every address")
Fixes: 86fc59ef81 ("regmap: add configurable downshift for addresses")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/Y9clyVS3tQEHlUhA@makrotopia.org
Signed-off-by: Mark Brown <broonie@kernel.org>
The soc_bus code pokes around in the internal bus structures assuming
that it "knows" if a field is not set that it has not been registered
yet. That isn't a safe assumption, so just remove the layering
violation entirely and keep track if the bus has been registered or not
ourselves.
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20230130171059.1784057-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The uevent() callback in struct kset_uevent_ops does not modify the
kobject passed into it, so make the pointer const to enforce this
restriction. When doing so, fix up all existing uevent() callbacks to
have the correct signature to preserve the build.
Cc: Christine Caulfield <ccaulfie@redhat.com>
Cc: David Teigland <teigland@redhat.com>
Cc: Bob Peterson <rpeterso@redhat.com>
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-17-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The uevent() callback in struct bus_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
device_get_devnode() should take a constant * to struct device as it
does not modify it in any way, so modify the function definition to do
this and move it out of device.h as it does not need to be exposed to
the whole kernel tree.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Won Chung <wonchung@google.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-8-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clear the class private pointer if __class_register() fails for it, so
as to allow its users to verify that the class is usable by checking
the value of that pointer.
For consistency, clear that pointer before freeing the object pointed
to by it in class_release().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/4463268.LvFx2qVVIh@kreacher
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The normal call sequence of using transport class is:
Add path:
transport_setup_device()
transport_setup_classdev() // call sas_host_setup() here
transport_add_device() // if fails, need call transport_destroy_device()
transport_configure_device()
Remove path:
transport_remove_device()
transport_remove_classdev // call sas_host_remove() here
transport_destroy_device()
If transport_add_device() fails, need call transport_destroy_device()
to free memory, but in this case, ->remove() is not called, and the
resources allocated in ->setup() are leaked. So fix these leaks by
calling ->remove() in transport_add_class_device() if it returns error.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221115031638.3816551-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct acpi_pld_info *pld should be freed before the return of allocation
failure, to prevent memory leak, add the ACPI_FREE() to fix it.
Fixes: bc443c31de ("driver core: location: Check for allocations failure")
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/1669102648-11517-1-git-send-email-guohanjun@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When calling kobject_add() failed in device_add(), it will call
cleanup_glue_dir() to free resource. But in kobject_add(),
dev->kobj.parent has been set to NULL. This will cause resource leak.
The process is as follows:
device_add()
get_device_parent()
class_dir_create_and_add()
kobject_add() //kobject_get()
...
dev->kobj.parent = kobj;
...
kobject_add() //failed, but set dev->kobj.parent = NULL
...
glue_dir = get_glue_dir(dev) //glue_dir = NULL, and goto
//"Error" label
...
cleanup_glue_dir() //becaues glue_dir is NULL, not call
//kobject_put()
The preceding problem may cause insmod mac80211_hwsim.ko to failed.
sysfs: cannot create duplicate filename '/devices/virtual/mac80211_hwsim'
Call Trace:
<TASK>
dump_stack_lvl+0x8e/0xd1
sysfs_warn_dup.cold+0x1c/0x29
sysfs_create_dir_ns+0x224/0x280
kobject_add_internal+0x2aa/0x880
kobject_add+0x135/0x1a0
get_device_parent+0x3d7/0x590
device_add+0x2aa/0x1cb0
device_create_groups_vargs+0x1eb/0x260
device_create+0xdc/0x110
mac80211_hwsim_new_radio+0x31e/0x4790 [mac80211_hwsim]
init_mac80211_hwsim+0x48d/0x1000 [mac80211_hwsim]
do_one_initcall+0x10f/0x630
do_init_module+0x19f/0x5e0
load_module+0x64b7/0x6eb0
__do_sys_finit_module+0x140/0x200
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
</TASK>
kobject_add_internal failed for mac80211_hwsim with -EEXIST, don't try to
register things with the same name in the same directory.
Fixes: cebf8fd169 ("driver core: fix race between creating/querying glue dir and its cleanup")
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Link: https://lore.kernel.org/r/20221123012042.335252-1-shaozhengchao@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to 'admin-guide/mm/memory-hotplug.rst', the memory block ID,
instead of the section index, is shown by '/sys/devices/system/memory/
memoryX/phys_index'.
Fix the comments to match with 'admin-guide/mm/memory-hotplug.rst'.
Besides, use the existing helper memory_block_id() to convert the section
index to the memory block index.
No functional change intended.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20230120055727.355483-2-gshan@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The main change is to build the cache topology information for all
the CPUs from the primary CPU. Currently the cacheinfo for secondary CPUs
is created during the early boot on the respective CPU itself. Preemption
and interrupts are disabled at this stage. On PREEMPT_RT kernels, allocating
memory and even parsing the PPTT table for ACPI based systems triggers a:
'BUG: sleeping function called from invalid context'
To prevent this bug, the cacheinfo is now allocated from the primary CPU
when preemption and interrupts are enabled and before booting secondary
CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
only, without relying on any architecture specific mechanism if done so
early.
The other minor change included here is to handle shared caches at
different levels when not all the CPUs on the system have the same
cache hierarchy.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmPKg60ACgkQAEG6vDF+
4pi91hAAoSluqjbUHzzCW+OIIjKAAvQrAw6bsKGvSpcUfYno1Lry+9y76L6TMYSy
OPtiKGxcJyzhdlCwIpJzgaX9nTz7uiu70euNZiAp11XA2KlphtLoI3TMUa60jD4i
ZGfn9UiAp719Vog5m3CmZXjHZ6drI0HloL8ZWTl4VDATUu5pfcx4uYPT2o63Xc62
k5QglaRJWFhFAJ+R6R9vQS2zfeMI9xvehl72445wb8pxxPW2f91dvBhJqJgKlziw
gHKx+D1DnpAUd+v+7HAEmzjXKlY6JnQybmBHmRayllVAa8kGUtvhTcBlRGNsNBzR
m7VBFKq+eSk7VgxOgka1qXVtHUrlaEWf5qWnG+w4XEiE1VgzNagjaFRaGQQneKI/
z3yNKG8Xjp+3BdSz0pUDJVEWFnnjueAUEh6/xODEXnWdX166abQZslLIHCvmcnM8
q7blasuj2mxyCZFC1tyK9WHI7/KCe0cmHbdau3qs0j9bvhzfdB3DwMLsdRjXQTOv
8FVX0Z5EKY9/bW2oqCg/mb3KOWbmFX2ZHho4cds3IV+9GGB8JkD/6b8vpGejmmx2
E2vInzhP3gLd9WiWQWDjg4+aklE29P/nDAA8BSPnW3TtEGAFJMZZQRgNlCZnBW56
Tx2/lE0VD5/UX+1MqSFGchi+KEX3mcZykcra9VNt+uZH26a8gBE=
=nH4b
-----END PGP SIGNATURE-----
Merge tag 'archtopo-cacheinfo-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next
Sudeep writes:
"cacheinfo and arch_topology updates for v6.3
The main change is to build the cache topology information for all
the CPUs from the primary CPU. Currently the cacheinfo for secondary CPUs
is created during the early boot on the respective CPU itself. Preemption
and interrupts are disabled at this stage. On PREEMPT_RT kernels, allocating
memory and even parsing the PPTT table for ACPI based systems triggers a:
'BUG: sleeping function called from invalid context'
To prevent this bug, the cacheinfo is now allocated from the primary CPU
when preemption and interrupts are enabled and before booting secondary
CPUs. The cache levels/leaves are computed from DT/ACPI PPTT information
only, without relying on any architecture specific mechanism if done so
early.
The other minor change included here is to handle shared caches at
different levels when not all the CPUs on the system have the same
cache hierarchy."
* tag 'archtopo-cacheinfo-updates-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
cacheinfo: Fix shared_cpu_map to handle shared caches at different levels
arch_topology: Build cacheinfo from primary CPU
ACPI: PPTT: Update acpi_find_last_cache_level() to acpi_get_cache_info()
ACPI: PPTT: Remove acpi_find_cache_levels()
cacheinfo: Check 'cache-unified' property to count cache leaves
cacheinfo: Return error code in init_of_cache_level()
cacheinfo: Use RISC-V's init_cache_level() as generic OF implementation
In test_async_probe_init, second set of asynchronous devices are saved
in sync_dev[sync_id], which should be async_dev[async_id].
This makes these devices not unregistered when exit.
> modprobe test_async_driver_probe && \
> modprobe -r test_async_driver_probe && \
> modprobe test_async_driver_probe
...
> sysfs: cannot create duplicate filename '/devices/platform/test_async_driver.4'
> kobject_add_internal failed for test_async_driver.4 with -EEXIST,
don't try to register things with the same name in the same directory.
Fixes: 57ea974fb8 ("driver core: Rewrite test_async_driver_probe to cover serialization and NUMA affinity")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Link: https://lore.kernel.org/r/20221125063541.241328-1-chenzhongjin@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The 'parent' returned by fwnode_graph_get_port_parent()
with refcount incremented when 'prev' is not NULL, it
needs be put when finish using it.
Because the parent is const, introduce a new variable to
store the returned fwnode, then put it before returning
from fwnode_graph_get_next_endpoint().
Fixes: b5b41ab6b0 ("device property: Check fwnode->secondary in fwnode_graph_get_next_endpoint()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-and-tested-by: Daniel Scally <djrscally@gmail.com>
Link: https://lore.kernel.org/r/20221123022542.2999510-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
The cacheinfo sets up the shared_cpu_map by checking whether the caches
with the same index are shared between CPUs. However, this will trigger
slab-out-of-bounds access if the CPUs do not have the same cache hierarchy.
Another problem is the mismatched shared_cpu_map when the shared cache does
not have the same index between CPUs.
CPU0 I D L3
index 0 1 2 x
^ ^ ^ ^
index 0 1 2 3
CPU1 I D L2 L3
This patch checks each cache is shared with all caches on other CPUs.
Reviewed-by: Pierre Gondois <pierre.gondois@arm.com>
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Link: https://lore.kernel.org/r/20230117105133.4445-2-yongxuan.wang@sifive.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
commit 3fcbf1c77d ("arch_topology: Fix cache attributes detection
in the CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
'BUG: sleeping function called from invalid context' [1]
as the code is executed with preemption and interrupts disabled.
The primary CPU was previously storing the cache information using
the now removed (struct cpu_topology).llc_id:
commit 5b8dc787ce ("arch_topology: Drop LLC identifier stash from
the CPU topology")
allocate_cache_info() tries to build the cacheinfo from the primary
CPU prior secondary CPUs boot, if the DT/ACPI description
contains cache information.
If allocate_cache_info() fails, then fallback to the current state
for the cacheinfo allocation. [1] will be triggered in such case.
When unplugging a CPU, the cacheinfo memory cannot be freed. If it
was, then the memory would be allocated early by the re-plugged
CPU and would trigger [1].
Note that populate_cache_leaves() might be called multiple times
due to populate_leaves being moved up. This is required since
detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
being allocated but not populated.
[1]:
| BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
| in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
| preempt_count: 1, expected: 0
| RCU nest depth: 1, expected: 1
| 3 locks held by swapper/111/0:
| #0: (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
| #1: (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
| #2: (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
| irq event stamp: 0
| hardirqs last enabled at (0): 0x0
| hardirqs last disabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last enabled at (0): copy_process+0x5dc/0x1ab8
| softirqs last disabled at (0): 0x0
| Preemption disabled at:
| migrate_enable+0x30/0x130
| CPU: 111 PID: 0 Comm: swapper/111 Tainted: G W 6.0.0-rc4-rt6-[...]
| Call trace:
| __kmalloc+0xbc/0x1e8
| detect_cache_attributes+0x2d4/0x5f0
| update_siblings_masks+0x30/0x368
| store_cpu_topology+0x78/0xb8
| secondary_start_kernel+0xd0/0x198
| __secondary_switched+0xb0/0xb4
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The DeviceTree Specification v0.3 specifies that the cache node
'[d-|i-|]cache-size' property is required. The 'cache-unified'
property is specifies whether the cache level is separate
or unified.
If the cache-size property is missing, no cache leaves is accounted.
This can lead to a 'BUG: KASAN: slab-out-of-bounds' [1] bug.
Check 'cache-unified' property and always account for at least
one cache leaf when parsing the device tree.
[1] https://lore.kernel.org/all/0f19cb3f-d6cf-4032-66d2-dedc9d09a0e3@linaro.org/
Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Tested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20230104183033.755668-4-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The logic to touch the bus notifier was open-coded in numberous places
in the driver core. Clean that up by creating a local bus_notify()
function and have everyone call this function instead, making the
reading of the caller code simpler and easier to maintain over time.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230111092331.3946745-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Make init_of_cache_level() return an error code when the cache
information parsing fails to help detecting missing information.
init_of_cache_level() is only called for riscv. Returning an error
code instead of 0 will prevent detect_cache_attributes() to allocate
memory if an incomplete DT is parsed.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-3-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
RISC-V's implementation of init_of_cache_level() is following
the Devicetree Specification v0.3 regarding caches, cf.:
- s3.7.3 'Internal (L1) Cache Properties'
- s3.8 'Multi-level and Shared Cache Nodes'
Allow reusing the implementation by moving it.
Also make 'levels', 'leaves' and 'level' unsigned int.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-2-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
When CONFIG_OF_IRQ is not enabled, there will be a stub method that always
returns 0 when getting IRQ. Thus, the if-branch can be removed safely.
Signed-off-by: Soha Jin <soha@lohu.info>
Link: https://lore.kernel.org/r/20221111094542.270540-1-soha@lohu.info
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are no more users of software_node_register_nodes() and
software_node_unregister_nodes(). Remove them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Daniel Scally <dan.scally@ideasonboard.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20221228094922.84119-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Switch property entry test to use software_node_register_node_group() API.
The current one is going to be removed soon.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Tested-by: Daniel Scally <dan.scally@ideasonboard.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20221228094922.84119-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
struct platform_driver::remove returning an integer made driver authors
expect that returning an error code was proper error handling. However
the driver core ignores the error and continues to remove the device
because there is nothing the core could do anyhow and reentering the
remove callback again is only calling for trouble.
So this is an source for errors typically yielding resource leaks in the
error path.
As there are too many platform drivers to neatly convert them all to
return void in a single go, do it in several steps after this patch:
a) Convert all drivers to implement .remove_new() returning void instead
of .remove() returning int;
b) Change struct platform_driver::remove() to return void and so make
it identical to .remove_new();
c) Change all drivers back to .remove() now with the better prototype;
d) drop struct platform_driver::remove_new().
While this touches all drivers eventually twice, steps a) and c) can be
done one driver after another and so reduces coordination efforts
immensely and simplifies review.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20221209150914.3557650-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The MDIO subsystem is getting rid of MII_ADDR_C45 and thus also
encoding associated encoding of the C45 device address and register
address into one value. regmap-mdio also uses this encoding for the
C45 bus.
Move to the new C45 helpers for MDIO access and provide regmap-mdio
helper macros.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Michael Walle <michael@walle.cc>
Link: https://lore.kernel.org/r/20230116111509.4086236-1-michael@walle.cc
Signed-off-by: Mark Brown <broonie@kernel.org>
pm_runtime_force_suspend() cannot be used with DPM_FLAG_SMART_SUSPEND, so
note this in the kerneldoc.
If DPM_FLAG_SMART_SUSPEND is set and the PM core cannot skip system resume
it will call pm_runtime_active() on the driver. This can lead to an
inconsistent state where:
pm_runtime_force_suspend() called ->runtime_suspend
but
device_resume_noirq() called pm_runtime_set_active()
This leaves the driver actually suspended but marked as active.
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
OMAP was the one and only user.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195541.782536366@infradead.org
The macro to_subsys_private() needs to switch to using
container_of_const() as it turned out to being incorrectly casting a
const pointer to a non-const one. Make this change and fix up the one
offending user to be correctly handling a const pointer properly.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20230111093327.3955063-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is not used outside of its compilation unit, so there's no need to
export this variable.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20221227232152.3094584-1-javierm@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I got the following null-ptr-deref report while doing fault injection test:
BUG: kernel NULL pointer dereference, address: 0000000000000058
CPU: 2 PID: 278 Comm: 37-i2c-ds2482 Tainted: G B W N 6.1.0-rc3+
RIP: 0010:klist_put+0x2d/0xd0
Call Trace:
<TASK>
klist_remove+0xf1/0x1c0
device_release_driver_internal+0x196/0x210
bus_remove_device+0x1bd/0x240
device_add+0xd3d/0x1100
w1_add_master_device+0x476/0x490 [wire]
ds2482_probe+0x303/0x3e0 [ds2482]
This is how it happened:
w1_alloc_dev()
// The dev->driver is set to w1_master_driver.
memcpy(&dev->dev, device, sizeof(struct device));
device_add()
bus_add_device()
dpm_sysfs_add() // It fails, calls bus_remove_device.
// error path
bus_remove_device()
// The dev->driver is not null, but driver is not bound.
__device_release_driver()
klist_remove(&dev->p->knode_driver) <-- It causes null-ptr-deref.
// normal path
bus_probe_device() // It's not called yet.
device_bind_driver()
If dev->driver is set, in the error path after calling bus_add_device()
in device_add(), bus_remove_device() is called, then the device will be
detached from driver. But device_bind_driver() is not called yet, so it
causes null-ptr-deref while access the 'knode_driver'. To fix this, set
dev->driver to null in the error path before calling bus_remove_device().
Fixes: 57eee3d23e ("Driver core: Call device_pm_add() after bus_add_device() in device_add()")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221205034904.2077765-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some genpd providers doesn't ensure that it has turned off at hardware.
This is fine until the consumer really requires during some special
scenarios that the power domain collapse at hardware before it is
turned ON again.
An example is the reset sequence of Adreno GPU which requires that the
'gpucc cx gdsc' power domain should move to OFF state in hardware at
least once before turning in ON again to clear the internal state.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20230102161757.v5.1.I3e6b1f078ad0f1ca9358c573daa7b70ec132cdbe@changeid
struct subsys_dev_iter is not used by any code outside of
drivers/base/bus.c so move it into that file and out of the global bus.h
file.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-6-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function subsys_dev_iter_exit() is not used outside of
drivers/base/bus.c so make it static to that file and remove the global
export.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-5-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function subsys_dev_iter_next() is only used in drivers/base/bus.c
so make it static to that file and remove the global export.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This function has not been called by any code in the kernel tree in many
many years so remove it as it is unused.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
No one calls this function outside of drivers/base/bus.c so make it
static so it does not need to be exported anymore.
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230109175810.2965448-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Support zstd-compressed debug info
- Allow W=1 builds to detect objects shared among multiple modules
- Add srcrpm-pkg target to generate a source RPM package
- Make the -s option detection work for future GNU Make versions
- Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y
- Allow W=1 builds to detect -Wundef warnings in any preprocessed files
- Raise the minimum supported version of binutils to 2.25
- Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used
- Use $(file ...) to read a file if GNU Make >= 4.2 is used
- Print error if GNU Make older than 3.82 is used
- Allow modpost to detect section mismatches with Clang LTO
- Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmOeImsVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsG06IP/iVjuWFvnjDZT4X8X6zN8aKp1vtR
EMkmoRtt5cD4CLb1MG4N7irYHgedQSx4rYceP45MyW1I3egl6Ct14RDyeQ1xSIZb
XFTLDCZvfl/up3MdiqNAqKRS7x5lk9++7F0t+2SoQxKQyJvm735XreX+VhZ1FeLB
qcHrmzJ5veky5Ry/3OkNUgKFBjKEAL+qKMc55uvkXqfTb3KoBa2r4VC1OaoYGRru
R8oF9qQRnGVQAl/LbBVchmgSjxryxPrCvBGiKlK03VkXdzEMHMimEJh3BQ6e0PGo
gajdk+4liy7z+jQnI7jFhvJjGKzkEP/Bc99M/uS92QX5MgpH6mqpHMoqqPiqW87K
RmZH37FqRu1Vo8dpibmH6r2K6YD/HHRjaDHk1VuuCQYEn0dsNmokPXOqd/1v0I1i
TXPjWOw1AID5vMJWllqxFhpeVvf0vx5BT/UNrh68MLqlJZzv2eMVJb4fNy6640ml
U0NclMnOa3eOmf5z1T7/LqDRTa63Q0kpanRrBpcmVOaqW+ZpQ3SQjh4uBN1PyJHL
cX3Skc341DyRlFiT54QhGKlm57MEb2gjhBZ3Z4J+b7sEFgvjXH/W8vcOGIKlppmA
CfYMyres4OV+fJc89ONkWsvLiOP1OeUGPvytm33J5QMKXc8SzOLP0D/F8kjrDflm
EROKuZ4EA5ej/rOy
=Ig/Y
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Support zstd-compressed debug info
- Allow W=1 builds to detect objects shared among multiple modules
- Add srcrpm-pkg target to generate a source RPM package
- Make the -s option detection work for future GNU Make versions
- Add -Werror to KBUILD_CPPFLAGS when CONFIG_WERROR=y
- Allow W=1 builds to detect -Wundef warnings in any preprocessed files
- Raise the minimum supported version of binutils to 2.25
- Use $(intcmp ...) to compare integers if GNU Make >= 4.4 is used
- Use $(file ...) to read a file if GNU Make >= 4.2 is used
- Print error if GNU Make older than 3.82 is used
- Allow modpost to detect section mismatches with Clang LTO
- Include vmlinuz.efi into kernel tarballs for arm64 CONFIG_EFI_ZBOOT=y
* tag 'kbuild-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (29 commits)
buildtar: fix tarballs with EFI_ZBOOT enabled
modpost: Include '.text.*' in TEXT_SECTIONS
padata: Mark padata_work_init() as __ref
kbuild: ensure Make >= 3.82 is used
kbuild: refactor the prerequisites of the modpost rule
kbuild: change module.order to list *.o instead of *.ko
kbuild: use .NOTINTERMEDIATE for future GNU Make versions
kconfig: refactor Makefile to reduce process forks
kbuild: add read-file macro
kbuild: do not sort after reading modules.order
kbuild: add test-{ge,gt,le,lt} macros
Documentation: raise minimum supported version of binutils to 2.25
kbuild: add -Wundef to KBUILD_CPPFLAGS for W=1 builds
kbuild: move -Werror from KBUILD_CFLAGS to KBUILD_CPPFLAGS
kbuild: Port silent mode detection to future gnu make.
init/version.c: remove #include <generated/utsrelease.h>
firmware_loader: remove #include <generated/utsrelease.h>
modpost: Mark uuid_le type to be suitable only for MEI
kbuild: add ability to make source rpm buildable using koji
kbuild: warn objects shared among multiple modules
...
Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass in
a "const *", out of it can comes a non-const pointer unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be used
no matter what the const value is. This prevents every subsystem from
having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
the const value at build time, which having 2 macros would not do
either.
The driver for all of this have been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject, objects
as being "non-mutable". The changes to the kobject and driver core in
this pull request are the result of that, as there are lots of paths
where kobjects and device pointers are not modified at all, so marking
them as "const" allows the compiler to enforce this.
So, a nice side affect of the Rust development effort has been already
to clean up the driver core code to be more obvious about object rules.
All of this has been bike-shedded in quite a lot of detail on lkml with
different names and implementations resulting in the tiny version we
have in here, much better than my original proposal. Lots of subsystem
maintainers have acked the changes as well.
Other than this change, included in here are smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with no
problems, OTHER than some merge issues with other trees that should be
obvious when you hit them (block tree deletes a driver that this tree
modifies, iommufd tree modifies code that this tree also touches). If
there are merge problems with these trees, please let me know.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY5wz3A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yks0ACeKYUlVgCsER8eYW+x18szFa2QTXgAn2h/VhZe
1Fp53boFaQkGBjl8mGF8
=v+FB
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.2-rc1.
The "big" change in here is the addition of a new macro,
container_of_const() that will preserve the "const-ness" of a pointer
passed into it.
The "problem" of the current container_of() macro is that if you pass
in a "const *", out of it can comes a non-const pointer unless you
specifically ask for it. For many usages, we want to preserve the
"const" attribute by using the same call. For a specific example, this
series changes the kobj_to_dev() macro to use it, allowing it to be
used no matter what the const value is. This prevents every subsystem
from having to declare 2 different individual macros (i.e.
kobj_const_to_dev() and kobj_to_dev()) and having the compiler enforce
the const value at build time, which having 2 macros would not do
either.
The driver for all of this have been discussions with the Rust kernel
developers as to how to properly mark driver core, and kobject,
objects as being "non-mutable". The changes to the kobject and driver
core in this pull request are the result of that, as there are lots of
paths where kobjects and device pointers are not modified at all, so
marking them as "const" allows the compiler to enforce this.
So, a nice side affect of the Rust development effort has been already
to clean up the driver core code to be more obvious about object
rules.
All of this has been bike-shedded in quite a lot of detail on lkml
with different names and implementations resulting in the tiny version
we have in here, much better than my original proposal. Lots of
subsystem maintainers have acked the changes as well.
Other than this change, included in here are smaller stuff like:
- kernfs fixes and updates to handle lock contention better
- vmlinux.lds.h fixes and updates
- sysfs and debugfs documentation updates
- device property updates
All of these have been in the linux-next tree for quite a while with
no problems"
* tag 'driver-core-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (58 commits)
device property: Fix documentation for fwnode_get_next_parent()
firmware_loader: fix up to_fw_sysfs() to preserve const
usb.h: take advantage of container_of_const()
device.h: move kobj_to_dev() to use container_of_const()
container_of: add container_of_const() that preserves const-ness of the pointer
driver core: fix up missed drivers/s390/char/hmcdrv_dev.c class.devnode() conversion.
driver core: fix up missed scsi/cxlflash class.devnode() conversion.
driver core: fix up some missing class.devnode() conversions.
driver core: make struct class.devnode() take a const *
driver core: make struct class.dev_uevent() take a const *
cacheinfo: Remove of_node_put() for fw_token
device property: Add a blank line in Kconfig of tests
device property: Rename goto label to be more precise
device property: Move PROPERTY_ENTRY_BOOL() a bit down
device property: Get rid of __PROPERTY_ENTRY_ARRAY_EL*SIZE*()
kernfs: fix all kernel-doc warnings and multiple typos
driver core: pass a const * into of_device_uevent()
kobject: kset_uevent_ops: make name() callback take a const *
kobject: kset_uevent_ops: make filter() callback take a const *
kobject: make kobject_namespace take a const *
...
- Convert flexible array members, fix -Wstringop-overflow warnings,
and fix KCFI function type mismatches that went ignored by
maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook).
- Remove the remaining side-effect users of ksize() by converting
dma-buf, btrfs, and coredump to using kmalloc_size_roundup(),
add more __alloc_size attributes, and introduce full testing
of all allocator functions. Finally remove the ksize() side-effect
so that each allocation-aware checker can finally behave without
exceptions.
- Introduce oops_limit (default 10,000) and warn_limit (default off)
to provide greater granularity of control for panic_on_oops and
panic_on_warn (Jann Horn, Kees Cook).
- Introduce overflows_type() and castable_to_type() helpers for
cleaner overflow checking.
- Improve code generation for strscpy() and update str*() kern-doc.
- Convert strscpy and sigphash tests to KUnit, and expand memcpy
tests.
- Always use a non-NULL argument for prepare_kernel_cred().
- Disable structleak plugin in FORTIFY KUnit test (Anders Roxell).
- Adjust orphan linker section checking to respect CONFIG_WERROR
(Xin Li).
- Make sure siginfo is cleared for forced SIGKILL (haifeng.xu).
- Fix um vs FORTIFY warnings for always-NULL arguments.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOZSOoWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjAAD/0YkvpU7f03f8hcQMJK6wv//24K
AW41hEaBikq9RcmkuvkLLrJRibGgZ5O2xUkUkxRs/HxhkhrZ0kEw8sbwZe8MoWls
F4Y9+TDjsrdHmjhfcBZdLnVxwcKK5wlaEcpjZXtbsfcdhx3TbgcDA23YELl5t0K+
I11j4kYmf9SLl4CwIrSP5iACml8CBHARDh8oIMF7FT/LrjNbM8XkvBcVVT6hTbOV
yjgA8WP2e9GXvj9GzKgqvd0uE/kwPkVAeXLNFWopPi4FQ8AWjlxbBZR0gamA6/EB
d7TIs0ifpVU2JGQaTav4xO6SsFMj3ntoUI0qIrFaTxZAvV4KYGrPT/Kwz1O4SFaG
rN5lcxseQbPQSBTFNG4zFjpywTkVCgD2tZqDwz5Rrmiraz0RyIokCN+i4CD9S0Ds
oEd8JSyLBk1sRALczkuEKo0an5AyC9YWRcBXuRdIHpLo08PsbeUUSe//4pe303cw
0ApQxYOXnrIk26MLElTzSMImlSvlzW6/5XXzL9ME16leSHOIfDeerPnc9FU9Eb3z
ODv22z6tJZ9H/apSUIHZbMciMbbVTZ8zgpkfydr08o87b342N/ncYHZ5cSvQ6DWb
jS5YOIuvl46/IhMPT16qWC8p0bP5YhxoPv5l6Xr0zq0ooEj0E7keiD/SzoLvW+Qs
AHXcibguPRQBPAdiPQ==
=yaaN
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening updates from Kees Cook:
- Convert flexible array members, fix -Wstringop-overflow warnings, and
fix KCFI function type mismatches that went ignored by maintainers
(Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)
- Remove the remaining side-effect users of ksize() by converting
dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
more __alloc_size attributes, and introduce full testing of all
allocator functions. Finally remove the ksize() side-effect so that
each allocation-aware checker can finally behave without exceptions
- Introduce oops_limit (default 10,000) and warn_limit (default off) to
provide greater granularity of control for panic_on_oops and
panic_on_warn (Jann Horn, Kees Cook)
- Introduce overflows_type() and castable_to_type() helpers for cleaner
overflow checking
- Improve code generation for strscpy() and update str*() kern-doc
- Convert strscpy and sigphash tests to KUnit, and expand memcpy tests
- Always use a non-NULL argument for prepare_kernel_cred()
- Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)
- Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
Li)
- Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)
- Fix um vs FORTIFY warnings for always-NULL arguments
* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
ksmbd: replace one-element arrays with flexible-array members
hpet: Replace one-element array with flexible-array member
um: virt-pci: Avoid GCC non-NULL warning
signal: Initialize the info in ksignal
lib: fortify_kunit: build without structleak plugin
panic: Expose "warn_count" to sysfs
panic: Introduce warn_limit
panic: Consolidate open-coded panic_on_warn checks
exit: Allow oops_limit to be disabled
exit: Expose "oops_count" to sysfs
exit: Put an upper limit on how often we can oops
panic: Separate sysctl logic from CONFIG_SMP
mm/pgtable: Fix multiple -Wstringop-overflow warnings
mm: Make ksize() a reporting-only function
kunit/fortify: Validate __alloc_size attribute results
drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
driver core: Add __alloc_size hint to devm allocators
overflow: Introduce overflows_type() and castable_to_type()
coredump: Proactively round up to kmalloc bucket size
...
- More userfaultfs work from Peter Xu.
- Several convert-to-folios series from Sidhartha Kumar and Huang Ying.
- Some filemap cleanups from Vishal Moola.
- David Hildenbrand added the ability to selftest anon memory COW handling.
- Some cpuset simplifications from Liu Shixin.
- Addition of vmalloc tracing support by Uladzislau Rezki.
- Some pagecache folioifications and simplifications from Matthew Wilcox.
- A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it.
- Miguel Ojeda contributed some cleanups for our use of the
__no_sanitize_thread__ gcc keyword. This series shold have been in the
non-MM tree, my bad.
- Naoya Horiguchi improved the interaction between memory poisoning and
memory section removal for huge pages.
- DAMON cleanups and tuneups from SeongJae Park
- Tony Luck fixed the handling of COW faults against poisoned pages.
- Peter Xu utilized the PTE marker code for handling swapin errors.
- Hugh Dickins reworked compound page mapcount handling, simplifying it
and making it more efficient.
- Removal of the autonuma savedwrite infrastructure from Nadav Amit and
David Hildenbrand.
- zram support for multiple compression streams from Sergey Senozhatsky.
- David Hildenbrand reworked the GUP code's R/O long-term pinning so
that drivers no longer need to use the FOLL_FORCE workaround which
didn't work very well anyway.
- Mel Gorman altered the page allocator so that local IRQs can remnain
enabled during per-cpu page allocations.
- Vishal Moola removed the try_to_release_page() wrapper.
- Stefan Roesch added some per-BDI sysfs tunables which are used to
prevent network block devices from dirtying excessive amounts of
pagecache.
- David Hildenbrand did some cleanup and repair work on KSM COW
breaking.
- Nhat Pham and Johannes Weiner have implemented writeback in zswap's
zsmalloc backend.
- Brian Foster has fixed a longstanding corner-case oddity in
file[map]_write_and_wait_range().
- sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
Chen.
- Shiyang Ruan has done some work on fsdax, to make its reflink mode
work better under xfstests. Better, but still not perfect.
- Christoph Hellwig has removed the .writepage() method from several
filesystems. They only need .writepages().
- Yosry Ahmed wrote a series which fixes the memcg reclaim target
beancounting.
- David Hildenbrand has fixed some of our MM selftests for 32-bit
machines.
- Many singleton patches, as usual.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5j6ZwAKCRDdBJ7gKXxA
jkDYAP9qNeVqp9iuHjZNTqzMXkfmJPsw2kmy2P+VdzYVuQRcJgEAgoV9d7oMq4ml
CodAgiA51qwzId3GRytIo/tfWZSezgA=
=d19R
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- More userfaultfs work from Peter Xu
- Several convert-to-folios series from Sidhartha Kumar and Huang Ying
- Some filemap cleanups from Vishal Moola
- David Hildenbrand added the ability to selftest anon memory COW
handling
- Some cpuset simplifications from Liu Shixin
- Addition of vmalloc tracing support by Uladzislau Rezki
- Some pagecache folioifications and simplifications from Matthew
Wilcox
- A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
it
- Miguel Ojeda contributed some cleanups for our use of the
__no_sanitize_thread__ gcc keyword.
This series should have been in the non-MM tree, my bad
- Naoya Horiguchi improved the interaction between memory poisoning and
memory section removal for huge pages
- DAMON cleanups and tuneups from SeongJae Park
- Tony Luck fixed the handling of COW faults against poisoned pages
- Peter Xu utilized the PTE marker code for handling swapin errors
- Hugh Dickins reworked compound page mapcount handling, simplifying it
and making it more efficient
- Removal of the autonuma savedwrite infrastructure from Nadav Amit and
David Hildenbrand
- zram support for multiple compression streams from Sergey Senozhatsky
- David Hildenbrand reworked the GUP code's R/O long-term pinning so
that drivers no longer need to use the FOLL_FORCE workaround which
didn't work very well anyway
- Mel Gorman altered the page allocator so that local IRQs can remnain
enabled during per-cpu page allocations
- Vishal Moola removed the try_to_release_page() wrapper
- Stefan Roesch added some per-BDI sysfs tunables which are used to
prevent network block devices from dirtying excessive amounts of
pagecache
- David Hildenbrand did some cleanup and repair work on KSM COW
breaking
- Nhat Pham and Johannes Weiner have implemented writeback in zswap's
zsmalloc backend
- Brian Foster has fixed a longstanding corner-case oddity in
file[map]_write_and_wait_range()
- sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
Chen
- Shiyang Ruan has done some work on fsdax, to make its reflink mode
work better under xfstests. Better, but still not perfect
- Christoph Hellwig has removed the .writepage() method from several
filesystems. They only need .writepages()
- Yosry Ahmed wrote a series which fixes the memcg reclaim target
beancounting
- David Hildenbrand has fixed some of our MM selftests for 32-bit
machines
- Many singleton patches, as usual
* tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
mm: mmu_gather: allow more than one batch of delayed rmaps
mm: fix typo in struct pglist_data code comment
kmsan: fix memcpy tests
mm: add cond_resched() in swapin_walk_pmd_entry()
mm: do not show fs mm pc for VM_LOCKONFAULT pages
selftests/vm: ksm_functional_tests: fixes for 32bit
selftests/vm: cow: fix compile warning on 32bit
selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
mm,thp,rmap: fix races between updates of subpages_mapcount
mm: memcg: fix swapcached stat accounting
mm: add nodes= arg to memory.reclaim
mm: disable top-tier fallback to reclaim on proactive reclaim
selftests: cgroup: make sure reclaim target memcg is unprotected
selftests: cgroup: refactor proactive reclaim code to reclaim_until()
mm: memcg: fix stale protection of reclaim target memcg
mm/mmap: properly unaccount memory on mas_preallocate() failure
omfs: remove ->writepage
jfs: remove ->writepage
...
A few new APIs here, support for the FSI bus (which is used in some
PowerPC systems) plus a couple of new APIs, one allowing abstractions
built on top of regmap to tell if the regmap can be used in an atomic
context and one providing a callback for an in flight device which can't
do interrupt masking very well.
There's also a fix that I never got round to sending because it really
should be fixed better but that's not happened yet and it does avoid the
problem, the fix was in -next for a long time.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmOXG3wACgkQJNaLcl1U
h9BqHgf/YmdUx3/jh8hmDufqJCKbML0SlIb1ODQlLHsjpMSuCGlmQGSCHa/peVMk
1c6Tn2FboJ5+mHUQixMx6jhlSsJ1fO0i0TbRjj9vL6eLJKCtfdiBS2UmJuGFtyKB
swOISPEsVIWrc2t/e8/DjZ3JznwdFup81vjcYUhlA6Xglk5Ch0szb5+p2ElSWwI9
GA6wDUe0YB3eqU6vSAsjHN/hhUUC2BkGPv1fLzW11kNsoxJbxJ7KsUVmbQQMEMRg
HXXmdlooZqH9og47jGLH+3v3onJb7ZnKkx+wU6no98mb++v0OuiLUzj0IA3TLKk4
OacxbPLBk3cLmpdaPD9eimwV7ZcdVQ==
=a/WH
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A few new APIs here, support for the FSI bus (which is used in some
PowerPC systems) plus a couple of new APIs, one allowing abstractions
built on top of regmap to tell if the regmap can be used in an atomic
context and one providing a callback for an in flight device which
can't do interrupt masking very well.
There's also a fix that I never got round to sending because it really
should be fixed better but that's not happened yet and it does avoid
the problem, the fix was in -next for a long time"
* tag 'regmap-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Add handle_mask_sync() callback
regmap: Add FSI bus support
regmap: add regmap_might_sleep()
regmap-irq: Use the new num_config_regs property in regmap_add_irq_chip_fwnode
- Fix nasty and hard to debug race condition introduced by mistake
in the runtime PM core code and clean up that code somewhat on
top of the fix (Rafael Wysocki).
- Generalize of_perf_domain_get_sharing_cpumask phandle format (Hector
Martin).
- Add new cpufreq driver for Apple SoC CPU P-states (Hector Martin).
- Update Qualcomm cpufreq driver, including:
* CPU clock provider support,
* Generic cleanups or reorganization.
* Potential memleak fix.
* Fix of the return value of cpufreq_driver->get().
(Manivannan Sadhasivam, Chen Hui).
- Update Qualcomm cpufreq driver's DT bindings, including:
* Support for CPU clock provider.
* Missing cache-related properties fixes.
* Support for QDU1000/QRU1000.
(Manivannan Sadhasivam, Rob Herring, Melody Olvera).
- Add support for ti,am625 SoC and enable build of ti-cpufreq for
ARCH_K3 (Dave Gerlach, and Vibhore Vardhan).
- Use flexible array to simplify memory allocation in the tegra186
cpufreq driver (Christophe JAILLET).
- Convert cpufreq statistics code to use sysfs_emit_at() (ye xingchen).
- Allow intel_pstate to use no-HWP mode on Sapphire Rapids (Giovanni
Gherdovich).
- Add missing pci_dev_put() to the amd_freq_sensitivity cpufreq driver
(Xiongfeng Wang).
- Initialize the kobj_unregister completion before calling
kobject_init_and_add() in the cpufreq core code (Yongqiang Liu).
- Defer setting boost MSRs in the ACPI cpufreq driver (Stuart Hayes,
Nathan Chancellor).
- Make intel_pstate accept initial EPP value of 0x80 (Srinivas
Pandruvada).
- Make read-only array sys_clk_src in the SPEAr cpufreq driver static
(Colin Ian King).
- Make array speeds in the longhaul cpufreq driver static (Colin Ian
King).
- Use str_enabled_disabled() helper in the ACPI cpufreq driver (Andy
Shevchenko).
- Drop a reference to CVS from cpufreq documentation (Conghui Wang).
- Improve kernel messages printed by the PSCI cpuidle driver (Ulf
Hansson).
- Make the DT cpuidle driver return the correct number of parsed idle
states, clean it up and clarify a comment in it (Ulf Hansson).
- Modify the tasks freezing code to avoid using pr_cont() and refine an
error message printed by it (Rafael Wysocki).
- Make the hibernation core code complain about memory map mismatches
during resume to help diagnostics (Xueqin Luo).
- Fix mistake in a kerneldoc comment in the hibernation code (xiongxin).
- Reverse the order of performance and enabling operations in the
generic power domains code (Abel Vesa).
- Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in the
generic power domains code (Abel Vesa).
- Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
Guo).
- Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo).
- Drop generic power domain status manipulation during hibernate
restore (Shawn Guo).
- Fix compiler warnings with make W=1 in the idle_inject power capping
driver (Srinivas Pandruvada).
- Use kstrtobool() instead of strtobool() in the power capping sysfs
interface (Christophe JAILLET).
- Add SCMI Powercap based power capping driver (Cristian Marussi).
- Add Emerald Rapids support to the intel-uncore-freq driver (Artem
Bityutskiy).
- Repair slips in kernel-doc comments in the generic notifier code
(Lukas Bulwahn).
- Fix several DT issues in the OPP library reorganize code around
opp-microvolt-<named> DT property (Viresh Kumar).
- Allow any of opp-microvolt, opp-microamp, or opp-microwatt properties
to be present without the others present (James Calligeros).
- Fix clock-latency-ns property in DT example (Serge Semin).
- Add a private governor_data for devfreq governors (Kant Fan).
- Reorganize devfreq code to use device_match_of_node() and
devm_platform_get_and_ioremap_resource() instead of open coding
them (ye xingchen, Minghao Chi).
- Make cpupower choose base_cpu to display default cpupower details
instead of picking CPU 0 (Saket Kumar Bhaskar).
- Add Georgian translation to cpupower documentation (Zurab
Kargareteli).
- Introduce powercap intel-rapl library, powercap-info command, and
RAPL monitor into cpupower (Thomas Renninger).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmOXWKsSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxzKsP/jEKIwMSG4KHyXjJSDopFvppA13468ma
ao2G5EnbtPgZOiN66BOcAfPB+pzBM8WBCnpy8sfNzcpQSaGaJr+flQDqQV/1QG/H
GNQ0MUYN6TF/zfz/hKawDtQJihw9OrJgqQfUJIyc7Djo8ntSBu299XAt3X8VB5D1
azU1WwfOnEhr8evkqd8DS81fwm6b5cWvLkfG3Qvk2VxlwC/BCFdygqNjwOXmMNMb
DPYWv1xoVhSKzJsPHbAtzFq6veLsw2Glf2xPDyjf9ZPB0ujrftFoRoeCrC/neBDb
5bB4P5Injg3IB7SAHf97XgGAH2biUKwVnQhVUOTWXdQ7u/xDbH5fOLFJkBOBP6n6
gZiEOqzg5wVXk+ZfKx4fjsf4LvB1r+nM2tmx/bzhxyt9UDLUfB9kY0PMXLRuYqyn
ITvk00CJ/hkwD98pql4pCnc1PYZLUv/CHiaqTjwwOKuue3Jb3OTSPrSWtYIyTyNx
s2eBz/CxGSg4Q25u3loIiNVAaCOul6SZq+Iz6BlVP8sy3q62LWi8mp5b+kb8HFWH
lk8GpavqOLF6brxpPL/n0vav2bCmdwblMjTcowtGbLgiGSZaD97AkPFTN2H7tGPv
iUZDTdK3H24aqY62yKzo2HK3PhwNCg06gF0VTsuvJ7iIQmfeUpLjB/3qGeJNjlEQ
20fQ6YU/NytB
=B9Uu
-----END PGP SIGNATURE-----
Merge tag 'pm-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These include two new drivers (cpufreq driver for Apple SoC CPU
P-states and the SCMI Powercap based power capping driver), other new
hardware support and driver extensions (Qualcomm cpufreq driver and
its DT bindings, TI cpufreq driver, intel_pstate, intel-uncore-freq),
a bunch of fixes and cleanups all over and a cpupower utility update
including new features related to RAPL support.
Specifics:
- Fix nasty and hard to debug race condition introduced by mistake in
the runtime PM core code and clean up that code somewhat on top of
the fix (Rafael Wysocki)
- Generalize of_perf_domain_get_sharing_cpumask phandle format
(Hector Martin)
- Add new cpufreq driver for Apple SoC CPU P-states (Hector Martin)
- Update Qualcomm cpufreq driver (Manivannan Sadhasivam, Chen Hui):
- CPU clock provider support
- Generic cleanups or reorganization
- Potential memleak fix
- Fix of the return value of cpufreq_driver->get()
- Update Qualcomm cpufreq driver's DT bindings (Manivannan
Sadhasivam, Rob Herring, Melody Olvera):
- Support for CPU clock provider
- Missing cache-related properties fixes
- Support for QDU1000/QRU1000
- Add support for ti,am625 SoC and enable build of ti-cpufreq for
ARCH_K3 (Dave Gerlach, and Vibhore Vardhan)
- Use flexible array to simplify memory allocation in the tegra186
cpufreq driver (Christophe JAILLET)
- Convert cpufreq statistics code to use sysfs_emit_at() (ye
xingchen)
- Allow intel_pstate to use no-HWP mode on Sapphire Rapids (Giovanni
Gherdovich)
- Add missing pci_dev_put() to the amd_freq_sensitivity cpufreq
driver (Xiongfeng Wang)
- Initialize the kobj_unregister completion before calling
kobject_init_and_add() in the cpufreq core code (Yongqiang Liu)
- Defer setting boost MSRs in the ACPI cpufreq driver (Stuart Hayes,
Nathan Chancellor)
- Make intel_pstate accept initial EPP value of 0x80 (Srinivas
Pandruvada)
- Make read-only array sys_clk_src in the SPEAr cpufreq driver static
(Colin Ian King)
- Make array speeds in the longhaul cpufreq driver static (Colin Ian
King)
- Use str_enabled_disabled() helper in the ACPI cpufreq driver (Andy
Shevchenko)
- Drop a reference to CVS from cpufreq documentation (Conghui Wang)
- Improve kernel messages printed by the PSCI cpuidle driver (Ulf
Hansson)
- Make the DT cpuidle driver return the correct number of parsed idle
states, clean it up and clarify a comment in it (Ulf Hansson)
- Modify the tasks freezing code to avoid using pr_cont() and refine
an error message printed by it (Rafael Wysocki)
- Make the hibernation core code complain about memory map mismatches
during resume to help diagnostics (Xueqin Luo)
- Fix mistake in a kerneldoc comment in the hibernation code
(xiongxin)
- Reverse the order of performance and enabling operations in the
generic power domains code (Abel Vesa)
- Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in
the generic power domains code (Abel Vesa)
- Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
Guo)
- Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo)
- Drop generic power domain status manipulation during hibernate
restore (Shawn Guo)
- Fix compiler warnings with make W=1 in the idle_inject power
capping driver (Srinivas Pandruvada)
- Use kstrtobool() instead of strtobool() in the power capping sysfs
interface (Christophe JAILLET)
- Add SCMI Powercap based power capping driver (Cristian Marussi)
- Add Emerald Rapids support to the intel-uncore-freq driver (Artem
Bityutskiy)
- Repair slips in kernel-doc comments in the generic notifier code
(Lukas Bulwahn)
- Fix several DT issues in the OPP library reorganize code around
opp-microvolt-<named> DT property (Viresh Kumar)
- Allow any of opp-microvolt, opp-microamp, or opp-microwatt
properties to be present without the others present (James
Calligeros)
- Fix clock-latency-ns property in DT example (Serge Semin)
- Add a private governor_data for devfreq governors (Kant Fan)
- Reorganize devfreq code to use device_match_of_node() and
devm_platform_get_and_ioremap_resource() instead of open coding
them (ye xingchen, Minghao Chi)
- Make cpupower choose base_cpu to display default cpupower details
instead of picking CPU 0 (Saket Kumar Bhaskar)
- Add Georgian translation to cpupower documentation (Zurab
Kargareteli)
- Introduce powercap intel-rapl library, powercap-info command, and
RAPL monitor into cpupower (Thomas Renninger)"
* tag 'pm-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
PM: runtime: Adjust white space in the core code
cpufreq: Remove CVS version control contents from documentation
cpufreq: stats: Convert to use sysfs_emit_at() API
cpufreq: ACPI: Only set boost MSRs on supported CPUs
PM: sleep: Refine error message in try_to_freeze_tasks()
PM: sleep: Avoid using pr_cont() in the tasks freezing code
PM: runtime: Relocate rpm_callback() right after __rpm_callback()
PM: runtime: Do not call __rpm_callback() from rpm_idle()
PM / devfreq: event: use devm_platform_get_and_ioremap_resource()
PM / devfreq: event: Use device_match_of_node()
PM / devfreq: Use device_match_of_node()
powercap: idle_inject: Fix warnings with make W=1
PM: hibernate: Complain about memory map mismatches during resume
dt-bindings: cpufreq: cpufreq-qcom-hw: Add QDU1000/QRU1000 cpufreq
cpufreq: tegra186: Use flexible array to simplify memory allocation
cpupower: rapl monitor - shows the used power consumption in uj for each rapl domain
cpupower: Introduce powercap intel-rapl library and powercap-info command
cpupower: Add Georgian translation
cpufreq: intel_pstate: Add Sapphire Rapids support in no-HWP mode
cpufreq: amd_freq_sensitivity: Add missing pci_dev_put()
...
- Core:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for PCI/MSI[-X]
and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows device
manufactures to provide implementation defined storage for MSI messages
contrary to the uniform and specification defined storage mechanisms for
PCI/MSI and PCI/MSI-X. IMS not only allows to overcome the size limitations
of the MSI-X table, but also gives the device manufacturer the freedom to
store the message in arbitrary places, even in host memory which is shared
with the device.
There have been several attempts to glue this into the current MSI code,
but after lengthy discussions it turned out that there is a fundamental
design problem in the current PCI/MSI-X implementation. This needs some
historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management was
completely different from what we have today in the actively developed
architectures. Interrupt management was completely architecture specific
and while there were attempts to create common infrastructure the
commonalities were rudimentary and just providing shared data structures and
interfaces so that drivers could be written in an architecture agnostic
way.
The initial PCI/MSI[-X] support obviously plugged into this model which
resulted in some basic shared infrastructure in the PCI core code for
setting up MSI descriptors, which are a pure software construct for holding
data relevant for a particular MSI interrupt, but the actual association to
Linux interrupts was completely architecture specific. This model is still
supported today to keep museum architectures and notorious stranglers
alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel,
which was creating yet another architecture specific mechanism and resulted
in an unholy mess on top of the existing horrors of x86 interrupt handling.
The x86 interrupt management code was already an incomprehensible maze of
indirections between the CPU vector management, interrupt remapping and the
actual IO/APIC and PCI/MSI[-X] implementation.
At roughly the same time ARM struggled with the ever growing SoC specific
extensions which were glued on top of the architected GIC interrupt
controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86 vector
domain and interrupt remapping and also allowed ARM to handle the zoo of
SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that it is
not available the PCI/MSI[-X] domains have the vector domain as their
parent. This reduced the required interaction between the domains pretty
much to the initialization phase where it is obviously required to
establish the proper parent relation ship in the components of the
hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware
it's clear that the actual PCI/MSI[-X] interrupt controller is not a global
entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the easy
solution of providing "global" PCI/MSI domains which was possible because
the PCI/MSI[-X] handling is uniform across the devices. This also allowed
to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in
turn made it simple to keep the existing architecture specific management
alive.
A similar problem was created in the ARM world with support for IP block
specific message storage. Instead of going all the way to stack a IP block
specific domain on top of the generic MSI domain this ended in a construct
which provides a "global" platform MSI domain which allows overriding the
irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the MSI
infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into the
existing infrastructure. Some of this just works by chance on particular
platforms but will fail in hard to diagnose ways when the driver is used
on platforms where the underlying MSI interrupt management code does not
expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront to
avoid resource exhaustion. They are expanded at run-time when the guest
actually tries to use them. The way how this is implemented is that the
host disables MSI-X and then re-enables it with a larger number of
vectors again. That works by chance because most device drivers set up
all interrupts before the device actually will utilize them. But that's
not universally true because some drivers allocate a large enough number
of vectors but do not utilize them until it's actually required,
e.g. for acceleration support. But at that point other interrupts of the
device might be in active use and the MSI-X disable/enable dance can
just result in losing interrupts and therefore hard to diagnose subtle
problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS
is not longer providing a uniform storage and configuration model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for PCI/IMS.
PCI/IMS has been verified with the work in progress IDXD driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
- Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUsygTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoYXiD/40tXKzCzf0qFIqUlZLia1N3RRrwrNC
DVTixuLtR9MrjwE+jWLQILa85SHInV8syXHSd35SzhsGDxkURFGi+HBgVWmysODf
br9VSh3Gi+kt7iXtIwAg8WNWviGNmS3kPksxCko54F0YnJhMY5r5bhQVUBQkwFG2
wES1C9Uzd4pdV2bl24Z+WKL85cSmZ+pHunyKw1n401lBABXnTF9c4f13zC14jd+y
wDxNrmOxeL3mEH4Pg6VyrDuTOURSf3TjJjeEq3EYqvUo0FyLt9I/cKX0AELcZQX7
fkRjrQQAvXNj39RJfeSkojDfllEPUHp7XSluhdBu5aIovSamdYGCDnuEoZ+l4MJ+
CojIErp3Dwj/uSaf5c7C3OaDAqH2CpOFWIcrUebShJE60hVKLEpUwd6W8juplaoT
gxyXRb1Y+BeJvO8VhMN4i7f3232+sj8wuj+HTRTTbqMhkElnin94tAx8rgwR1sgR
BiOGMJi4K2Y8s9Rqqp0Dvs01CW4guIYvSR4YY+WDbbi1xgiev89OYs6zZTJCJe4Y
NUwwpqYSyP1brmtdDdBOZLqegjQm+TwUb6oOaasFem4vT1swgawgLcDnPOx45bk5
/FWt3EmnZxMz99x9jdDn1+BCqAZsKyEbEY1avvhPVMTwoVIuSX2ceTBMLseGq+jM
03JfvdxnueM3gw==
=9erA
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt core and driver subsystem:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for
PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows
device manufactures to provide implementation defined storage for MSI
messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified
message store which is uniform accross all devices). The PCI/MSI[-X]
uniformity allowed us to get away with "global" PCI/MSI domains.
IMS not only allows to overcome the size limitations of the MSI-X
table, but also gives the device manufacturer the freedom to store the
message in arbitrary places, even in host memory which is shared with
the device.
There have been several attempts to glue this into the current MSI
code, but after lengthy discussions it turned out that there is a
fundamental design problem in the current PCI/MSI-X implementation.
This needs some historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management
was completely different from what we have today in the actively
developed architectures. Interrupt management was completely
architecture specific and while there were attempts to create common
infrastructure the commonalities were rudimentary and just providing
shared data structures and interfaces so that drivers could be written
in an architecture agnostic way.
The initial PCI/MSI[-X] support obviously plugged into this model
which resulted in some basic shared infrastructure in the PCI core
code for setting up MSI descriptors, which are a pure software
construct for holding data relevant for a particular MSI interrupt,
but the actual association to Linux interrupts was completely
architecture specific. This model is still supported today to keep
museum architectures and notorious stragglers alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the
kernel, which was creating yet another architecture specific mechanism
and resulted in an unholy mess on top of the existing horrors of x86
interrupt handling. The x86 interrupt management code was already an
incomprehensible maze of indirections between the CPU vector
management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X]
implementation.
At roughly the same time ARM struggled with the ever growing SoC
specific extensions which were glued on top of the architected GIC
interrupt controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86
vector domain and interrupt remapping and also allowed ARM to handle
the zoo of SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that
it is not available the PCI/MSI[-X] domains have the vector domain as
their parent. This reduced the required interaction between the
domains pretty much to the initialization phase where it is obviously
required to establish the proper parent relation ship in the
components of the hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the
hardware it's clear that the actual PCI/MSI[-X] interrupt controller
is not a global entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the
easy solution of providing "global" PCI/MSI domains which was possible
because the PCI/MSI[-X] handling is uniform across the devices. This
also allowed to keep the existing PCI/MSI[-X] infrastructure mostly
unchanged which in turn made it simple to keep the existing
architecture specific management alive.
A similar problem was created in the ARM world with support for IP
block specific message storage. Instead of going all the way to stack
a IP block specific domain on top of the generic MSI domain this ended
in a construct which provides a "global" platform MSI domain which
allows overriding the irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the
MSI infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into
the existing infrastructure. Some of this just works by chance on
particular platforms but will fail in hard to diagnose ways when the
driver is used on platforms where the underlying MSI interrupt
management code does not expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront
to avoid resource exhaustion. They are expanded at run-time when the
guest actually tries to use them. The way how this is implemented is
that the host disables MSI-X and then re-enables it with a larger
number of vectors again. That works by chance because most device
drivers set up all interrupts before the device actually will utilize
them. But that's not universally true because some drivers allocate a
large enough number of vectors but do not utilize them until it's
actually required, e.g. for acceleration support. But at that point
other interrupts of the device might be in active use and the MSI-X
disable/enable dance can just result in losing interrupts and
therefore hard to diagnose subtle problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact
that IMS is not longer providing a uniform storage and configuration
model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per
device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for
PCI/IMS. PCI/IMS has been verified with the work in progress IDXD
driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place"
* tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits)
irqchip/ti-sci-inta: Fix kernel doc
irqchip/gic-v2m: Mark a few functions __init
irqchip/gic-v2m: Include arm-gic-common.h
irqchip/irq-mvebu-icu: Fix works by chance pointer assignment
iommu/amd: Enable PCI/IMS
iommu/vt-d: Enable PCI/IMS
x86/apic/msi: Enable PCI/IMS
PCI/MSI: Provide pci_ims_alloc/free_irq()
PCI/MSI: Provide IMS (Interrupt Message Store) support
genirq/msi: Provide constants for PCI/IMS support
x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN
PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X
PCI/MSI: Provide prepare_desc() MSI domain op
PCI/MSI: Split MSI-X descriptor setup
genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN
genirq/msi: Provide msi_domain_alloc_irq_at()
genirq/msi: Provide msi_domain_ops:: Prepare_desc()
genirq/msi: Provide msi_desc:: Msi_data
genirq/msi: Provide struct msi_map
x86/apic/msi: Remove arch_create_remap_msi_irq_domain()
...
There are few major updates in the SoC specific drivers, mainly the usual
reworks and support for variants of the existing SoC. While this remains
Arm centric for the most part, the branch now also contains updates to
risc-v and loongarch specific code in drivers/soc/.
Notable changes include:
- Support for the newly added Qualcomm Snapdragon variants
(MSM8956, MSM8976, SM6115, SM4250, SM8150, SA8155 and SM8550) in the
soc ID, rpmh, rpm, spm and powerdomain drivers.
- Documentation for the somewhat controversial qcom,board-id
properties that are required for booting a number of machines
- A new SoC identification driver for the loongson-2 (loongarch)
platform
- memory controller updates for stm32, tegra, and renesas.
- a new DT binding to better describe LPDDR2/3/4/5 chips in
the memory controller subsystem
- Updates for Tegra specific drivers across multiple subsystems,
improving support for newer SoCs and better identification
- Minor fixes for Broadcom, Freescale, Apple, Renesas, Sifive,
TI, Mediatek and Marvell SoC drivers
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmOSAZ8ACgkQmmx57+YA
GNmoDw/9Hdz2rx6TtdjA2eMKFt97bK0EgrQADT1d4lPQzXzZzFDC9ZxL0bwZRujZ
b8Q6WrMMgcRiWmzmRlxQWMWEdBU8Y0OzeYlo4lbyCSOV+UA2OA/eu6rm0chapBgM
1/lkquYLUUcB31wg+NmADoKy5Ejxj9SL1Va36Nvng4YpHDrYHKt4gPyCr/EV+KRO
Q8JpH7vEzQ0P5CGUzeri2UlYWDdF1GXmObqQGF8pq9s6Qz4ACe63r+eJFXAQFiXK
xewRK7PuvqmQWLVaEnN8dAcSna5P4aIGKOVjQyZjCCp6qTvfm4d2hxTl4dt9gVtt
vbQPiPQ5ORRzeMmW6wHxSIdy2QCa9CKQDXuMRoOWHfGMrAZQaxruISpcmHYv9Ug+
nSfedIEtxtmpGK2SZ1Mvndkezbb0o5QXZF4+kxqpiE/EaxVWmxiecmrUqyvJ5RVv
RuaZeMQpeOaWElnxb2P/T5uLuoHGhDdZ98HXICuCWPAitvA2rRK4Rv3dqTeclPLa
HR9gVYgZK3CSj+e9xbe5uczIc664bscRl9unghtB3UWkGTiLt2rroX4T2pTU/2xf
YvzDHC+f42NEkXUzcs4cZ87R8iY2hr0LmePY5/lqF9k6qx0Rc3syNc7q4N4EBxGC
2y5dDpKXfFL6fEV4YNeGpNcrwmCwnNppcePjmNvgrdtqmUUB/mY=
=heNV
-----END PGP SIGNATURE-----
Merge tag 'soc-drivers-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC driver updates from Arnd Bergmann:
"There are few major updates in the SoC specific drivers, mainly the
usual reworks and support for variants of the existing SoC. While this
remains Arm centric for the most part, the branch now also contains
updates to risc-v and loongarch specific code in drivers/soc/.
Notable changes include:
- Support for the newly added Qualcomm Snapdragon variants (MSM8956,
MSM8976, SM6115, SM4250, SM8150, SA8155 and SM8550) in the soc ID,
rpmh, rpm, spm and powerdomain drivers.
- Documentation for the somewhat controversial qcom,board-id
properties that are required for booting a number of machines
- A new SoC identification driver for the loongson-2 (loongarch)
platform
- memory controller updates for stm32, tegra, and renesas.
- a new DT binding to better describe LPDDR2/3/4/5 chips in the
memory controller subsystem
- Updates for Tegra specific drivers across multiple subsystems,
improving support for newer SoCs and better identification
- Minor fixes for Broadcom, Freescale, Apple, Renesas, Sifive, TI,
Mediatek and Marvell SoC drivers"
* tag 'soc-drivers-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (137 commits)
soc: qcom: socinfo: Add SM6115 / SM4250 SoC IDs to the soc_id table
dt-bindings: arm: qcom,ids: Add SoC IDs for SM6115 / SM4250 and variants
soc: qcom: socinfo: Add SM8150 and SA8155 SoC IDs to the soc_id table
dt-bindings: arm: qcom,ids: Add SoC IDs for SM8150 and SA8155
dt-bindings: soc: qcom: apr: document generic qcom,apr compatible
soc: qcom: Select REMAP_MMIO for ICC_BWMON driver
soc: qcom: Select REMAP_MMIO for LLCC driver
soc: qcom: rpmpd: Add SM4250 support
dt-bindings: power: rpmpd: Add SM4250 support
dt-bindings: soc: qcom: aoss: Add compatible for SM8550
soc: qcom: llcc: Add configuration data for SM8550
dt-bindings: arm: msm: Add LLCC compatible for SM8550
soc: qcom: llcc: Add v4.1 HW version support
soc: qcom: socinfo: Add SM8550 ID
soc: qcom: rpmh-rsc: Avoid unnecessary checks on irq-done response
soc: qcom: rpmh-rsc: Add support for RSC v3 register offsets
soc: qcom: rpmhpd: Add SM8550 power domains
dt-bindings: power: rpmpd: Add SM8550 to rpmpd binding
soc: qcom: socinfo: Add MSM8956/76 SoC IDs to the soc_id table
dt-bindings: arm: qcom,ids: Add SoC IDs for MSM8956 and MSM8976
...
Merge cpuidle changes, updates related to system sleep amd generic power
domains code fixes for 6.2-rc1:
- Improve kernel messages printed by the cpuidle PCI driver (Ulf
Hansson).
- Make the DT cpuidle driver return the correct number of parsed idle
states, clean it up and clarify a comment in it (Ulf Hansson).
- Modify the tasks freezing code to avoid using pr_cont() and refine an
error message printed by it (Rafael Wysocki).
- Make the hibernation core code complain about memory map mismatches
during resume to help diagnostics (Xueqin Luo).
- Fix mistake in a kerneldoc comment in the hibernation code (xiongxin).
- Reverse the order of performance and enabling operations in the
generic power domains code (Abel Vesa).
- Power off[on] domains in hibernate .freeze[thaw]_noirq hook of in the
generic power domains code (Abel Vesa).
- Consolidate genpd_restore_noirq() and genpd_resume_noirq() (Shawn
Guo).
- Pass generic PM noirq hooks to genpd_finish_suspend() (Shawn Guo).
- Drop generic power domain status manipulation during hibernate
restore (Shawn Guo).
* pm-cpuidle:
cpuidle: dt: Clarify a comment and simplify code in dt_init_idle_driver()
cpuidle: dt: Return the correct numbers of parsed idle states
cpuidle: psci: Extend information in log about OSI/PC mode
* pm-sleep:
PM: sleep: Refine error message in try_to_freeze_tasks()
PM: sleep: Avoid using pr_cont() in the tasks freezing code
PM: hibernate: Complain about memory map mismatches during resume
PM: hibernate: Fix mistake in kerneldoc comment
* pm-domains:
PM: domains: Reverse the order of performance and enabling ops
PM: domains: Power off[on] domain in hibernate .freeze[thaw]_noirq hook
PM: domains: Consolidate genpd_restore_noirq() and genpd_resume_noirq()
PM: domains: Pass generic PM noirq hooks to genpd_finish_suspend()
PM: domains: Drop genpd status manipulation for hibernate restore
utsrelease.h is potentially generated on each build.
By removing this unused include we can get rid of some spurious
recompilations.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Provide a public callback handle_mask_sync() that drivers can use when
they have more complex IRQ masking logic. The default implementation is
regmap_irq_handle_mask_sync(), used if the chip doesn't provide its own
callback.
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/e083474b3d467a86e6cb53da8072de4515bd6276.1669100542.git.william.gray@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Some inconsistent usage of white space in the PM-runtime core code
causes that code to be somewhat harder to read that it would have
been otherwise, so adjust the white space in there to be more
consistent with the rest of the code.
No expected functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Use fwnode_handle_put() on the node pointer to release the refcount.
Change fwnode_handle_node() to fwnode_handle_put().
Fixes: 233872585d ("device property: Add fwnode_get_next_parent()")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Daniel Scally <djrscally@gmail.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20221207112219.2652411-1-linmq006@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
to_fw_sysfs() was changed in commit 23680f0b7d ("driver core: make
struct class.dev_uevent() take a const *") to pass in a const pointer
but not pass it back out to handle some changes in the driver core.
That isn't the best idea as it could cause problems if used incorrectly,
so switch to use the container_of_const() macro instead which will
preserve the const status of the pointer and enforce it by the compiler.
Fixes: 23680f0b7d ("driver core: make struct class.dev_uevent() take a const *")
Cc: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Russ Weight <russell.h.weight@intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221205121206.166576-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Switch to the new domain id aware interfaces to phase out the previous
ones. No functional change.
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221124230314.513924920@linutronix.de
Because rpm_callback() is a wrapper around __rpm_callback(), and the
only caller of it after the change eliminating an invocation of it
from rpm_idle(), move the former next to the latter to make the code
a bit easier to follow.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Calling __rpm_callback() from rpm_idle() after adding device links
support to the former is a clear mistake.
Not only it causes rpm_idle() to carry out unnecessary actions, but it
is also against the assumption regarding the stability of PM-runtime
status across __rpm_callback() invocations, because rpm_suspend() and
rpm_resume() may run in parallel with __rpm_callback() when it is called
by rpm_idle() and the device's PM-runtime status can be updated by any
of them.
Fixes: 21d5c57b37 ("PM / runtime: Use device links")
Link: https://lore.kernel.org/linux-pm/36aed941-a73e-d937-2721-4f0decd61ce0@quicinc.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Merge series from Eddie James <eajames@linux.ibm.com>:
The SBEFIFO hardware can now be attached over a new I2C endpoint interface
called the I2C Responder (I2CR). In order to use the existing SBEFIFO
driver, add a regmap driver for the FSI bus and an endpoint driver for the
I2CR. Then, refactor the SBEFIFO and OCC drivers to clean up and use the
new regmap driver or the I2CR interface.
This branch just has the regmap change so it can be shared with the FSI
code.
The ->set_performance_state() needs to be called before ->power_on()
when a genpd is powered on, and after ->power_off() when a genpd is
powered off. Do this in order to let the provider know to which
performance state to power on the genpd, on the power on sequence, and
also to maintain the performance for that genpd until after powering off,
on power off sequence.
There is no scenario where a consumer would need its genpd enabled and
then its performance state increased. Instead, in every scenario, the
consumer needs the genpd to be enabled from the start at a specific
performance state.
And same logic applies to the powering down. No consumer would need its
genpd performance state dropped right before powering down.
Now, there are currently two vendors which use ->set_performance_state()
in their genpd providers. One of them is Tegra, but the only genpd provider
(PMC) that makes use of ->set_performance_state() doesn't implement the
->power_on() or ->power_off(), and so it will not be affected by the ops
reversal.
The other vendor that uses it is Qualcomm, in multiple genpd providers
actually (RPM, RPMh and CPR). But all Qualcomm genpd providers that make
use of ->set_performance_state() need the order between enabling ops and
the performance setting op to be reversed. And the reason for that is that
it currently translates into two different voltages in order to power on
a genpd to a specific performance state. Basically, ->power_on() switches
to the minimum (enabling) voltage for that genpd, and then
->set_performance_state() sets it to the voltage level required by the
consumer.
By reversing the call order, we rely on the provider to know what to do
on each call, but most popular usecase is to cache the performance state
and postpone the voltage setting until the ->power_on() gets called.
As for the reason of still needing the ->power_on() and ->power_off() for a
provider which could get away with just having ->set_performance_state()
implemented, there are consumers that do not (nor should) provide an
opp-table. For those consumers, ->set_performance_state() will not be
called, and so they will enable the genpd to its minimum performance state
by a ->power_on() call. Same logic goes for the disabling.
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The dev_uevent() in struct class should not be modifying the device that
is passed into it, so mark it as a const * and propagate the function
signature changes out into all relevant subsystems that use this
callback.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Johan Hovold <johan@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: Raed Salem <raeds@nvidia.com>
Cc: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Avihai Horon <avihaih@nvidia.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Jakob Koschel <jakobkoschel@gmail.com>
Cc: Antoine Tenart <atenart@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Wang Yufen <wangyufen@huawei.com>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-media@vger.kernel.org
Cc: linux-nvme@lists.infradead.org
Cc: linux-pm@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Sebastian Reichel <sre@kernel.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221123122523.1332370-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fw_token is used for DT/ACPI systems to identify CPUs sharing caches.
For DT based systems, fw_token is set to a pointer to a DT node.
commit 3da72e1837 ("cacheinfo: Decrement refcount in
cache_setup_of_node()")
doesn't increment the refcount of fw_token anymore in
cache_setup_of_node(). fw_token is indeed used as a token and not
as a (struct device_node*), so no reference to fw_token should be
kept.
However, [1] is triggered when hotplugging a CPU multiple times
since cache_shared_cpu_map_remove() decrements the refcount to
fw_token at each CPU unplugging, eventually reaching 0.
Remove of_node_put() for fw_token in cache_shared_cpu_map_remove().
[1]
------------[ cut here ]------------
refcount_t: saturated; leaking memory.
WARNING: CPU: 4 PID: 32 at lib/refcount.c:22 refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
Modules linked in:
CPU: 4 PID: 32 Comm: cpuhp/4 Tainted: G W 6.1.0-rc1-14091-g9fdf2ca7b9c8 #76
Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Oct 31 2022
pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
lr : refcount_warn_saturate (lib/refcount.c:22 (discriminator 3))
[...]
Call trace:
[...]
of_node_release (drivers/of/dynamic.c:335)
kobject_put (lib/kobject.c:677 lib/kobject.c:704 ./include/linux/kref.h:65 lib/kobject.c:721)
of_node_put (drivers/of/dynamic.c:49)
free_cache_attributes.part.0 (drivers/base/cacheinfo.c:712)
cacheinfo_cpu_pre_down (drivers/base/cacheinfo.c:718)
cpuhp_invoke_callback (kernel/cpu.c:247 (discriminator 4))
cpuhp_thread_fun (kernel/cpu.c:785)
smpboot_thread_fn (kernel/smpboot.c:164 (discriminator 3))
kthread (kernel/kthread.c:376)
ret_from_fork (arch/arm64/kernel/entry.S:861)
---[ end trace 0000000000000000 ]---
Fixes: 3da72e1837 ("cacheinfo: Decrement refcount in cache_setup_of_node()")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://lore.kernel.org/r/20221116094958.2141072-1-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Seems the blank line to separate entries in Kconfig was missing.
Add it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/20221122133600.49897-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the fwnode_property_match_string() the goto label out has
an additional task. Rename the label to be more precise on
what is going to happen if goto it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221122133600.49897-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The name() callback in struct kset_uevent_ops does not modify the
kobject passed into it, so make the pointer const to enforce this
restriction. When doing so, fix up the single existing name() callback
to have the correct signature to preserve the build.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20221121094649.1556002-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The filter() callback in struct kset_uevent_ops does not modify the
kobject passed into it, so make the pointer const to enforce this
restriction. When doing so, fix up all existing filter() callbacks to
have the correct signature to preserve the build.
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Christian König <christian.koenig@amd.com> for the changes to
Link: https://lore.kernel.org/r/20221121094649.1556002-3-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The call, kobject_get_ownership(), does not modify the kobject passed
into it, so make it const. This propagates down into the kobj_type
function callbacks so make the kobject passed into them also const,
ensuring that nothing in the kobject is being changed here.
This helps make it more obvious what calls and callbacks do, and do not,
modify structures passed to them.
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Roopa Prabhu <roopa@nvidia.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: linux-nfs@vger.kernel.org
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20221121094649.1556002-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the dawn of MMIO gpio-regmap users, it is desirable to let
gpio-regmap ask the regmap if it might sleep during an access so
it can pass that information to gpiochip. Add a new regmap_might_sleep()
to query the regmap.
Signed-off-by: Michael Walle <michael@walle.cc>
Link: https://lore.kernel.org/r/20221121150843.1562603-1-michael@walle.cc
Signed-off-by: Mark Brown <broonie@kernel.org>
Adjust to reality and remove another layer of pointless Kconfig
indirection. CONFIG_GENERIC_MSI_IRQ is good enough to serve
all purposes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122014.524842979@linutronix.de
When a range of descriptors is freed then all of them are not associated to
a linux interrupt. Remove the filter and add a warning to the free function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ashok Raj <ashok.raj@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20221111122013.888850936@linutronix.de
Not only platform devices described by OF have named interrupts, but
devices described by ACPI also have named interrupts. The fwnode is an
abstraction to different standards, and using fwnode_irq_get_byname can
support more devices.
Signed-off-by: Soha Jin <soha@lohu.info>
Tested-by: Wende Tan <twd2.me@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a driver registers with a bus, it will attempt to match with every
device on the bus through the __driver_attach() function. Currently, if
the bus_type.match() function encounters an error that is not
-EPROBE_DEFER, __driver_attach() will return a negative error code, which
causes the driver registration logic to stop trying to match with the
remaining devices on the bus.
This behavior is not correct; a failure while matching a driver to a
device does not mean that the driver won't be able to match and bind
with other devices on the bus. Update the logic in __driver_attach()
to reflect this.
Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
strtobool() is the same as kstrtobool().
However, the latter is more used within the kernel.
In order to remove strtobool() and slightly simplify kstrtox.h, switch to
the other function name.
While at it, include the corresponding header file (<linux/kstrtox.h>)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/02ba683a5c0716638ad8ca11e8b0fdca97c4f294.1667336095.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Refcounts to DT nodes are only incremented in the function
and never decremented. Decrease the refcounts when necessary.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20221026185954.991547-1-pierre.gondois@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
driver_allows_async_probing is only used in drivers/base/dd.c, so mark
it static and remove the declaration in drivers/base/base.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221030092255.872280-1-hch@lst.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no in-kernel user of this function, so it is not needed anymore
and can be removed.
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221109140711.105222-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no in-kernel user of this function, so it is not needed anymore
and can be removed.
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221109140711.105222-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The arch timer cannot wake up the Qualcomm Technologies, Inc. (QTI) SoCs
from the deeper CPUidle states. To be able to wakeup from these deeper
states, another always-on timer needs to be programmed through the so
called CONTROL_TCS.
As the RSC is part of CPU subsystem and the corresponding APSS RSC device
is attached to the cluster PM domain (through genpd), it holds the
responsibility to program the always-on timer, before entering any of these
deeper CPUidle states.
However, programming the timer requires information about the next hrtimer
wakeup for the cluster PM domain, which is currently only known by genpd.
Therefore, let's share this data through a new genpd helper function,
dev_pm_genpd_get_next_hrtimer().
Signed-off-by: Maulik Shah <quic_mkshah@quicinc.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
[Ulf: Reworked the code and updated the commit message]
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # SM8450
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221018152837.619426-5-ulf.hansson@linaro.org
Commit faa87ce919 ("regmap-irq: Introduce config registers for irq
types") added the num_config_regs, then commit 9edd4f5aee ("regmap-irq:
Deprecate type registers and virtual registers") suggested to replace
num_type_reg with it. However, regmap_add_irq_chip_fwnode wasn't modified
to use the new property. Later on, commit 255a03bb1b ("ASoC: wcd9335:
Convert irq chip to config regs") removed the old num_type_reg property
from the WCD9335 driver's struct regmap_irq_chip, causing a null pointer
dereference in regmap_irq_set_type when it tried to index d->type_buf as
it was never allocated in regmap_add_irq_chip_fwnode:
[ 39.199374] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 39.200006] Call trace:
[ 39.200014] regmap_irq_set_type+0x84/0x1c0
[ 39.200026] __irq_set_trigger+0x60/0x1c0
[ 39.200040] __setup_irq+0x2f4/0x78c
[ 39.200051] request_threaded_irq+0xe8/0x1a0
Use num_config_regs in regmap_add_irq_chip_fwnode instead of num_type_reg,
and fall back to it if num_config_regs isn't defined to maintain backward
compatibility.
Fixes: faa87ce919 ("regmap-irq: Introduce config registers for irq types")
Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com>
Link: https://lore.kernel.org/r/20221107202114.823975-1-y.oudjana@protonmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The callbacks in struct class namespace() and get_ownership() do not
modify the struct device passed to them, so mark the pointer as constant
and fix up all callbacks in the kernel to have the correct function
signature.
This helps make it more obvious what calls and callbacks do, and do not,
modify structures passed to them.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20221001165426.2690912-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Round up allocations with kmalloc_size_roundup() so that devres's use
of ksize() is always accurate and no special handling of the memory is
needed by KASAN, UBSAN_BOUNDS, nor FORTIFY_SOURCE.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221018090406.never.856-kees@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If class_add_groups() returns error, the 'cp->subsys' need be
unregister, and the 'cp' need be freed.
We can not call kset_unregister() here, because the 'cls' will
be freed in callback function class_release() and it's also
freed in caller's error path, it will cause double free.
So fix this by calling kobject_del() and kfree_const(name) to
cleanup kobject. Besides, call kfree() to free the 'cp'.
Fault injection test can trigger this:
unreferenced object 0xffff888102fa8190 (size 8):
comm "modprobe", pid 502, jiffies 4294906074 (age 49.296s)
hex dump (first 8 bytes):
70 6b 74 63 64 76 64 00 pktcdvd.
backtrace:
[<00000000e7c7703d>] __kmalloc_track_caller+0x1ae/0x320
[<000000005e4d70bc>] kstrdup+0x3a/0x70
[<00000000c2e5e85a>] kstrdup_const+0x68/0x80
[<000000000049a8c7>] kvasprintf_const+0x10b/0x190
[<0000000029123163>] kobject_set_name_vargs+0x56/0x150
[<00000000747219c9>] kobject_set_name+0xab/0xe0
[<0000000005f1ea4e>] __class_register+0x15c/0x49a
unreferenced object 0xffff888037274000 (size 1024):
comm "modprobe", pid 502, jiffies 4294906074 (age 49.296s)
hex dump (first 32 bytes):
00 40 27 37 80 88 ff ff 00 40 27 37 80 88 ff ff .@'7.....@'7....
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
backtrace:
[<00000000151f9600>] kmem_cache_alloc_trace+0x17c/0x2f0
[<00000000ecf3dd95>] __class_register+0x86/0x49a
Fixes: ced6473e74 ("driver core: class: add class_groups support")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221026082803.3458760-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently PageHWPoison flag does not behave well when experiencing memory
hotremove/hotplug. Any data field in struct page is unreliable when the
associated memory is offlined, and the current mechanism can't tell
whether a memory block is onlined because a new memory devices is
installed or because previous failed offline operations are undone.
Especially if there's a hwpoisoned memory, it's unclear what the best
option is.
So introduce a new mechanism to make struct memory_block remember that a
memory block has hwpoisoned memory inside it. And make any online event
fail if the onlining memory block contains hwpoison. struct memory_block
is freed and reallocated over ACPI-based hotremove/hotplug, but not over
sysfs-based hotremove/hotplug. So the new counter can distinguish these
cases.
Link: https://lkml.kernel.org/r/20221024062012.1520887-5-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
On platforms which use SHUTDOWN as hibernation mode, the genpd noirq
hooks will be called like below.
genpd_freeze_noirq() genpd_restore_noirq()
↓ ↑
Create snapshot image Restore target kernel
↓ ↑
genpd_thaw_noirq() genpd_freeze_noirq()
↓ ↑
Write snapshot image Read snapshot image
↓ ↑
power_down() Kernel boot
As of today suspend hooks genpd_suspend[resume]_noirq() manages domain
on/off state, but hibernate hooks genpd_freeze[thaw]_noirq() doesn't.
This results in a different behavior of domain power state between suspend
and hibernate freeze, i.e. domain is powered off for the former while on
for the later. It causes a problem on platforms like i.MX where the
domain needs to be powered on/off by calling clock and regulator interface.
When the platform restores from hibernation, the domain is off in hardware
and genpd_restore_noirq() tries to power it on, but will never succeed
because software state of domain (clock and regulator) is left on from the
last hibernate freeze, so kernel thinks that clock and regulator are
enabled while they are actually not turned on in hardware. The
consequence would be that devices in the power domain will access
registers without clock or power, and cause hardware lockup.
Power off[on] domain in hibernate .freeze[thaw]_noirq hook for reasons:
- Align the behavior between suspend and hibernate freeze.
- Have power state of domains stay in sync between hardware and software
for hibernate freeze, and thus fix the lockup issue seen on i.MX
platform.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Most of the logic between genpd_restore_noirq() and genpd_resume_noirq()
are identical. The suspended_count decrement for restore should be the
right thing to do anyway, considering there is an increment in
genpd_finish_suspend() for hibernation. So consolidate these two
functions into genpd_finish_resume().
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
While argument `poweroff` works fine for genpd_finish_suspend() to handle
distinction between suspend and poweroff, it won't scale if we want to
use it for freeze as well. Pass generic PM noirq hooks as arguments
instead, so that the function can possibly cover freeze case too.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The genpd status manipulation for hibernate restore has really never
worked as intended. For example, if the genpd->status was GENPD_STATE_ON,
the parent domain's `sd_count` must have been increased, so it needs to
be adjusted too. So drop this status manipulation.
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A common exploit pattern for ROP attacks is to abuse prepare_kernel_cred()
in order to construct escalated privileges[1]. Instead of providing a
short-hand argument (NULL) to the "daemon" argument to indicate using
init_cred as the base cred, require that "daemon" is always set to
an actual task. Replace all existing callers that were passing NULL
with &init_task.
Future attacks will need to have sufficiently powerful read/write
primitives to have found an appropriately privileged task and written it
to the ROP stack as an argument to succeed, which is similarly difficult
to the prior effort needed to escalate privileges before struct cred
existed: locate the current cred and overwrite the uid member.
This has the added benefit of meaning that prepare_kernel_cred() can no
longer exceed the privileges of the init task, which may have changed from
the original init_cred (e.g. dropping capabilities from the bounding set).
[1] https://google.com/search?q=commit_creds(prepare_kernel_cred(0))
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Steve French <sfrench@samba.org>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Russ Weight <russell.h.weight@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Link: https://lore.kernel.org/r/20221026232943.never.775-kees@kernel.org
- Fix the documentation of the *_match_string() family of functions to
properly cover the return value (Andy Shevchenko).
- Fix a possible integer overflow during multiplication in the ACPI
PCC code (Manank Patel).
- Make the ACPI device resources code skip IRQ override on Asus
Vivobook S5602ZA (Tamim Khan).
- Add LATT2021 to the list of device IDs that are ignored when
returned by _DEP, because there are no drivers for them in the
kernel and no plans to add such drivers (Hans de Goede).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmNb7XoSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx4AgP/1Q78/7GisCGmymtggxYqYwo7Zr0F6GO
yZ/OA4+TQOHeYLjjDH3njlmQnQLyLm8xEwPw+9ThK3p35eVMy01UoQdC5nf2mz9o
mzvR2Vh/7Ed0W58DP4KEyo5IAKGBYWD3SGSoKY0H8v0VBcQt2QLO0LFeZjulTFnb
TsIXTMG/e5ymwYN1tDil/Fiwpr2HUq1D+/jL7bxjw5Pb/IkYCuQXv/5DPyi3TH6e
d8d0CRoc8HKBOjyvAhuQ6bfVotYE/qSOz3gpVXcBwQGiTkPW6Ytk65sfXLEMO0Bz
LTKyQTkduJHPiEMNw32iAwTCrRIfsu5s+98Z/gYCHDY1oojQBMoFQhELzL2/HnLu
1Ab0v9sm8M6yPqVMC2F+wTQLAjkP4LZuTGt/xBkZ4VYlJnpr03ChyBlwut1AjHAs
Xsspal6FDupGI6VY935QBOMwVF75WEFAoR/CfKFIJhkEGxN7VrxlXlLo6ks+06Bi
85dLk0CiTxBcm3bfRaCgLt5AsQYYaDhRxEB70Hs0R4n9gkCTmuiMyZ+hdJYw5SCc
t0sPFF1Fd2pqaZtTMchV7nH9oaAeM/K+6TztXcR5b4iuhoTPEorVR+3mJwb80HRl
Y+Fhl/T7Q2kS+W5LxINsjkKyHBxkEdd0JadEfRlAESOfpwkIZIac0SHVPfv/KGt4
RrV/T6dbzkfi
=OG6y
-----END PGP SIGNATURE-----
Merge tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and device properties fixes from Rafael Wysocki:
"These fix device properties documentation and the ACPI PCC code, add a
new IRQ override quirk for resource handling and add one more item to
the list of device IDs to be ignored when returned by _DEP.
Specifics:
- Fix the documentation of the *_match_string() family of functions
to properly cover the return value (Andy Shevchenko)
- Fix a possible integer overflow during multiplication in the ACPI
PCC code (Manank Patel)
- Make the ACPI device resources code skip IRQ override on Asus
Vivobook S5602ZA (Tamim Khan)
- Add LATT2021 to the list of device IDs that are ignored when
returned by _DEP, because there are no drivers for them in the
kernel and no plans to add such drivers (Hans de Goede)"
* tag 'acpi-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: scan: Add LATT2021 to acpi_ignore_dep_ids[]
ACPI: resource: Skip IRQ override on Asus Vivobook S5602ZA
ACPI: PCC: Fix unintentional integer overflow
device property: Fix documentation for *_match_string() APIs
Platforms can provide the information about the availability of each
idle states via status flag. Platforms may have to disable one or more
idle states for various reasons like broken firmware or other unmet
dependencies.
Fix handling of such unavailable/disabled idle states by ignoring them
while parsing the states.
Fixes: a3381e3a65 ("PM / domains: Fix up domain-idle-states OF parsing")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The returned value on success is an index of the matching string,
starting from 0. Reflect this in the documentation.
Fixes: 3f5c8d3187 ("device property: Add fwnode_property_match_string()")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Constify parameter in device_dma_supported() and device_get_dma_attr()
since they don't alter anything related to it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The device parameter is not altered in the device child node APIs,
constify them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fwnode and device parameters are not altered in the fwnode
connection match APIs, constify them.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It's not fully correct to take a const parameter pointer to a struct
and return a non-const pointer to a member of that struct.
Instead, introduce a const version of the dev_fwnode() API which takes
and returns const pointers and use it where it's applicable.
With this, convert dev_fwnode() to be a macro wrapper on top of const
and non-const APIs that chooses one based on the type.
Suggested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Fixes: aade55c860 ("device property: Add const qualifier to device_get_match_data() parameter")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Link: https://lore.kernel.org/r/20221004092129.19412-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Core code:
- Provide a generic wrapper which can be utilized in drivers to handle
the problem of force threaded demultiplex interrupts on RT enabled
kernels. This avoids conditionals and horrible quirks in drivers all
over the place.
- Fix up affected pinctrl and GPIO drivers to make them cleanly RT safe.
- Interrupt drivers:
- A new driver for the FSL MU platform specific MSI implementation.
- Make irqchip_init() available for pure ACPI based systems.
- Provide a functional DT binding for the Realtek RTL interrupt chip.
- The usual DT updates and small code improvements all over the place.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmNGxRYTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWJyD/0emJAlIuD0DzkEkoAtnHSq7eyGFMpI
PFMyZ0IYXlVWuxEmQMyd7E9M+fmlRqnnhErg6x7jPW1bKzoyIn1A7eNE/cvhXPru
BiTy6g2o7pNegUh5bQrE8p0Yyq6/HsVO4YyE3RGxpUQVh/qwB+RKnzUY6RfDj87z
naQx10+15b+76SXvTQpIrvQTWhfTswk9un2MYDkjHctfVgjcnb/8dTPQuXsZrdTQ
VBWWwjLpCKcqqQS1e9MQqmQKpVqGs/DGW8XNTPk3jI4QF1fIHjhNdcoI51/lM4Ri
r912FPE8R48FS9g0dQgpMxGmHjikYpf3rXXosn8uyWkt5zNy6CXOEEg3DRIoAIdg
czKve+bgZZXUK/QcSSdPuPthBoLKQCG5MZsVFNF8IArmPCHaiYcOQBe7pel3U4cc
MpQe9yUXJI40XgwTAyAOlidjmD69384nEhzbI5d/AfJI5ssdXcBMrFN/xEeBDWdz
Dg2+Yle9HNglxBA6E3GX3yiaCQJxHFhKMnqd1zhxWjXFRzkfGF7bBpRj1j+vXnzN
ap/wMQuMlOWriWsH3UkZtFrC4PvgByGVfzlzYA076CjutyYfQolQ8k0bLHnp2VSu
VWUn4WATfaxJcqij7vyI9BYtFXdrB/yYhFasDBepQbDgiy8WEAmX+bObvXWs9XYa
UGVCNGsYx2TKMA==
=2ok5
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull interrupt updates from Thomas Gleixner:
"Core code:
- Provide a generic wrapper which can be utilized in drivers to
handle the problem of force threaded demultiplex interrupts on RT
enabled kernels. This avoids conditionals and horrible quirks in
drivers all over the place
- Fix up affected pinctrl and GPIO drivers to make them cleanly RT
safe
Interrupt drivers:
- A new driver for the FSL MU platform specific MSI implementation
- Make irqchip_init() available for pure ACPI based systems
- Provide a functional DT binding for the Realtek RTL interrupt chip
- The usual DT updates and small code improvements all over the
place"
* tag 'irq-core-2022-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
irqchip: IMX_MU_MSI should depend on ARCH_MXC
irqchip/imx-mu-msi: Fix wrong register offset for 8ulp
irqchip/ls-extirq: Fix invalid wait context by avoiding to use regmap
dt-bindings: irqchip: Describe the IMX MU block as a MSI controller
irqchip: Add IMX MU MSI controller driver
dt-bindings: irqchip: renesas,irqc: Add r8a779g0 support
irqchip/gic-v3: Fix typo in comment
dt-bindings: interrupt-controller: ti,sci-intr: Fix missing reg property in the binding
dt-bindings: irqchip: ti,sci-inta: Fix warning for missing #interrupt-cells
irqchip: Allow extra fields to be passed to IRQCHIP_PLATFORM_DRIVER_END
platform-msi: Export symbol platform_msi_create_irq_domain()
irqchip/realtek-rtl: use parent interrupts
dt-bindings: interrupt-controller: realtek,rtl-intc: require parents
irqchip/realtek-rtl: use irq_domain_add_linear()
irqchip: Make irqchip_init() usable on pure ACPI systems
bcma: gpio: Use generic_handle_irq_safe()
gpio: mlxbf2: Use generic_handle_irq_safe()
platform/x86: intel_int0002_vgpio: Use generic_handle_irq_safe()
ssb: gpio: Use generic_handle_irq_safe()
pinctrl: amd: Use generic_handle_irq_safe()
...
linux-next for a couple of months without, to my knowledge, any negative
reports (or any positive ones, come to that).
- Also the Maple Tree from Liam R. Howlett. An overlapping range-based
tree for vmas. It it apparently slight more efficient in its own right,
but is mainly targeted at enabling work to reduce mmap_lock contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
(https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
This has yet to be addressed due to Liam's unfortunately timed
vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down to
the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
=xfWx
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It it apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
- Add an error message to be printed when a power domain marked as
"always on" is not actually on during initialization (Johan Hovold).
- Extend macros used for defining power management callbacks to allow
conditional exporting of noirq and late/early suspend/resume PM
callbacks (Paul Cercueil).
- Update the turbostat utility:
* Add support for two new platforms (Zhang Rui).
* Adjust energy unit for Sapphire Rapids (Zhang Rui).
* Do not dump TRL if turbo is not supported (Artem Bityutskiy).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmNEVSISHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxKqUP/2Etq/OiSpMx7TYfLw9wpR6o+QOMSPuz
fjEByubypGBA387Z7F3siRBhy0tRiiU0h3Ti/cPtxkNVx7lLc+yEJYYvbkIvm9Fg
lhev8ky9m/JB9vq+r4GWnUWW3DntM84a3f/iNoqEMDeUUy0YhsPLfdIcEKN5y/qU
Q1BQ+4XY4PSwY0G315JT+EDxkMK/YcoJyi+1dlSO5iilwz9DJBT2iD8M+18Lz1k4
GWUryTvcANSimx4WzLFvCXGHagA5Yv7czgIMI3eBhTaln7P2G/Jn0LCrbVcHJk0t
dBgoK/FN9akbXzg6697No32soDLxf5uLzsUsYS3Xn/tdfp1b5jTX2o548qKjNBl7
x0ot2hUp713fS/4RVGeRTQcwzy1dMyzp35pMh+B61qKyVZTPcOXC6OcXRR5FmXOP
iZo84snmPRhQPT387HpRwhpv6Pm/M9EIdCpXY1e21V8fpGdwxAbo+sLN4Kh+bkbS
McxC7q5gOKFij8ucyVGUFQPMkkOuRi0kQMNVH8raobVUvsI8bAGau61LTpudyG6p
6kmcaKu4oEpl2cYpgQwdF5fT8NAp3H8cGuzkJRJqHijs0E4Kuqd9bigoPigUQTT2
1D92S/kiUcxJo4pJZ0eDaxB05LDPMeGjnm1DUz5kh8PJE060tjUMlaTG9Gh4HjlH
uQjgblji2NVc
=IsN+
-----END PGP SIGNATURE-----
Merge tag 'pm-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These update the turbostat utility, extend the macros used for
defining device power management callbacks and add a diagnostic
message to the generic power domains code.
Specifics:
- Add an error message to be printed when a power domain marked as
"always on" is not actually on during initialization (Johan
Hovold).
- Extend macros used for defining power management callbacks to allow
conditional exporting of noirq and late/early suspend/resume PM
callbacks (Paul Cercueil).
- Update the turbostat utility:
- Add support for two new platforms (Zhang Rui).
- Adjust energy unit for Sapphire Rapids (Zhang Rui).
- Do not dump TRL if turbo is not supported (Artem Bityutskiy)"
* tag 'pm-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/power turbostat: version 2022.10.04
tools/power turbostat: Use standard Energy Unit for SPR Dram RAPL domain
tools/power turbostat: Do not dump TRL if turbo is not supported
tools/power turbostat: Add support for MeteorLake platforms
tools/power turbostat: Add support for RPL-S
PM: Improve EXPORT_*_DEV_PM_OPS macros
PM: domains: log failures to register always-on domains
* Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
* The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
* The CD-ROM filesystems have been enabled in the defconfig.
* Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmNAWgwTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYicSiEACmuB9WuGZmAasKvmPgz7thyLqakg7/
cE4YK0MxgJxkhsXzYSAv1Fn+WUfX7DSzhK4OOM5wEngAYul7QoFdc84MF0DYKO+E
InjdOvVavzUsWYqETNCuMHPRK6xyzvfHCqqBDDxKHx5jUoicCQfFwJyHLw+cvouR
7WSJoFdvOEV01QyN5Qw9bQp7ASx61ZZX1yE6OAPc2/EJlDEA2QSnjBAi4M+n2ZCx
ZsQz+Dp9RfSU8/nIr13oGiL3Zm+kyXwdOS/8PaDqtrkyiGh6+vSeGqZZwRLVITP/
oUxqGEgnn2eFBD1y8vjsQNWMLWoi9Av4746Fxr8CEHX+jX1cp9CCkU2OkkLxaFcv
6XFtXPJIh/UjzVgPmjZxK+ArEX28QOM5IVyBFxsSl0dNtvyVqKpBXCV1RQ+fFHkO
ntHF3ZxibqOn8ZJmziCn0nzWSOqugNTdAhD4dJAbl58RB/IQtQT0OnHpmpXCG3xh
+/JBzy//xkr7u2HMqU69PzwPtWwZrENUV6jl5SHUDUoW8pySng2Pl4pbmTFqgWty
JTfc5EdyWOWyshhoSCtK2//bnVFryl2ntwGr3LIZrZxkiUiOeYjn+C/YedXZIRob
yy2CN+QanW/FXdIa4GMNeGc9sGDApd3/RtP+8L9mV1kWK6OE0EVskkI1UMCGXrIP
5JoE1jLMVhjcKQ==
=LJg6
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Improvements to the CPU topology subsystem, which fix some issues
where RISC-V would report bad topology information.
- The default NR_CPUS has increased to XLEN, and the maximum
configurable value is 512.
- The CD-ROM filesystems have been enabled in the defconfig.
- Support for THP_SWAP has been added for rv64 systems.
There are also a handful of cleanups and fixes throughout the tree.
* tag 'riscv-for-linus-6.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: enable THP_SWAP for RV64
RISC-V: Print SSTC in canonical order
riscv: compat: s/failed/unsupported if compat mode isn't supported
RISC-V: Increase range and default value of NR_CPUS
cpuidle: riscv-sbi: Fix CPU_PM_CPU_IDLE_ENTER_xyz() macro usage
perf: RISC-V: throttle perf events
perf: RISC-V: exclude invalid pmu counters from SBI calls
riscv: enable CD-ROM file systems in defconfig
riscv: topology: fix default topology reporting
arm64: topology: move store_cpu_topology() to shared code
am sending out early due to me travelling next week. There is a
lone mm patch for which Andrew gave an informal ack at
https://lore.kernel.org/linux-mm/20220817102500.440c6d0a3fce296fdf91bea6@linux-foundation.org.
I will send the bulk of ARM work, as well as other
architectures, at the end of next week.
ARM:
* Account stage2 page table allocations in memory stats.
x86:
* Account EPT/NPT arm64 page table allocations in memory stats.
* Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR accesses.
* Drop eVMCS controls filtering for KVM on Hyper-V, all known versions of
Hyper-V now support eVMCS fields associated with features that are
enumerated to the guest.
* Use KVM's sanitized VMCS config as the basis for the values of nested VMX
capabilities MSRs.
* A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed
a longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed
for good.
* A handful of fixes for memory leaks in error paths.
* Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow.
* Never write to memory from non-sleepable kvm_vcpu_check_block()
* Selftests refinements and cleanups.
* Misc typo cleanups.
Generic:
* remove KVM_REQ_UNHALT
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM2zwcUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNpbwf+MlVeOlzE5SBdrJ0TEnLmKUel1lSz
QnZzP5+D65oD0zhCilUZHcg6G4mzZ5SdVVOvrGJvA0eXh25ruLNMF6jbaABkMLk/
FfI1ybN7A82hwJn/aXMI/sUurWv4Jteaad20JC2DytBCnsW8jUqc49gtXHS2QWy4
3uMsFdpdTAg4zdJKgEUfXBmQviweVpjjl3ziRyZZ7yaeo1oP7XZ8LaE1nR2l5m0J
mfjzneNm5QAnueypOh5KhSwIvqf6WHIVm/rIHDJ1HIFbgfOU0dT27nhb1tmPwAcE
+cJnnMUHjZqtCXteHkAxMClyRq0zsEoKk0OGvSOOMoq3Q0DavSXUNANOig==
=/hqX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"The first batch of KVM patches, mostly covering x86.
ARM:
- Account stage2 page table allocations in memory stats
x86:
- Account EPT/NPT arm64 page table allocations in memory stats
- Tracepoint cleanups/fixes for nested VM-Enter and emulated MSR
accesses
- Drop eVMCS controls filtering for KVM on Hyper-V, all known
versions of Hyper-V now support eVMCS fields associated with
features that are enumerated to the guest
- Use KVM's sanitized VMCS config as the basis for the values of
nested VMX capabilities MSRs
- A myriad event/exception fixes and cleanups. Most notably, pending
exceptions morph into VM-Exits earlier, as soon as the exception is
queued, instead of waiting until the next vmentry. This fixed a
longstanding issue where the exceptions would incorrecly become
double-faults instead of triggering a vmexit; the common case of
page-fault vmexits had a special workaround, but now it's fixed for
good
- A handful of fixes for memory leaks in error paths
- Cleanups for VMREAD trampoline and VMX's VM-Exit assembly flow
- Never write to memory from non-sleepable kvm_vcpu_check_block()
- Selftests refinements and cleanups
- Misc typo cleanups
Generic:
- remove KVM_REQ_UNHALT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (94 commits)
KVM: remove KVM_REQ_UNHALT
KVM: mips, x86: do not rely on KVM_REQ_UNHALT
KVM: x86: never write to memory from kvm_vcpu_check_block()
KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
KVM: nVMX: Make event request on VMXOFF iff INIT/SIPI is pending
KVM: nVMX: Make an event request if INIT or SIPI is pending on VM-Enter
KVM: SVM: Make an event request if INIT or SIPI is pending when GIF is set
KVM: x86: lapic does not have to process INIT if it is blocked
KVM: x86: Rename kvm_apic_has_events() to make it INIT/SIPI specific
KVM: x86: Rename and expose helper to detect if INIT/SIPI are allowed
KVM: nVMX: Make an event request when pending an MTF nested VM-Exit
KVM: x86: make vendor code check for all nested events
mailmap: Update Oliver's email address
KVM: x86: Allow force_emulation_prefix to be written without a reload
KVM: selftests: Add an x86-only test to verify nested exception queueing
KVM: selftests: Use uapi header to get VMX and SVM exit reasons/codes
KVM: x86: Rename inject_pending_events() to kvm_check_and_inject_events()
KVM: VMX: Update MTF and ICEBP comments to document KVM's subtle behavior
KVM: x86: Treat pending TRIPLE_FAULT requests as pending exceptions
KVM: x86: Morph pending exceptions to pending VM-Exits at queue time
...
Here is the big set of driver core and debug printk changes for 6.1-rc1.
Included in here is:
- dynamic debug updates for the core and the drm subsystem. The
drm changes have all been acked by the relevant maintainers.
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they
were not being used and they really did not actually do
anything.)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY0BYUA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylozwCdFRlcghaf7XBUyNgRZRwMC+oQI8EAn1G/nEDE
6aFd2er41uK0IGQnSmYO
=OK0k
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core and debug printk changes for
6.1-rc1. Included in here is:
- dynamic debug updates for the core and the drm subsystem. The drm
changes have all been acked by the relevant maintainers
- kernfs fixes for syzbot reported problems
- kernfs refactors and updates for cgroup requirements
- magic number cleanups and removals from the kernel tree (they were
not being used and they really did not actually do anything)
- other tiny cleanups
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits)
docs: filesystems: sysfs: Make text and code for ->show() consistent
Documentation: NBD_REQUEST_MAGIC isn't a magic number
a.out: restore CMAGIC
device property: Add const qualifier to device_get_match_data() parameter
drm_print: add _ddebug descriptor to drm_*dbg prototypes
drm_print: prefer bare printk KERN_DEBUG on generic fn
drm_print: optimize drm_debug_enabled for jump-label
drm-print: add drm_dbg_driver to improve namespace symmetry
drm-print.h: include dyndbg header
drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro
drm_print: interpose drm_*dbg with forwarding macros
drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.
drm_print: condense enum drm_debug_category
debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops
driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs()
Documentation: ENI155_MAGIC isn't a magic number
Documentation: NBD_REPLY_MAGIC isn't a magic number
nbd: remove define-only NBD_MAGIC, previously magic number
Documentation: FW_HEADER_MAGIC isn't a magic number
Documentation: EEPROM_MAGIC_VALUE isn't a magic number
...
This has been a busy release for regmap with one thing and other,
there's been an especially large interest in MMIO regmaps for some
reason. The bulk of the changes are cleanups but there are several user
visible changes too:
- Support for I/O ports in regmap-mmio.
- Support for accelerated noinc operations in regmap-mmio.
- Support for tracing the register values in bulk operations.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmM6sskACgkQJNaLcl1U
h9Bj1Qf+PLNz4gQq7ki06KI6+Liz6daN5j/bfUGSizx6rns5qOZxt1A/CwDFM5ZR
6aY+tGL4ksYfEUEZHsr7qaQKptWoLmwDVX0oYZqHdBf4Wf7I6iht8WCq68KwDCzz
zQoswzoLmVuRd6aplFifEF3SOqjBrTQO3gkXBteIeA6/i1pwO9wOJdM4ZU54FX+Q
zexv7H/E9uKVonrViBMLPczaPhge4+ILNEDekSUW4AZ0RmUZ6JjW3aOZbwR6ut1x
7bS3ric9xGhW4IQdOZISY+ARhPPworcgdK5GoqBfjWV2vYc0c1iCawvF73Wm/NJC
GMGc5FIBi3a82oCMmSR1dAcci9CRLw==
=TTeQ
-----END PGP SIGNATURE-----
Merge tag 'regmap-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"This has been a busy release for regmap with one thing and other,
there's been an especially large interest in MMIO regmaps for some
reason. The bulk of the changes are cleanups but there are several
user visible changes too:
- Support for I/O ports in regmap-mmio
- Support for accelerated noinc operations in regmap-mmio
- Support for tracing the register values in bulk operations"
* tag 'regmap-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: mmio: replace return 0 with break in switch statement
regmap: spi-avmm: Use swabXX_array() helpers
regmap: mmio: Use swabXX_array() helpers
swab: Add array operations
regmap: trace: Remove unneeded blank lines
regmap: trace: Remove explicit castings
regmap: trace: Remove useless check for NULL for bulk ops
regmap: mmio: Fix rebase error
regmap: check right noinc bounds in debug print
regmap: introduce value tracing for regmap bulk operations
regmap/hexagon: Properly fix the generic IO helpers
regmap: mmio: Support accelerared noinc operations
regmap: Support accelerated noinc operations
regmap: Make use of get_unaligned_be24(), put_unaligned_be24()
regmap: mmio: Fix MMIO accessors to avoid talking to IO port
regmap: mmio: Introduce IO accessors that can talk to IO port
regmap: mmio: Get rid of broken 64-bit IO
regmap: mmio: Remove mmio_relaxed member from context
Always-on PM domains must be on during initialisation or the domain is
currently silently rejected.
Print an error message in case an always-on domain is not on to make it
easier to debug drivers getting this wrong (e.g. by setting an always-on
genpd flag without making sure that the state matches).
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- A new driver for the FSL MU widget that provides platform MSI
- An update for the Realtek RTL irqchip to use a DT binding that
actually describes the hardware
- A handful of DT updates, as well as minor code and spelling fixes
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmM5iLgPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDvUgP/1XJGuYUsx3yvhe+JQNlO4R/T7DoH4rKhVNI
KU2UPubXzE7e9D5Z91Po9QBF+Y/kyBdIL9Gh9GcK79dQzAJdPy4WxYPX3dWXKuvi
swnmOSeoTJgEXHYan7yViRS+lVk6QOrXUhpL2fiRlG4QOoX0KpMljPhr5+P0LM7Y
QmerbXAuufa1sGz51Cdsptn16hSNke8MMJ54UB0y9gHL0Rp8VuIqQCIno9t93RKM
JXCKpTFUwMbtGX0twREyN6kPCVb9cbXL8NGj1jz+MoUcesCnsaObxmIST/cxTml8
79U+5PVu+HCBE/sVco0MVKBMHw4MAnHZyQl4+8snsyH7NVlOJFQ9VH2+1GykOjJ+
4TCHZVUbHAgpkp1cB85tlANGUSdpn9mAR+nPc70eNTOYoSXgNITstjea0oIMVusA
KavzKzUCGuHeyhVeEpCdreaxfcUhpiSfYXbLIfVzrzYTsbmxLXcqHxwETxAFduW5
kp9owyj1ZpEmqQHTqHM7YgALjOq/u08/IJ55eY50XPy4/eztaOS+quMYWHc6k3TR
M6lqtx+AGBCc62lUibLf6O+tsJjguM/98zLdVhiWThSNsMhjQ8yMYPwlr6VDVCvi
FnsvYLxHi6rhJlEK+1WOMZ3Omub8l9dkrL/jVDPPCkHwkGtHB0nmGQLjpMryN01v
uhtroZZ4
=JTl5
-----END PGP SIGNATURE-----
Merge tag 'irqchip-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core
Pull irqchip updates from Marc Zyngier:
- A new driver for the FSL MU widget that provides platform MSI
- An update for the Realtek RTL irqchip to use a DT binding that
actually describes the hardware
- A handful of DT updates, as well as minor code and spelling fixes
Link: https://lore.kernel.org/r/20221002125554.3902840-1-maz@kernel.org
The memory-notify-based approach aims to handle meory-less nodes, however,
it just adds the complexity of code as pointed by David in thread [1].
The handling of memory-less nodes is introduced by commit 4faf8d950e
("hugetlb: handle memory hot-plug events"). >From its commit message, we
cannot find any necessity of handling this case. So, we can simply
register/unregister sysfs entries in register_node/unregister_node to
simlify the code.
BTW, hotplug callback added because in hugetlb_register_all_nodes() we
register sysfs nodes only for N_MEMORY nodes, seeing commit 9b5e5d0fdc,
which said it was a preparation for handling memory-less nodes via memory
hotplug. Since we want to remove memory hotplug, so make sure we only
register per-node sysfs for online (N_ONLINE) nodes in
hugetlb_register_all_nodes().
https://lore.kernel.org/linux-mm/60933ffc-b850-976c-78a0-0ee6e0ea9ef0@redhat.com/ [1]
Link: https://lkml.kernel.org/r/20220914072603.60293-3-songmuchun@bytedance.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patch series "simplify handling of per-node sysfs creation and removal",
v4.
This patch (of 2):
The following commit offload per-node sysfs creation and removal to a
kworker and did not say why it is needed. And it also said "I don't know
that this is absolutely required". It seems like the author was not sure
as well. Since it only complicates the code, this patch will revert the
changes to simplify the code.
39da08cb07 ("hugetlb: offload per node attribute registrations")
We could use memory hotplug notifier to do per-node sysfs creation and
removal instead of inserting those operations to node registration and
unregistration. Then, it can reduce the code coupling between node.c and
hugetlb.c. Also, it can simplify the code.
Link: https://lkml.kernel.org/r/20220914072603.60293-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220914072603.60293-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
- Add isupport for Tiger Lake in no-HWP mode to intel_pstate (Doug
Smythies).
- Update the AMD P-state driver (Perry Yuan):
* Fix wrong lowest perf fetch.
* Map desired perf into pstate scope for powersave governor.
* Update pstate frequency transition delay time.
* Fix initial highest_perf value.
* Clean up.
- Move max CPU capacity to sugov_policy in the schedutil cpufreq
governor (Lukasz Luba).
- Add SM6115 to cpufreq-dt blocklist (Adam Skladowski).
- Add support for Tegra239 and minor cleanups (Sumit Gupta, ye xingchen,
and Yang Yingliang).
- Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen Yan,
and Viresh Kumar).
- Minor cleanups around functions called at module_init() (Xiu Jianfeng).
- Use module_init and add module_exit for bmips driver (Zhang Jianhua).
- Add AlderLake-N support to intel_idle (Zhang Rui).
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang).
- Remove redundant check from cpuidle_switch_governor() (Yu Liao).
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang).
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang).
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki).
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello).
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang).
- Update the intel_rapl power capping driver:
* Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
* Add support for RAPTORLAKE_S (Zhang Rui).
* Fix UBSAN shift-out-of-bounds issue (Chao Qin).
- Handle -EPROBE_DEFER when regulator is not probed on
mtk-ci-devfreq.c (AngeloGioacchino Del Regno).
- Fix message typo and use dev_err_probe() in rockchip-dfi.c
(Christophe JAILLET).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OrYSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxeKAP/jFiZ1lhTGRngiVLMV6a6SSSy5xzzXZZ
b/V0oqsuUvWWo6CzVmfU4QfmKGr55+77NgI9Yh5qN6zJTEJmunuCYwVD80KdxPDJ
8SjMUNCACiVwfryLR1gFJlO+0BN4CWTxvto2gjGxzm0l1UQBACf71wm9MQCP8b7A
gcBNuOtM7o5NLywDB+/528SiF9AXfZKjkwXhJACimak5yQytaCJaqtOWtcG2KqYF
USunmqSB3IIVkAa5LJcwloc8wxHYo5mTPaWGGuSA65hfF42k3vJQ2/b8v8oTVza7
bKzhegErIYtL6B9FjB+P1FyknNOvT7BYr+4RSGLvaPySfjMn1bwz9fM1Epo59Guk
Azz3ExpaPixDh+x7b89W1Gb751FZU/zlWT+h1CNy5sOP/ChfxgCEBHw0mnWJ2Y0u
CPcI/Ch0FNQHG+PdbdGlyfvORHVh7te/t6dOhoEHXBue+1r3VkOo8tRGY9x+2IrX
/JB968u1r0oajF0btGwaDdbbWlyMRTzjrxVl3bwsuz/Kv/0JxsryND2JT0zkKAMZ
qYT29HQxhdE0Duw1chgAK6X+BsgP58Bu6LeM3mVcwnGPZE9QvcFa0GQh7z+H71AW
3yOGNmMVMqQSThBYFC6GDi7O2N1UEsLOMV9+ThTRh6D11nU4uiITM5QVIn8nWZGR
z3IZ52Jg0oeJ
=+3IL
-----END PGP SIGNATURE-----
Merge tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add support for some new hardware, extend the existing hardware
support, fix some issues and clean up code
Specifics:
- Add isupport for Tiger Lake in no-HWP mode to intel_pstate (Doug
Smythies)
- Update the AMD P-state driver (Perry Yuan):
- Fix wrong lowest perf fetch
- Map desired perf into pstate scope for powersave governor
- Update pstate frequency transition delay time
- Fix initial highest_perf value
- Clean up
- Move max CPU capacity to sugov_policy in the schedutil cpufreq
governor (Lukasz Luba)
- Add SM6115 to cpufreq-dt blocklist (Adam Skladowski)
- Add support for Tegra239 and minor cleanups (Sumit Gupta, ye
xingchen, and Yang Yingliang)
- Add freq qos for qcom cpufreq driver and minor cleanups (Xuewen
Yan, and Viresh Kumar)
- Minor cleanups around functions called at module_init() (Xiu
Jianfeng)
- Use module_init and add module_exit for bmips driver (Zhang
Jianhua)
- Add AlderLake-N support to intel_idle (Zhang Rui)
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang)
- Remove redundant check from cpuidle_switch_governor() (Yu Liao)
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang)
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang)
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki)
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello)
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang)
- Update the intel_rapl power capping driver:
- Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
- Add support for RAPTORLAKE_S (Zhang Rui).
- Fix UBSAN shift-out-of-bounds issue (Chao Qin)
- Handle -EPROBE_DEFER when regulator is not probed on
mtk-ci-devfreq.c (AngeloGioacchino Del Regno)
- Fix message typo and use dev_err_probe() in rockchip-dfi.c
(Christophe JAILLET)"
* tag 'pm-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
cpufreq: qcom-cpufreq-hw: Add cpufreq qos for LMh
cpufreq: Add __init annotation to module init funcs
cpufreq: tegra194: change tegra239_cpufreq_soc to static
PM / devfreq: rockchip-dfi: Fix an error message
PM / devfreq: mtk-cci: Handle sram regulator probe deferral
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
PM: runtime: Return -EINPROGRESS from rpm_resume() in the RPM_NOWAIT case
intel_idle: Add AlderLake-N support
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
cpufreq: tegra194: Add support for Tegra239
cpufreq: qcom-cpufreq-hw: Fix uninitialized throttled_freq warning
cpufreq: intel_pstate: Add Tigerlake support in no-HWP mode
powercap: intel_rapl: Add support for RAPTORLAKE_S
cpufreq: amd-pstate: Fix initial highest_perf value
cpuidle: Remove redundant check in cpuidle_switch_governor()
PM: wakeup: Add extra debugging statement for multiple active IRQs
cpufreq: tegra194: Remove the unneeded result variable
PM: suspend: move from strlcpy() with unused retval to strscpy()
intel_idle: move from strlcpy() with unused retval to strscpy()
cpuidle: powernv: move from strlcpy() with unused retval to strscpy()
...
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki).
- Rename ACPI device object reference counting functions (Rafael
Wysocki).
- Rearrange ACPI device object initialization code (Rafael Wysocki).
- Drop parent field from struct acpi_device (Rafael Wysocki).
- Extend the the int3472-tps68470 driver to support multiple consumers
of a single TPS68470 along with the requisite framework-level
support (Daniel Scally).
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus).
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw).
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus).
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello).
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry).
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen).
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv).
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko).
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello).
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin).
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede).
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner).
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
- Fix Tx acknowledge in the PCC address space handler (Huisong Li).
- Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
Li).
- Release resources on PCC address space setup failure path (Rafael
Mendonca).
- Remove unneeded result variables from APEI code (ye xingchen).
- Print total number of records found during BERT log parsing (Dmitry
Monakhov).
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello).
- Drop unneeded result variable from ec_write() (ye xingchen).
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver (Hanjun
Guo).
- Reorder symbols to get rid of a few forward declarations in the ACPI
fan driver (Uwe Kleine-König).
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander).
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam).
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki).
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang).
- Do not initialize ret in main() in the pfrut utility (Shi junming).
- Drop useless ACPI DSDT override documentation (Rafael Wysocki).
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare).
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into an
integer value (Andy Shevchenko).
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko).
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmM7OhkSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx/TkQALQ4TN451dPSj9jcYSNY6qZ/9b4P9Iym
TmRf3wO3+IVZQ8JajeKKRuVKNsW3sC0RcFkJJVmgZkydJBr1Uui2L0ZLzi8axGNy
RlbZm5NyBeFnlP0fA8Gb2iRMXVAUcRIx+RvZCulxxFmgQ8UhoU4wlVZWlEcko4TQ
hGp++lJYcRHR1NbVLSXZhFvzopKLdhGL6vB1Awsjb/I7TVqn23+k4jVRV1DYkIQ7
qgFM+Z7osRVZiVQbaPoOgdykeSa43qXu7Vgs7F/QeJuIiUYx59xDh0/WCJBxnuDM
cHGiaNnvuJghKmCg43X8+joaHEH/jCFyvBVGfiSzRvjz03WOPRs1XztwdEiCi+py
RcZGzrPaXmkCjNeytPRooiifyqm95HT7aMBN/aTvKBXDaGRrfPheXF+i2idl24HM
NrHqMaa0+5qoDGHLUEaf5znlCHfS+3lwq6+lGVrq/UGf6B3cP+9HwOyevEW493JX
4nuv69Y517moR9W3mBU8sAn5mUjshcka7pghRj7QnuoqRqWLbU3lIz8oUDHr84cI
ixpIPvt2KlZ5UjnN9aqu/6k70JkJvy4SrKjnx4iqu03ePmMrRc0Hcpy7+VMlgumD
tgN9aW+YDgy0/Z5QmO1MOvFodVmA5sX6+gnX1neAjuDdIo3LkJptlkO1fCx2jfQu
cgPQk1CtPOos
=xyUK
-----END PGP SIGNATURE-----
Merge tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"ACPI and PNP updates for 6.1-rc1.
These rearrange the ACPI device object initialization code (to get rid
of a redundant parent pointer from struct acpi_device among other
things), unify the _UID handling, drop support for some _OSI strings
that should not be necessary any more, add new IDs to support more
hardware and some more quirks, fix a few issues and clean up code all
over.
Specifics:
- Reimplement acpi_get_pci_dev() using the list of physical devices
associated with the given ACPI device object (Rafael Wysocki)
- Rename ACPI device object reference counting functions (Rafael
Wysocki)
- Rearrange ACPI device object initialization code (Rafael Wysocki)
- Drop parent field from struct acpi_device (Rafael Wysocki)
- Extend the the int3472-tps68470 driver to support multiple
consumers of a single TPS68470 along with the requisite
framework-level support (Daniel Scally)
- Filter out non-memory resources in is_memory(), add a helper
function to find all memory type resources of an ACPI device object
and use that function in 3 places (Heikki Krogerus)
- Add IRQ override quirks for Asus Vivobook K3402ZA/K3502ZA and ASUS
model S5402ZA (Tamim Khan, Kellen Renshaw)
- Fix acpi_dev_state_d0() kerneldoc (Sakari Ailus)
- Fix up suspend-to-idle support on ASUS Rembrandt laptops (Mario
Limonciello)
- Clean up ACPI platform devices support code (Andy Shevchenko, John
Garry)
- Clean up ACPI bus management code (Andy Shevchenko, ye xingchen)
- Add support for multiple DMA windows with different offsets to the
ACPI device enumeration code and use it on LoongArch (Jianmin Lv)
- Clean up the ACPI LPSS (Intel SoC) driver (Andy Shevchenko)
- Add a quirk for Dell Inspiron 14 2-in-1 for StorageD3Enable (Mario
Limonciello)
- Drop unused dev_fmt() and redundant 'HMAT' prefix from the HMAT
parsing code (Liu Shixin)
- Make ACPI FPDT parsing code avoid calling acpi_os_map_memory() on
invalid physical addresses (Hans de Goede)
- Silence missing-declarations warning related to Apple device
properties management (Lukas Wunner)
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton)
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan)
- Fix Tx acknowledge in the PCC address space handler (Huisong Li)
- Use wait_for_completion_timeout() for PCC mailbox operations
(Huisong Li)
- Release resources on PCC address space setup failure path (Rafael
Mendonca)
- Remove unneeded result variables from APEI code (ye xingchen)
- Print total number of records found during BERT log parsing (Dmitry
Monakhov)
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello)
- Drop unneeded result variable from ec_write() (ye xingchen)
- Remove the leftover struct acpi_ac_bl from the ACPI AC driver
(Hanjun Guo)
- Reorder symbols to get rid of a few forward declarations in the
ACPI fan driver (Uwe Kleine-König)
- Add Toshiba Satellite/Portege Z830 ACPI backlight quirk (Arvid
Norlander)
- Add ARM DMA-330 controller to the supported list in the ACPI AMBA
driver (Vijayenthiran Subramaniam)
- Drop references to non-functional 01.org/linux-acpi web site from
MAINTAINERS and Kconfig help texts (Rafael Wysocki)
- Replace strlcpy() with unused retval with strscpy() in the ACPI
support code (Wolfram Sang)
- Do not initialize ret in main() in the pfrut utility (Shi junming)
- Drop useless ACPI DSDT override documentation (Rafael Wysocki)
- Fix a few typos and wording mistakes in the ACPI device enumeration
documentation (Jean Delvare)
- Introduce acpi_dev_uid_to_integer() to convert a _UID string into
an integer value (Andy Shevchenko)
- Use acpi_dev_uid_to_integer() in several places to unify _UID
handling (Andy Shevchenko)
- Drop unused pnpid32_to_pnpid() declaration from PNP code (Gaosheng
Cui)"
* tag 'acpi-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (79 commits)
ACPI: LPSS: Deduplicate skipping device in acpi_lpss_create_device()
ACPI: LPSS: Replace loop with first entry retrieval
ACPI: x86: s2idle: Add another ID to s2idle_dmi_table
ACPI: x86: s2idle: Fix a NULL pointer dereference
MAINTAINERS: Drop records pointing to 01.org/linux-acpi
ACPI: Kconfig: Drop link to https://01.org/linux-acpi
ACPI: docs: Drop useless DSDT override documentation
ACPI: DPTF: Drop stale link from Kconfig help
ACPI: x86: s2idle: Add a quirk for ASUSTeK COMPUTER INC. ROG Flow X13
ACPI: x86: s2idle: Add a quirk for Lenovo Slim 7 Pro 14ARH7
ACPI: x86: s2idle: Add a quirk for ASUS ROG Zephyrus G14
ACPI: x86: s2idle: Add a quirk for ASUS TUF Gaming A17 FA707RE
ACPI: x86: s2idle: Add module parameter to prefer Microsoft GUID
ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt
ACPI: x86: s2idle: Move _HID handling for AMD systems into structures
platform/x86: int3472: Add board data for Surface Go2 IR camera
platform/x86: int3472: Support multiple gpio lookups in board data
platform/x86: int3472: Support multiple clock consumers
ACPI: bus: Add iterator for dependent devices
ACPI: scan: Add acpi_dev_get_next_consumer_dev()
...
Merge cpuidle changes, PM core changes and power capping changes for
6.1-rc1:
- Add AlderLake-N support to intel_idle (Zhang Rui).
- Replace strlcpy() with unused retval with strscpy() in intel_idle
(Wolfram Sang).
- Remove redundant check from cpuidle_switch_governor() (Yu Liao).
- Replace strlcpy() with unused retval with strscpy() in the powernv
cpuidle driver (Wolfram Sang).
- Drop duplicate word from a comment in the coupled cpuidle driver
(Jason Wang).
- Make rpm_resume() return -EINPROGRESS if RPM_NOWAIT is passed to it
in the flags and the device is about to resume (Rafael Wysocki).
- Add extra debugging statement for multiple active IRQs to system
wakeup handling code (Mario Limonciello).
- Replace strlcpy() with unused retval with strscpy() in the core
system suspend support code (Wolfram Sang).
- Update the intel_rapl power capping driver:
* Use standard Energy Unit for SPR Dram RAPL domain (Zhang Rui).
* Add support for RAPTORLAKE_S (Zhang Rui).
* Fix UBSAN shift-out-of-bounds issue (Chao Qin).
* pm-cpuidle:
intel_idle: Add AlderLake-N support
cpuidle: Remove redundant check in cpuidle_switch_governor()
intel_idle: move from strlcpy() with unused retval to strscpy()
cpuidle: powernv: move from strlcpy() with unused retval to strscpy()
cpuidle: coupled: Drop duplicate word from a comment
* pm-core:
PM: runtime: Return -EINPROGRESS from rpm_resume() in the RPM_NOWAIT case
* pm-sleep:
PM: wakeup: Add extra debugging statement for multiple active IRQs
PM: suspend: move from strlcpy() with unused retval to strscpy()
* powercap:
powercap: intel_rapl: Use standard Energy Unit for SPR Dram RAPL domain
powercap: intel_rapl: fix UBSAN shift-out-of-bounds issue
powercap: intel_rapl: Add support for RAPTORLAKE_S
Merge new material related to CPPC, PCC, APEI and OSI strings handling
for 6.1-rc1:
- Disable frequency invariance in the CPPC library if registers used
by cppc_get_perf_ctrs() are accessed via PCC (Jeremy Linton).
- Add ACPI disabled check to acpi_cpc_valid() (Perry Yuan).
- Fix Tx acknowledge in the PCC address space handler (Huisong Li).
- Use wait_for_completion_timeout() for PCC mailbox operations (Huisong
Li).
- Release resources on PCC address space setup failure path (Rafael
Mendonca).
- Remove unneeded result variables from APEI code (ye xingchen).
- Print total number of records found during BERT log parsing (Dmitry
Monakhov).
- Drop support for 3 _OSI strings that should not be necessary any
more and update documentation on custom _OSI strings so that adding
new ones is not encouraged any more (Mario Limonciello).
* acpi-cppc:
ACPI: CPPC: Disable FIE if registers in PCC regions
ACPI: CPPC: Add ACPI disabled check to acpi_cpc_valid()
* acpi-pcc:
ACPI: PCC: Fix Tx acknowledge in the PCC address space handler
ACPI: PCC: replace wait_for_completion()
ACPI: PCC: Release resources on address space setup failure path
* acpi-apei:
ACPI: APEI: Remove unneeded result variables
ACPI: APEI: Add BERT error log footer
* acpi-osi:
ACPI: OSI: Update Documentation on custom _OSI strings
ACPI: OSI: Remove Linux-HPI-Hybrid-Graphics _OSI string
ACPI: OSI: Remove Linux-Lenovo-NV-HDMI-Audio _OSI string
ACPI: OSI: Remove Linux-Dell-Video _OSI string
The prospective callers of rpm_resume() passing RPM_NOWAIT to it may
be confused when it returns 0 without actually resuming the device
which may happen if the device is suspending at the given time and it
will only resume when the suspend in progress has completed. To avoid
that confusion, return -EINPROGRESS from rpm_resume() in that case.
Since none of the current callers passing RPM_NOWAIT to rpm_resume()
check its return value, this change has no functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Add const qualifier to the device_get_match_data() parameter.
Some of the future users may utilize this function without
forcing the type.
All the same, dev_fwnode() may be used with a const qualifier.
Reported-by: kernel test robot <lkp@intel.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220922135410.49694-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use IS_ERR_OR_NULL() helper in device_create_groups_vargs() to simplify code
and improve readiblity. No functional change.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220914140753.3799982-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In following scenario(diagram), when one thread X running dev_coredumpm()
adds devcd device to the framework which sends uevent notification to
userspace and another thread Y reads this uevent and call to
devcd_data_write() which eventually try to delete the queued timer that
is not initialized/queued yet.
So, debug object reports some warning and in the meantime, timer is
initialized and queued from X path. and from Y path, it gets reinitialized
again and timer->entry.pprev=NULL and try_to_grab_pending() stucks.
To fix this, introduce mutex and a boolean flag to serialize the behaviour.
cpu0(X) cpu1(Y)
dev_coredump() uevent sent to user space
device_add() ======================> user space process Y reads the
uevents writes to devcd fd
which results into writes to
devcd_data_write()
mod_delayed_work()
try_to_grab_pending()
del_timer()
debug_assert_init()
INIT_DELAYED_WORK()
schedule_delayed_work()
debug_object_fixup()
timer_fixup_assert_init()
timer_setup()
do_init_timer()
/*
Above call reinitializes
the timer to
timer->entry.pprev=NULL
and this will be checked
later in timer_pending() call.
*/
timer_pending()
!hlist_unhashed_lockless(&timer->entry)
!h->pprev
/*
del_timer() checks h->pprev and finds
it to be NULL due to which
try_to_grab_pending() stucks.
*/
Link: https://lore.kernel.org/lkml/2e1f81e2-428c-f11f-ce92-eb11048cb271@quicinc.com/
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Link: https://lore.kernel.org/r/1663073424-13663-1-git-send-email-quic_mojha@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This merges the driver core changes in 6.0-rc7 into driver-core-next as
they are needed here as well for testing.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Variable min_stride is assigned a value that is never read, fix this by
replacing the return 0 with a break statement. This also makes the case
statement consistent with the other cases in the switch statement.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220922080445.818020-1-colin.i.king@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit 71066545b4.
It causes boot problems on some systems, so revert it for now until it
is worked out.
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Saravana Kannan <saravanak@google.com>
Fixes: 71066545b4 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Olof Johansson <olof@lixom.net>
Link: https://lore.kernel.org/r/CAOesGMjQHhTUMBGHQcME4JBkZCof2NEQ4gaM1GWFgH40+LN9AQ@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmMeQ2keHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGYRMH+gLNHiGirGZlm2GQ
tKaZQUy7MiXuIP0hGDonDIIIAmIVhnjm9MDG8KT4W8AvEd7ukncyYqJfwWeWQPhP
4mZcf6l3Z8Ke+qiaFpXpMPCxTyWcln1ox0EoNx2g9gdPxZntaRuuaTQVljUfTiey
aVPHxve8ip3G7jDoJnuLSxESOqWxkb8v/SshBP1E5bF5BZ+cgZRqq7FNigFqxjbk
wF29K09BVOPjdgkSvY/b0/SnL5KlSdMAv+FrPcJNGivcdIPgf/qJks5cI2HRUo7o
CpKgbcLorCVyD+d+zLonJBwIy3arbmKD8JqYnfdTSIqVOUqHXWUDfeydsH32u1Gu
lPSI2Hw=
=7LTL
-----END PGP SIGNATURE-----
Merge 6.0-rc5 into driver-core-next
We need the driver core and debugfs changes in this branch.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Directly check state of struct memory_block, no need a single function.
Link: https://lkml.kernel.org/r/20220827112043.187028-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYxuERw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylPqwCgjU6xlN2y/80HH+66k+yyzlxocE8AoLPgnGrA
dJZIGWFXExzO26tvMT52
=zGHA
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some small driver core and debugfs fixes for 6.0-rc5.
Included in here are:
- multiple attempts to get the arch_topology code to work properly on
non-cluster SMT systems. First attempt caused build breakages in
linux-next and 0-day, second try worked.
- debugfs fixes for a long-suffering memory leak. The pattern of
debugfs_remove(debugfs_lookup(...)) turns out to leak dentries, so
add debugfs_lookup_and_remove() to fix this problem. Also fix up
the scheduler debug code that highlighted this problem. Fixes for
other subsystems will be trickling in over the next few months for
this same issue once the debugfs function is merged.
All of these have been in linux-next since Wednesday with no reported
problems"
* tag 'driver-core-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
arch_topology: Make cluster topology span at least SMT CPUs
sched/debug: fix dentry leak in update_sched_domain_debugfs
debugfs: add debugfs_lookup_and_remove()
driver core: fix driver_set_override() issue with empty strings
Revert "arch_topology: Make cluster topology span at least SMT CPUs"
arch_topology: Make cluster topology span at least SMT CPUs
make_class_name has been removed since
commit 39aba963d9 ("driver core: remove CONFIG_SYSFS_DEPRECATED_V2
but keep it for block devices"), so remove it.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Link: https://lore.kernel.org/r/20220909063337.1146151-1-cuigaosheng1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A fix for how we handle controller constraints on SPI message sizes,
only impacting systems with SPI controllers with very low limits like
the AMD controller used in the Steam Deck.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmMZ3OEACgkQJNaLcl1U
h9CKOwf+O49lVDWbWr1YSR1hLvUtqV+rZsk5+kHCurTbEsa6eOuajx9hqHtEpSg9
CDHODkBk1TZmyfhS6/qWU5YUVGFpskkxebsJkxvYOCqaTi5UCDCLMRTWTP4yW5Gq
YRVgjFRmPAcnfaJCFB9/1l6NOMez0EfbWwApPPh7PNYp/KiexfzTFYzWbwLLhvmX
piY2tkG1SXrUjszFOZdXeukPU+HdQy40u1V7Akcs9ihwRMiIjeo0iqlE/R8QO5i4
vgevTPKPhsQJWGnZsqJvu9gtG6+Eoac6Kp4nqk4W/OVXpQNBhDHTI94D7X78+5hS
xUwgHjYT4PjR9HmuYMksgjYov4DmFw==
=ZO8J
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A fix for how we handle controller constraints on SPI message sizes,
only impacting systems with SPI controllers with very low limits like
the AMD controller used in the Steam Deck"
* tag 'regmap-fix-v6.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: spi: Reserve space for register address/padding
Currently cpu_clustergroup_mask() will return CPU mask if cluster span more
or the same CPUs as cpu_coregroup_mask(). This will result topology borken
on non-Cluster SMT machines when building with CONFIG_SCHED_CLUSTER=y.
Test with:
qemu-system-aarch64 -enable-kvm -machine virt \
-net none \
-cpu host \
-bios ./QEMU_EFI.fd \
-m 2G \
-smp 48,sockets=2,cores=12,threads=2 \
-kernel $Image \
-initrd $Rootfs \
-nographic
-append "rdinit=init console=ttyAMA0 sched_verbose loglevel=8"
We'll get below error:
[ 3.084568] BUG: arch topology borken
[ 3.084570] the SMT domain not a subset of the CLS domain
Since cluster is a level higher than SMT, fix this by making cluster
spans at least SMT CPUs.
Fixes: bfcc439743 ("arch_topology: Limit span of cpu_clustergroup_mask()")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220905122615.12946-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since we have a few helpers to swab elements of a given size in an array
use them instead of open coded variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831212744.56435-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Since we have a few helpers to swab elements of a given size in an array
use them instead of open coded variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831212744.56435-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is a few unneeded blank lines in some of event definitions,
remove them in order to make those definitions consistent with
the rest.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220901132336.33234-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no need to have explicit castings to the same type the
variables are of. Remove the explicit castings.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20220901132336.33234-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
If the buffer pointer is NULL we already are in troubles since
regmap bulk API expects caller to provide valid parameters,
it dereferences that without any checks before we call for
traces.
Moreover, the current code will print garbage in the case of
buffer is NULL and length is not 0.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220901132336.33234-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Python likes to send an empty string for some sysfs files, including the
driver_override field. When commit 23d99baf9d ("PCI: Use
driver_set_override() instead of open-coding") moved the PCI core to use
the driver core function instead of hand-rolling their own handler, this
showed up as a regression from some userspace tools, like DPDK.
Fix this up by actually looking at the length of the string first
instead of trusting that userspace got it correct.
Fixes: 23d99baf9d ("PCI: Use driver_set_override() instead of open-coding")
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: stable <stable@kernel.org>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220901163734.3583106-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit cb1f65c1e1 ("PM: s2idle: ACPI: Fix wakeup interrupts
handling") was introduced the kernel can now handle multiple
simultaneous interrupts during wakeup. Ths uncovered some existing
subtle firmware bugs where multiple IRQs are unintentionally active.
To help with fixing those bugs add an extra message when PM debugging
is enabled that can show the individual IRQs triggered as if a variety
are fired they'll potentially be lost as /sys/power/pm_wakeup_irq only
contains the first one that triggered the wakeup after resume is
complete but all may be needed to demonstrate the whole picture.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215770
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
[ rjw: Added empty line after if () ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 6b66ca0bac as it
breaks the build on some arches as reported by the kernel test robot.
Link: https://lore.kernel.org/r/202209030824.SouwDV5M-lkp@intel.com
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 6b66ca0bac ("arch_topology: Make cluster topology span at least SMT CPUs")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently cpu_clustergroup_mask() will return CPU mask if cluster span more
or the same CPUs as cpu_coregroup_mask(). This will result topology borken
on non-Cluster SMT machines when building with CONFIG_SCHED_CLUSTER=y.
Test with:
qemu-system-aarch64 -enable-kvm -machine virt \
-net none \
-cpu host \
-bios ./QEMU_EFI.fd \
-m 2G \
-smp 48,sockets=2,cores=12,threads=2 \
-kernel $Image \
-initrd $Rootfs \
-nographic \
-append "rdinit=init console=ttyAMA0 sched_verbose loglevel=8"
We'll get below error:
[ 3.084568] BUG: arch topology borken
[ 3.084570] the SMT domain not a subset of the CLS domain
Since cluster is a level higher than SMT, fix this by making cluster
spans at least SMT CPUs.
Fixes: bfcc439743 ("arch_topology: Limit span of cpu_clustergroup_mask()")
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20220825092007.8129-1-yangyicong@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the gfp flag used for the memory allocation already has __GFP_ZERO,
then there is no need to explicitly clear the "struct devres_node". It is
already zeroed.
This saves a few cycles when using devm_zalloc() and co.
In the case of devres_alloc() (which calls __devres_alloc_node()), the
compiler could remove the test and the memset() because it should be able
to see that the __GFP_ZERO flag is set.
So this would make the code both faster and smaller.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/d255bd871484e63cdd628e819f929e2df59afb02.1658352383.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the case of firmware-upload, an instance of struct fw_upload is
allocated in firmware_upload_register(). This data needs to be freed
in fw_dev_release(). Create a new fw_upload_free() function in
sysfs_upload.c to handle the firmware-upload specific memory frees
and incorporate the missing kfree call for the fw_upload structure.
Fixes: 97730bbb24 ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220831002518.465274-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the following code within firmware_upload_unregister(), the call to
device_unregister() could result in the dev_release function freeing the
fw_upload_priv structure before it is dereferenced for the call to
module_put(). This bug was found by the kernel test robot using
CONFIG_KASAN while running the firmware selftests.
device_unregister(&fw_sysfs->dev);
module_put(fw_upload_priv->module);
The problem is fixed by copying fw_upload_priv->module to a local variable
for use when calling device_unregister().
Fixes: 97730bbb24 ("firmware_loader: Add firmware-upload support")
Cc: stable <stable@kernel.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220829174557.437047-1-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Architectures which do not have cacheinfo such as ARM 32-bit would spit
out the following during boot:
Early cacheinfo failed, ret = -2
Treat -ENOENT specifically to silence this error since it means that the
platform does not support reporting its cache information.
Fixes: 3fcbf1c77d ("arch_topology: Fix cache attributes detection in the CPU hotplug path")
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220805230736.1562801-1-f.fainelli@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0 ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan <saravanak@google.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A dangling pointless "ret 0" was left in and some unneeded
whitespace can go too.
Fixes: 81c0386c13 ("regmap: mmio: Support accelerared noinc operations")
Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220831141303.501548-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Make acpi_cpc_valid() check if ACPI is disabled, so that its callers
don't need to check that separately. This will also cause the AMD
pstate driver to refuse to load right away when ACPI is disabled.
Also update the warning message in amd_pstate_init() to mention the
ACPI disabled case for completeness.
Signed-off-by: Perry Yuan <Perry.Yuan@amd.com>
[ rjw: Subject edits, new changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We keep track of several kernel memory stats (total kernel memory, page
tables, stack, vmalloc, etc) on multiple levels (global, per-node,
per-memcg, etc). These stats give insights to users to how much memory
is used by the kernel and for what purposes.
Currently, memory used by KVM mmu is not accounted in any of those
kernel memory stats. This patch series accounts the memory pages
used by KVM for page tables in those stats in a new
NR_SECONDARY_PAGETABLE stat. This stat can be later extended to account
for other types of secondary pages tables (e.g. iommu page tables).
KVM has a decent number of large allocations that aren't for page
tables, but for most of them, the number/size of those allocations
scales linearly with either the number of vCPUs or the amount of memory
assigned to the VM. KVM's secondary page table allocations do not scale
linearly, especially when nested virtualization is in use.
From a KVM perspective, NR_SECONDARY_PAGETABLE will scale with KVM's
per-VM pages_{4k,2m,1g} stats unless the guest is doing something
bizarre (e.g. accessing only 4kb chunks of 2mb pages so that KVM is
forced to allocate a large number of page tables even though the guest
isn't accessing that much memory). However, someone would need to either
understand how KVM works to make that connection, or know (or be told) to
go look at KVM's stats if they're running VMs to better decipher the stats.
Furthermore, having NR_PAGETABLE side-by-side with NR_SECONDARY_PAGETABLE
is informative. For example, when backing a VM with THP vs. HugeTLB,
NR_SECONDARY_PAGETABLE is roughly the same, but NR_PAGETABLE is an order
of magnitude higher with THP. So having this stat will at the very least
prove to be useful for understanding tradeoffs between VM backing types,
and likely even steer folks towards potential optimizations.
The original discussion with more details about the rationale:
https://lore.kernel.org/all/87ilqoi77b.wl-maz@kernel.org
This stat will be used by subsequent patches to count KVM mmu
memory usage.
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220823004639.2387269-2-yosryahmed@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
We were using the wrong bound in the debug prints: this
needs to be the number of elements, not the number of bytes,
since we're indexing into an element-size typed array.
Fixes: c20cc099b3 ("regmap: Support accelerated noinc operations")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220823135700.265019-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently, only one-register io operations support tracepoints with
value logging. For the regmap bulk operations developer can view
hw_start/hw_done tracepoints with starting reg number and registers
count to be reading or writing. This patch injects tracepoints with
dumping registers values in the hex format to regmap bulk reading
and writing.
Signed-off-by: Dmitry Rokosov <ddrokosov@sberdevices.ru>
Link: https://lore.kernel.org/r/20220816181451.5628-1-ddrokosov@sberdevices.ru
Signed-off-by: Mark Brown <broonie@kernel.org>
This reverts commit 9cbffc7a59.
There are a few more issues to fix that have been reported in the thread
for the original series [1]. We'll need to fix those before this will work.
So, revert it for now.
[1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/
Fixes: 9cbffc7a59 ("driver core: Delete driver_deferred_probe_check_state()")
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Douglas Anderson <dianders@chromium.org>
Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220819221616.2107893-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the max_raw_read and max_raw_write limits in regmap_spi struct
do not take into account the additional size of the transmitted register
address and padding. This may result in exceeding the maximum permitted
SPI message size, which could cause undefined behaviour, e.g. data
corruption.
Fix regmap_get_spi_bus() to properly adjust the above mentioned limits
by reserving space for the register address/padding as set in the regmap
configuration.
Fixes: f231ff38b7 ("regmap: spi: Set regmap max raw r/w from max_transfer_size")
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Link: https://lore.kernel.org/r/20220818104851.429479-1-cristian.ciocaltea@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Use the newly added callback for accelerated noinc MMIO
to provide writesb, writesw, writesl, writesq, readsb, readsw,
readsl and readsq.
A special quirk is needed to deal with big endian regmaps: there
are no accelerated operations defined for big endian, so fall
back to calling the big endian operations itereatively for this
case.
The Hexagon architecture turns out to have an incomplete
<asm/io.h>: writesb() is not implemented. Fix this by doing
what other architectures do: include <asm-generic/io.h> into
the <asm/io.h> file.
Cc: Brian Cain <bcain@quicinc.com>
Cc: linux-hexagon@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220816204832.265837-2-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Several architectures have accelerated operations for MMIO
operations writing to a single register, such as writesb, writesw,
writesl, writesq, readsb, readsw, readsl and readsq but regmap
currently cannot use them because we have no hooks for providing
an accelerated noinc back-end for MMIO.
Solve this by providing reg_[read/write]_noinc callbacks for
the bus abstraction, so that the regmap-mmio bus can use this.
Currently I do not see a need to support this for custom regmaps
so it is only added to the bus.
Callbacks are passed a void * with the array of values and a
count which is the number of items of the byte chunk size for
the specific register width.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220816204832.265837-1-linus.walleij@linaro.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge series from Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
Currently regmap MMIO doesn't support IO ports, while being inconsistent
in used IO accessors. Fix the latter and extend framework with the
former.
Since we have a proper endianness converters for BE 24-bit data use
them. While at it, format the code using switch-cases as it's done
for the rest of the endianness handlers.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220726151213.71712-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Currently regmap MMIO is inconsistent with IO accessors. I.e.
the Big Endian counterparts are using ioreadXXbe() / iowriteXXbe()
which are not clean implementations of readXXbe().
That said, reimplement current Big Endian MMIO accessors by replacing
ioread()/iowrite() with respective read()/write() and swab() calls.
Note, there are no current in-kernel users that may utilize the
functionality of the IO ports on Big Endian hardware. All drivers
that use regmap MMIO either Little Endian, or they don't map IO
ports in a way that ioreadXX()/iowriteXX() may be utilized.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-5-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Some users may use regmap MMIO for IO ports, and this can be done
by assigning ioreadXX()/iowriteXX() and their Big Endian counterparts
to the regmap context.
Add IO port support with a corresponding flag added.
While doing that, make sure that user won't select relaxed MMIO access
along with IO port because the latter have no relaxed variants.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-4-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The current implementation, besides having no active users, is broken
by design of regmap. For 64-bit IO we need to supply 64-bit value,
otherwise there is no way to handle upper 32 bits in 64-bit register.
Hence, remove the broken IO accessors for good and wait for real user
that can fix entire regmap API for that.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-3-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no need to keep mmio_relaxed member in the context, it's
onetime used during generation of the context. Remove it.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/r/20220808203401.35153-2-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
arm64's method of defining a default cpu topology requires only minimal
changes to apply to RISC-V also. The current arm64 implementation exits
early in a uniprocessor configuration by reading MPIDR & claiming that
uniprocessor can rely on the default values.
This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64:
topology: Stop using MPIDR for topology information")', because the
current code just assigns default values for multiprocessor systems.
With the MPIDR references removed, store_cpu_topolgy() can be moved to
the common arch_topology code.
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Here is the set of driver core and kernfs changes for 6.0-rc1.
"biggest" thing in here is some scalability improvements for kernfs for
large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed
and discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYuqCnw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym/JgCcCnaycJY00ZPRQm3LQCyzfJ0HgqoAn2qxGV+K
NKycLeXZSnuvIA87dycE
=/4Jk
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core / kernfs updates from Greg KH:
"Here is the set of driver core and kernfs changes for 6.0-rc1.
The "biggest" thing in here is some scalability improvements for
kernfs for large systems. Other than that, included in here are:
- arch topology and cache info changes that have been reviewed and
discussed a lot.
- potential error path cleanup fixes
- deferred driver probe cleanups
- firmware loader cleanups and tweaks
- documentation updates
- other small things
All of these have been in the linux-next tree for a while with no
reported problems"
* tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (63 commits)
docs: embargoed-hardware-issues: fix invalid AMD contact email
firmware_loader: Replace kmap() with kmap_local_page()
sysfs docs: ABI: Fix typo in comment
kobject: fix Kconfig.debug "its" grammar
kernfs: Fix typo 'the the' in comment
docs: driver-api: firmware: add driver firmware guidelines. (v3)
arch_topology: Fix cache attributes detection in the CPU hotplug path
ACPI: PPTT: Leave the table mapped for the runtime usage
cacheinfo: Use atomic allocation for percpu cache attributes
drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist
MAINTAINERS: Change mentions of mpm to olivia
docs: ABI: sysfs-devices-soc: Update Lee Jones' email address
docs: ABI: sysfs-class-pwm: Update Lee Jones' email address
Documentation/process: Add embargoed HW contact for LLVM
Revert "kernfs: Change kernfs_notify_list to llist."
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
...
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq policy
which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
fucntion parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmLoKy8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx3+oQAJNVU+W14EaRPWXQRMuwBC5zk3hb6T9q
JqmMd8coEd+9/4ABAeRAWso1B26rUzB6JyBvw3lGH9OXInpYmvnJEhEPrTpK2h0D
U9HxEARuGJolrDm0X9NAkn7tKKMC9GnvPS9z2s7s+N97VFFWC/QiU+PHB0SypGNb
JxRfbVJZQCuxmNG9UeK+xeHFQ9lM2Z9ZdTxR71G0n7nQPPR+sUvnFufFby3Aogf3
XnBYfia+YNqkUlefxxwB5a0cFwPXOUGsQkIf4d64gZnq1TgZ+71kht1GEF08PDFl
wV8v1rOWuXEae8dozuf5xszp/eVyAqzgB+IShT9APREOO3Wg6I16XdBm8R1TGwCK
JTdZqnm6HVKBNqchEwYViJILX69rrNUT+AwHBWhtKKDNh3qeTuwi/JGTeDVN++en
xf3TNKx3LV31Nq6nWJFzDGLehfZMnAPkhfYohUBI7FNyblpk4mJRVcZ0bYI7UNnS
als77uoipvb5KdFCtdhxYBHd/y867NvXKa1qsAuDxusAsfJHf4SnlMdbgOepBH2y
jJg06CGrMDU3TZ8BL+WpqUYk4irQnAMs/159Txh7A6/dOnTjE7S9NHrENCwmt2og
FrHSLH1eLX6Sa4RSibiGHPC7mNULP2/TOtryf3zFdlIVcjm3NEU3bnfzx7nlJn05
8t6ObMxgMhWT
=XeLV
-----END PGP SIGNATURE-----
Merge tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are mostly minor improvements all over including new CPU IDs for
the Intel RAPL driver, an Energy Model rework to use micro-Watt as the
power unit, cpufreq fixes and cleanus, cpuidle updates, devfreq
updates, documentation cleanups and a new version of the pm-graph
suite of utilities.
Specifics:
- Make cpufreq_show_cpus() more straightforward (Viresh Kumar).
- Drop unnecessary CPU hotplug locking from store() used by cpufreq
sysfs attributes (Viresh Kumar).
- Make the ACPI cpufreq driver support the boost control interface on
Zhaoxin/Centaur processors (Tony W Wang-oc).
- Print a warning message on attempts to free an active cpufreq
policy which should never happen (Viresh Kumar).
- Fix grammar in the Kconfig help text for the loongson2 cpufreq
driver (Randy Dunlap).
- Use cpumask_var_t for an on-stack CPU mask in the ondemand cpufreq
governor (Zhao Liu).
- Add trace points for guest_halt_poll_ns grow/shrink to the haltpoll
cpuidle driver (Eiichi Tsukata).
- Modify intel_idle to treat C1 and C1E as independent idle states on
Sapphire Rapids (Artem Bityutskiy).
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
- Add new devfreq driver for Mediatek CCI (Cache Coherent
Interconnect) (Johnson Wang).
- Convert the Samsung Exynos SoC Bus bindings to DT schema of
exynos-bus.c (Krzysztof Kozlowski).
- Address kernel-doc warnings by adding the description for unused
function parameters in devfreq core (Mauro Carvalho Chehab).
- Use NULL to pass a null pointer rather than zero according to the
function propotype in imx-bus.c (Colin Ian King).
- Print error message instead of error interger value in
tegra30-devfreq.c (Dmitry Osipenko).
- Add checks to prevent setting negative frequency QoS limits for
CPUs (Shivnandan Kumar).
- Update the pm-graph suite of utilities to the latest revision 5.9
including multiple improvements (Todd Brandt).
- Drop pme_interrupt reference from the PCI power management
documentation (Mario Limonciello)"
* tag 'pm-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
PM: QoS: Add check to make sure CPU freq is non-negative
PM: hibernate: defer device probing when resuming from hibernation
intel_idle: make SPR C1 and C1E be independent
cpufreq: ondemand: Use cpumask_var_t for on-stack cpu mask
cpufreq: loongson2: fix Kconfig "its" grammar
pm-graph v5.9
cpufreq: Warn users while freeing active policy
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
PM / devfreq: tegra30: Add error message for devm_devfreq_add_device()
PM / devfreq: imx-bus: use NULL to pass a null pointer rather than zero
PM / devfreq: shut up kernel-doc warnings
dt-bindings: interconnect: samsung,exynos-bus: convert to dtschema
PM / devfreq: mediatek: Introduce MediaTek CCI devfreq driver
dt-bindings: interconnect: Add MediaTek CCI dt-bindings
PM: domains: Ensure genpd_debugfs_dir exists before remove
PM: runtime: Extend support for wakeirq for force_suspend|resume
...
The big thing this release is a big cleanup of the interrupt code from
Aidan MacDonald, plus a few new API updates:
- Rework of the interrupt code, making it much simpler and easier to
extend.
- Support for device specific update bits operations with devices that
otherwise use bitstream interfaces.
- Support for bit operations on fields as well as whole registers.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmLnvh0ACgkQJNaLcl1U
h9DlbQf/RELySinbi6WTIthuigJTfQq7xFIrR3LsEIrLWTuKsb+mBtxPkv0sAUF4
lmTtxmixtCAp3z2xWLXTw99dxDcII49YPmR1TzKZ9vBsK0vkAof6t7BQyhFoICpy
cbGdw4Mqi3qOHHeDH3obNbYhz1IwWL47Q0eASkNyaHrrnxykIyjeJ0TTURoVWJpX
FEzGwvFtH4+5w3yc0aE+WvjzHXgPj/xAsyE835TF9jv8cW9a2/KAYPLo7gW+icaz
Qx4JrmwMBuxzJEuaRvx2dxtasDCZKPFyod1cKS2gTpF4OnNKHocDwC3cSoGDLztr
BRljUV3VfWzOB8DdeAaB5XQM8LJeiw==
=talc
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The big thing this release is a big cleanup of the interrupt code from
Aidan MacDonald, plus a few new API updates:
- Rework of the interrupt code, making it much simpler and easier to
extend
- Support for device specific update bits operations with devices
that otherwise use bitstream interfaces
- Support for bit operations on fields as well as whole registers"
* tag 'regmap-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: permit to set reg_update_bits with bulk implementation
regmap: add WARN_ONCE when invalid mask is provided to regmap_field_init()
regmap-irq: Fix bug in regmap_irq_get_irq_reg_linear()
regmap: cache: Add extra parameter check in regcache_init
regmap-irq: Deprecate the not_fixed_stride flag
regmap-irq: Add get_irq_reg() callback
regmap-irq: Fix inverted handling of unmask registers
regmap-irq: Deprecate type registers and virtual registers
regmap-irq: Introduce config registers for irq types
regmap-irq: Refactor checks for status bulk read support
regmap-irq: Remove mask_writeonly and regmap_irq_update_bits()
regmap-irq: Remove inappropriate uses of regmap_irq_update_bits()
regmap-irq: Remove an unnecessary restriction on type_in_mask
regmap-irq: Cleanup sizeof(...) use in memory allocation
regmap-irq: Remove unused type_reg_stride field
regmap-irq: Convert bool bitfields to unsigned int
regmap: Don't warn about cache only mode for devices with no cache
regmap: provide regmap_field helpers for simple bit operations
regmap: cache: Fix syntax errors in comments
Merge core device power management changes for v5.20-rc1:
- Extend support for wakeirq to callback wrappers used during system
suspend and resume (Ulf Hansson).
- Defer waiting for device probe before loading a hibernation image
till the first actual device access to avoid possible deadlocks
reported by syzbot (Tetsuo Handa).
- Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
Helgaas).
- Add Raptor Lake-P to the list of processors supported by the Intel
RAPL driver (George D Sworo).
- Add Alder Lake-N and Raptor Lake-P to the list of processors for
which Power Limit4 is supported in the Intel RAPL driver (Sumeet
Pawnikar).
- Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
attempting to remove it (Hsin-Yi Wang).
- Change the Energy Model code to represent power in micro-Watts and
adjust its users accordingly (Lukasz Luba).
* pm-core:
PM: runtime: Extend support for wakeirq for force_suspend|resume
* pm-sleep:
PM: hibernate: defer device probing when resuming from hibernation
PM: wakeup: Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP
* powercap:
powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
powercap: intel_rapl: Add support for RAPTORLAKE_P
* pm-domains:
PM: domains: Ensure genpd_debugfs_dir exists before remove
* pm-em:
cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
firmware: arm_scmi: Get detailed power scale from perf
Documentation: EM: Switch to micro-Watts scale
PM: EM: convert power field to micro-Watts precision and align drivers
The use of kmap() is being deprecated in favor of kmap_local_page().
Two main problems with kmap(): (1) It comes with an overhead as mapping
space is restricted and protected by a global lock for synchronization and
(2) kmap() also requires global TLB invalidation when the kmap’s pool
wraps and it might block when the mapping space is fully utilized until a
slot becomes available.
kmap_local_page() is preferred over kmap() and kmap_atomic(). Where it
cannot mechanically replace the latters, code refactor should be considered
(special care must be taken if kernel virtual addresses are aliases in
different contexts).
With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
Call kmap_local_page() in firmware_loader wherever kmap() is currently
used. In firmware_rw() use the helpers copy_{from,to}_page() instead of
open coding the local mappings + memcpy().
Successfully tested with "firmware" selftests on a QEMU/KVM 32-bits VM
with 4GB RAM, booting a kernel with HIGHMEM64GB enabled.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Link: https://lore.kernel.org/r/20220714235030.12732-1-fmdefrancesco@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
init_cpu_topology() is called only once at the boot and all the cache
attributes are detected early for all the possible CPUs. However when
the CPUs are hotplugged out, the cacheinfo gets removed. While the
attributes are added back when the CPUs are hotplugged back in as part
of CPU hotplug state machine, it ends up called quite late after the
update_siblings_masks() are called in the secondary_start_kernel()
resulting in wrong llc_sibling_masks.
Move the call to detect_cache_attributes() inside update_siblings_masks()
to ensure the cacheinfo is updated before the LLC sibling masks are
updated. This will fix the incorrect LLC sibling masks generated when
the CPUs are hotplugged out and hotplugged back in again.
Reported-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-3-43d696288e84@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On couple of architectures like RISC-V and ARM64, we need to detect
cache attribues quite early during the boot when the secondary CPUs
start. So we will call detect_cache_attributes in the atomic context
and since use of normal allocation can sleep, we will end up getting
"sleeping in the atomic context" bug splat.
In order avoid that, move the allocation to use atomic version in
preparation to move the actual detection of cache attributes in the
CPU hotplug path which is atomic.
Cc: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220720-arch_topo_fixes-v3-1-43d696288e84@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A regmap may still require to set a custom reg_update_bits instead of
relying to the regmap_bus_read/write general function.
Permit to set it in the map if provided by the regmap config.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20220715201032.19507-1-ansuelsmth@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Using bin_attributes with a 0 size causes fstat and friends to return that
0 size. This breaks userspace code that retrieves the size before reading
the file. Rather than reverting 75bd50fa84 ("drivers/base/node.c: use
bin_attribute to break the size limitation of cpumap ABI") let's put in a
size value at compile time.
For cpulist the maximum size is on the order of
NR_CPUS * (ceil(log10(NR_CPUS)) + 1)/2
which for 8192 is 20480 (8192 * 5)/2. In order to get near that you'd need
a system with every other CPU on one node. For example: (0,2,4,8, ... ).
To simplify the math and support larger NR_CPUS in the future we are using
(NR_CPUS * 7)/2. We also set it to a min of PAGE_SIZE to retain the older
behavior for smaller NR_CPUS.
The cpumap file the size works out to be NR_CPUS/4 + NR_CPUS/32 - 1
(or NR_CPUS * 9/32 - 1) including the ","s.
Add a set of macros for these values to cpumask.h so they can be used in
multiple places. Apply these to the handful of such files in
drivers/base/topology.c as well as node.c.
As an example, on an 80 cpu 4-node system (NR_CPUS == 8192):
before:
-r--r--r--. 1 root root 0 Jul 12 14:08 system/node/node0/cpulist
-r--r--r--. 1 root root 0 Jul 11 17:25 system/node/node0/cpumap
after:
-r--r--r--. 1 root root 28672 Jul 13 11:32 system/node/node0/cpulist
-r--r--r--. 1 root root 4096 Jul 13 11:31 system/node/node0/cpumap
CONFIG_NR_CPUS = 16384
-r--r--r--. 1 root root 57344 Jul 13 14:03 system/node/node0/cpulist
-r--r--r--. 1 root root 4607 Jul 13 14:02 system/node/node0/cpumap
The actual number of cpus doesn't matter for the reported size since they
are based on NR_CPUS.
Fixes: 75bd50fa84 ("drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI")
Fixes: bb9ec13d15 ("topology: use bin_attribute to break the size limitation of cpumap ABI")
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: stable@vger.kernel.org
Acked-by: Yury Norov <yury.norov@gmail.com> (for include/linux/cpumask.h)
Signed-off-by: Phil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20220715134924.3466194-1-pauld@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Both genpd_debug_add() and genpd_debug_remove() may be called
indirectly by other drivers while genpd_debugfs_dir is not yet
set. For example, drivers can call pm_genpd_init() in probe or
pm_genpd_init() in probe fail/cleanup path:
pm_genpd_init()
--> genpd_debug_add()
pm_genpd_remove()
--> genpd_remove()
--> genpd_debug_remove()
At this time, genpd_debug_init() may not yet be called.
genpd_debug_add() checks that if genpd_debugfs_dir is NULL, it
will return directly. Make sure this is also checked
in pm_genpd_remove(), otherwise components under debugfs root
which has the same name as other components under pm_genpd may
be accidentally removed, since NULL represents debugfs root.
Fixes: 718072ceb2 ("PM: domains: create debugfs nodes when adding power domains")
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the now
pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmLKqAgACgkQEsHwGGHe
VUoM5w/8CSvwPZ3otkhmu8MrJPtWc7eLDPjYN4qQP+19e+bt094MoozxeeWG2wmp
hkDJAYHT2Oik/qDuEdhFgNYwS7XGgbV3Py3B8syO4//5SD5dkOSG+QqFXvXMdFri
YsVqqNkjJOWk/YL9Ql5RS/xQewsrr0OqEyWWocuI6XAvfWV4kKvlRSd+6oPqtZEO
qYlAHTXElyIrA/gjmxChk1HTt5HZtK3uJLf4twNlUfzw7LYFf3+sw3bdNuiXlyMr
WcLXMwGpS0idURwP3mJa7JRuiVBzb4+kt8mWwWqA02FkKV45FRRRFhFUsy667r00
cdZBaWdy+b7dvXeliO3FN/x1bZwIEUxmaNy1iAClph4Ifh0ySPUkxAr8EIER7YBy
bstDJEaIqgYg8NIaD4oF1UrG0ZbL0ImuxVaFdhG1hopQsh4IwLSTLgmZYDhfn/0i
oSqU0Le+A7QW9s2A2j6qi7BoAbRW+gmBuCgg8f8ECYRkFX1ZF6mkUtnQxYrU7RTq
rJWGW9nhwM9nRxwgntZiTjUUJ2HtyXEgYyCNjLFCbEBfeG5QTg7XSGFhqDbgoymH
85vsmSXYxgTgQ/kTW7Fs26tOqnP2h1OtLJZDL8rg49KijLAnISClEgohYW01CWQf
ZKMHtz3DM0WBiLvSAmfGifScgSrLB5AjtvFHT0hF+5/okEkinVk=
=09fW
-----END PGP SIGNATURE-----
Merge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 retbleed fixes from Borislav Petkov:
"Just when you thought that all the speculation bugs were addressed and
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the
now pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide"
* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
x86/speculation: Disable RRSBA behavior
x86/kexec: Disable RET on kexec
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
x86/bugs: Add Cannon lake to RETBleed affected CPU list
x86/retbleed: Add fine grained Kconfig knobs
x86/cpu/amd: Enumerate BTC_NO
x86/common: Stamp out the stepping madness
KVM: VMX: Prevent RSB underflow before vmenter
x86/speculation: Fill RSB on vmexit for IBRS
KVM: VMX: Fix IBRS handling after vmexit
KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
KVM: VMX: Convert launched argument to flags
KVM: VMX: Flatten __vmx_vcpu_run()
objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
x86/speculation: Remove x86_spec_ctrl_mask
x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
x86/speculation: Fix SPEC_CTRL write on SMT state change
x86/speculation: Fix firmware entry SPEC_CTRL handling
x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
...
A driver that makes use of pm_runtime_force_suspend|resume() to support
system suspend/resume, currently needs to manage the wakeirq support
itself. To avoid the boilerplate code in the driver's system suspend/resume
callbacks in particular, let's extend pm_runtime_force_suspend|resume() to
deal with the wakeirq.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
These are updates to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node and the divergence from the
behaviour on a ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present consistent view of the CPU topology.
The current assignment of generated cluster count as the physical package
identifier for each CPU is wrong. The device tree bindings for CPU
topology supports sockets to infer the socket or physical package
identifier for a given CPU. It is now being made use of you address the
issue. These updates also assigns the cluster identifier as parsed from
the device tree cluster nodes within /cpu-map without support for
nesting of the clusters as there are no such reported/known platforms.
In order to be on par with ACPI PPTT physical package/socket support,
these updates also include support for socket nodes in /cpu-map.
The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree. The cacheinfo changes here is to enable the re-use
of the cacheinfo to detect the cache attributes for all the CPU quite
early even before the scondardaries are booted so that the information
can be used to build the schedular domains especially the last level
cache(LLC).
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmLFak4ACgkQAEG6vDF+
4pjHNxAAyzXpazkWuTjTot9UcX2TE8kIlEB4wXoGr1WJfi2uefVuvo3owvSKLdmr
pBNUf0fSLKShljueXOYcmwxVvSoN3zWSrOh4huUBWv0VPBg5yyNplZChwbhxjmiL
N5FGtSDHoTgPjjMUXnlsa/Y7/RDUhxR+s4ZdGS/vnMHbGr8Gsm5bjc6BNCq9E6cz
xnGDzUOS3+Sin+751De09HIuH5FOoCEpWOj6FGVK3MtEsizHU4ANEKTgFdsE26mG
nmVY1CU3GJmHluvG1JgL4+HmZsW02h2yU0tRSLqcseJCUou8gJ5yr0wYF6wmsHGk
nzGDfV7GzLdQg5rVnWcgzrzibqbBKJvh795e2cW3tV60VjMxlh57L7OWnHAEzQh8
QCF8HeE2CGg6VlYC+oB5JZ4pLdPE69e0+fHzhFy7hqK/B9yOr0olIKXxcm4tR/TV
5Ri7D8bX4Oviq1pQcT+GE/8Of5vX5R9LTiH1V0ld38npVvA55KDnO6WKvcadEucO
tKnHZx2dZYR/sRJ8ABz4hb7+UlLiLpCPFudx+BcXLHn+nWSqXYlu+F/2D2nsGRP+
HpmJHISJ40gD669KvupLg0/TtPdQ1oRmuf9CXUiMkzQZcySIW4wmFvCvbUfMcApl
7OzQ+FtAPb7n821mSV7KJBYJ5xOxNVR7DfUovmXCa/91xfqmE1M=
=UCxa
-----END PGP SIGNATURE-----
Merge tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into driver-core-next
Sudeep writes:
cacheinfo and arch_topology updates for v5.20
These are updates to fix some discrepancies we have in the CPU topology
parsing from the device tree /cpu-map node and the divergence from the
behaviour on a ACPI enabled platform. The expectation is that both DT
and ACPI enabled systems must present consistent view of the CPU topology.
The current assignment of generated cluster count as the physical package
identifier for each CPU is wrong. The device tree bindings for CPU
topology supports sockets to infer the socket or physical package
identifier for a given CPU. It is now being made use of you address the
issue. These updates also assigns the cluster identifier as parsed from
the device tree cluster nodes within /cpu-map without support for
nesting of the clusters as there are no such reported/known platforms.
In order to be on par with ACPI PPTT physical package/socket support,
these updates also include support for socket nodes in /cpu-map.
The only exception is that the last level cache id information can be
inferred from the same ACPI PPTT while we need to parse CPU cache nodes
in the device tree. The cacheinfo changes here is to enable the re-use
of the cacheinfo to detect the cache attributes for all the CPU quite
early even before the scondardaries are booted so that the information
can be used to build the schedular domains especially the last level
cache(LLC).
* tag 'arch-cache-topo-5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux: (21 commits)
ACPI: Remove the unused find_acpi_cpu_cache_topology()
arch_topology: Warn that topology for nested clusters is not supported
arch_topology: Add support for parsing sockets in /cpu-map
arch_topology: Set cluster identifier in each core/thread from /cpu-map
arch_topology: Limit span of cpu_clustergroup_mask()
arch_topology: Don't set cluster identifier as physical package identifier
arch_topology: Avoid parsing through all the CPUs once a outlier CPU is found
arch_topology: Check for non-negative value rather than -1 for IDs validity
arch_topology: Set thread sibling cpumask only within the cluster
arch_topology: Drop LLC identifier stash from the CPU topology
arm64: topology: Remove redundant setting of llc_id in CPU topology
arch_topology: Use the last level cache information from the cacheinfo
arch_topology: Add support to parse and detect cache attributes
cacheinfo: Align checks in cache_shared_cpu_map_{setup,remove} for readability
cacheinfo: Use cache identifiers to check if the caches are shared if available
cacheinfo: Allow early detection and population of cache attributes
cacheinfo: Add support to check if last level cache(LLC) is valid or shared
cacheinfo: Move cache_leaves_are_shared out of CONFIG_OF
cacheinfo: Add helper to access any cache index for a given CPU
cacheinfo: Use of_cpu_device_node_get instead cpu_dev->of_node
...
In regmap_field_init() when a invalid mask is provided it still
initializes with any warnings.
An example of this is when the LSB is greater than MSB a mask of zero
is produced.
WARN_ONCE() is not ideal for this but requires less changes to core regmap
code.
Cc: Mark Brown <broonie@kernel.org>
Cc: Nishanth Menon <nm@ti.com>
Signed-off-by: Matt Ranostay <mranostay@ti.com>
Link: https://lore.kernel.org/r/20220708013125.313892-1-mranostay@ti.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Previously the CONFIG_PM_SLEEP and !CONFIG_PM_SLEEP device_init_wakeup()
implementations differed in confusing ways:
- The PM_SLEEP version checked for a NULL device pointer and returned
-EINVAL, while the !PM_SLEEP version did not and would simply
dereference a NULL pointer.
- When called with "false", the !PM_SLEEP version cleared "capable" and
"enable" in the opposite order of the PM_SLEEP version. That was
harmless because for !PM_SLEEP they're simple assignments, but it's
unnecessary confusion.
Use a simplified version of the PM_SLEEP implementation for both cases.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
irq_reg_stride in struct regmap_irq_chip is often 0, but that
actually means to use the default stride of 1. The effective
stride is stored in struct regmap_irq_chip_data->irq_reg_stride
and will get the corrected default value.
The default ->get_irq_reg() callback was using the stride from
the chip definition, which is wrong; fix it to use the effective
stride from the chip data instead.
Link: https://lore.kernel.org/lkml/acaaf77f-3282-8544-dd3c-7915fc1a6a4f@samsung.com/
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220704112847.23844-1-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
We don't support the topology for clusters of CPU clusters while the
DT and ACPI bindings theoritcally support the same. Just warn about the
same so that it is clear to the users of arch_topology that the nested
clusters are not yet supported.
Link: https://lore.kernel.org/r/20220704101605.1318280-21-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Finally let us add support for socket nodes in /cpu-map in the device
tree. Since this may not be present in all the old platforms and even
most of the existing platforms, we need to assume absence of the socket
node indicates that it is a single socket system and handle appropriately.
Also it is likely that most single socket systems skip to as the node
since it is optional.
Link: https://lore.kernel.org/r/20220704101605.1318280-20-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Let us set the cluster identifier as parsed from the device tree
cluster nodes within /cpu-map.
We don't support nesting of clusters yet as there are no real hardware
to support clusters of clusters.
Link: https://lore.kernel.org/r/20220704101605.1318280-19-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently the cluster identifier is not set on DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly, the cluster_sibling mask will be
populated and returned by cpu_clustergroup_mask() to contribute in the
creation of the CLS scheduling domain level, if SCHED_CLUSTER is
enabled.
To avoid topologies that will result in questionable or incorrect
scheduling domains, impose restrictions regarding the span of clusters,
as presented to scheduling domains building code: cluster_sibling should
not span more or the same CPUs as cpu_coregroup_mask().
This is needed in order to obtain a strict separation between the MC and
CLS levels, and maintain the same domains for existing platforms in
the presence of CONFIG_SCHED_CLUSTER, where the new cluster information
is redundant and irrelevant for the scheduler.
While previously the scheduling domain builder code would have removed MC
as redundant and kept CLS if SCHED_CLUSTER was enabled and the
cpu_coregroup_mask() and cpu_clustergroup_mask() spanned the same CPUs,
now CLS will be removed and MC kept.
Link: https://lore.kernel.org/r/20220704101605.1318280-18-sudeep.holla@arm.com
Cc: Darren Hart <darren@os.amperecomputing.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently as we parse the CPU topology from /cpu-map node from the
device tree, we assign generated cluster count as the physical package
identifier for each CPU which is wrong.
The device tree bindings for CPU topology supports sockets to infer
the socket or physical package identifier for a given CPU. Since it is
fairly new and not supported on most of the old and existing systems, we
can assume all such systems have single socket/physical package.
Fix the physical package identifier to 0 by removing the assignment of
cluster identifier to the same.
Link: https://lore.kernel.org/r/20220704101605.1318280-17-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Ionela Voinescu <ionela.voinescu@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
There is no point in looping through all the CPU's physical package
identifier to check if it is valid or not once a CPU which is outside
the topology(i.e. outlier CPU) is found.
Let us just break out of the loop early in such case.
Link: https://lore.kernel.org/r/20220704101605.1318280-16-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Instead of just comparing the cpu topology IDs with -1 to check their
validity, improve that by checking for a valid non-negative value.
Link: https://lore.kernel.org/r/20220704101605.1318280-15-sudeep.holla@arm.com
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently the cluster identifier is not set on the DT based platforms.
The reset or default value is -1 for all the CPUs. Once we assign the
cluster identifier values correctly that may result in getting the thread
siblings wrong as the core identifiers can be same for 2 different CPUs
belonging to 2 different cluster.
So, in order to get the thread sibling cpumasks correct, we need to
update them only if the cores they belong are in the same cluster within
the socket. Let us skip updation of the thread sibling cpumaks if the
cluster identifier doesn't match.
This change won't affect even if the cluster identifiers are not set
currently but will avoid any breakage once we set the same correctly.
Link: https://lore.kernel.org/r/20220704101605.1318280-14-sudeep.holla@arm.com
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Since the cacheinfo LLC information is used directly in arch_topology,
there is no need to parse and store the LLC ID information only for
ACPI systems in the CPU topology.
Remove the redundant LLC ID from the generic CPU arch_topology
information.
Link: https://lore.kernel.org/r/20220704101605.1318280-13-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cacheinfo is now initialised early along with the CPU topology
initialisation. Instead of relying on the LLC ID information parsed
separately only with ACPI PPTT elsewhere, migrate to use the similar
information from the cacheinfo.
This is generic for both DT and ACPI systems. The ACPI LLC ID information
parsed separately can now be removed from arch specific code.
Link: https://lore.kernel.org/r/20220704101605.1318280-11-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Currently ACPI populates just the minimum information about the last
level cache from PPTT in order to feed the same to build sched_domains.
Similar support for DT platforms is not present.
In order to enable the same, the entire cache hierarchy information can
be built as part of CPU topoplogy parsing both on ACPI and DT platforms.
Note that this change builds the cacheinfo early even on ACPI systems,
but the current mechanism of building llc_sibling mask remains unchanged.
Link: https://lore.kernel.org/r/20220704101605.1318280-10-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The checks to skip the CPU itself or no cacheinfo case are implemented
bit differently though the effect is exactly same. Just align the
implementation in both cache_shared_cpu_map_{setup,remove} just for
improved readability. No functional change.
Link: https://lore.kernel.org/r/20220704101605.1318280-9-sudeep.holla@arm.com
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cache identifiers is an optional property on most of the platforms.
The presence of one must be indicated by the CACHE_ID valid bit in the
attributes.
We can use the cache identifiers provided by the firmware to check if
any two cpus share the same cache instead of relying on the fw_token
generated and set in the OS.
Link: https://lore.kernel.org/r/20220704101605.1318280-8-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Some architecture/platforms may need to setup cache properties very
early in the boot along with other cpu topologies so that all these
information can be used to build sched_domains which is used by the
scheduler.
Allow detect_cache_attributes to be called quite early during the boot.
Link: https://lore.kernel.org/r/20220704101605.1318280-7-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
It is useful to have helper to check if the given two CPUs share last
level cache. We can do that check by comparing fw_token or by comparing
the cache ID. Currently we check just for fw_token as the cache ID is
optional.
This helper can be used to build the llc_sibling during arch specific
topology parsing and feeding information to the sched_domains. This also
helps to get rid of llc_id in the CPU topology as it is sort of duplicate
information.
Also add helper to check if the llc information in cacheinfo is valid
or not.
Link: https://lore.kernel.org/r/20220704101605.1318280-6-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
cache_leaves_are_shared is already used even with ACPI and PPTT. It
checks if the cache leaves are the shared based on fw_token pointer.
However it is defined conditionally only if CONFIG_OF is enabled which
is wrong.
Move the function cache_leaves_are_shared out of CONFIG_OF and keep it
generic. It also handles the case where both OF and ACPI is not defined.
Link: https://lore.kernel.org/r/20220704101605.1318280-5-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The cacheinfo for a given CPU at a given index is used at quite a few
places by fetching the base point for index 0 using the helper
per_cpu_cacheinfo(cpu) and offsetting it by the required index.
Instead, add another helper to fetch the required pointer directly and
use it to simplify and improve readability.
Link: https://lore.kernel.org/r/20220704101605.1318280-4-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
The of_cpu_device_node_get takes care of fetching the CPU'd device node
either from cached cpu_dev->of_node if cpu_dev is initialised or uses
of_get_cpu_node to parse and fetch node if cpu_dev isn't available yet.
Just use of_cpu_device_node_get instead of getting the cpu device first
and then using cpu_dev->of_node for two reasons:
1. There is no other use of cpu_dev and can be simplified
2. It enabled the use detect_cache_attributes and hence cache_setup_of_node
much earlier before the CPUs are registered as devices.
Link: https://lore.kernel.org/r/20220704101605.1318280-3-sudeep.holla@arm.com
Tested-by: Ionela Voinescu <ionela.voinescu@arm.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Because pm_runtime_get_suppliers() bumps up the rpm_active counter
of each device link to a supplier of the given device in addition
to bumping up the supplier's PM-runtime usage counter, a runtime
suspend of the consumer device may case the latter to go down to 0
when pm_runtime_put_suppliers() is running on a remote CPU. If that
happens after pm_runtime_put_suppliers() has released power.lock for
the consumer device, and a runtime resume of that device takes place
immediately after it, before pm_runtime_put() is called for the
supplier, that pm_runtime_put() call may cause the supplier to be
suspended even though the consumer is active.
To prevent that from happening, modify pm_runtime_get_suppliers() to
call pm_runtime_get_sync() for the given device's suppliers without
touching the rpm_active counters of the involved device links
Accordingly, modify pm_runtime_put_suppliers() to call pm_runtime_put()
for the given device's suppliers without looking at the rpm_active
counters of the device links at hand. [This is analogous to what
happened before commit 4c06c4e6cf ("driver core: Fix possible
supplier PM-usage counter imbalance").]
Since pm_runtime_get_suppliers() sets supplier_preactivated for each
device link where the supplier's PM-runtime usage counter has been
incremented and pm_runtime_put_suppliers() calls pm_runtime_put() for
the suppliers whose device links have supplier_preactivated set, the
PM-runtime usage counter is balanced for each supplier and this is
independent of the runtime suspend and resume of the consumer device.
However, in case a device link with DL_FLAG_PM_RUNTIME set is dropped
during the consumer device probe, so pm_runtime_get_suppliers() bumps
up the supplier's PM-runtime usage counter, but it cannot be dropped by
pm_runtime_put_suppliers(), make device_link_release_fn() take care of
that.
Fixes: 4c06c4e6cf ("driver core: Fix possible supplier PM-usage counter imbalance")
Reported-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Instead of passing an extra bool argument to pm_runtime_release_supplier(),
make its callers take care of triggering a runtime-suspend of the
supplier device as needed.
No expected functional impact.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Merge series from Aidan MacDonald <aidanmacdonald.0x0@gmail.com>:
This series is an attempt at cleaning up the regmap-irq API in order
to simplify things and consolidate existing features, while at the
same time generalizing it to support a wider range of hardware.
There is a new system for IRQ type configuration, some tweaks to
unmask registers so they're more intuitive and useful, and a new
callback for calculating register addresses. There's also a few
minor code cleanups in here.
In v2 I've taken the approach of adding new features and deprecating
existing ones rather than removing them aggressively. Warnings will
be issued for any drivers that use deprecated features, but they'll
otherwise continue to function normally.
One important caveat: not all of these changes are tested beyond
compile testing, since I don't have hardware to exercise all of
the features.
When num_reg_defaults > 0 but reg_defaults is NULL, there will be a
NULL pointer exception.
Current code has no such usage, but as additional hardening, also
check this to prevent any chance of crashing.
Signed-off-by: Schspa Shi <schspa@gmail.com>
Link: https://lore.kernel.org/r/20220629130951.63040-1-schspa@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This flag is a bit of a hack and the same thing can be accomplished
using a custom ->get_irq_reg() callback. Add a warning to catch any
use of the flag.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-13-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Replace the internal sub_irq_reg() function with a public callback
that drivers can use when they have more complex register layouts.
The default implementation is regmap_irq_get_irq_reg_linear(), used
if the chip doesn't provide its own callback.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-12-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
To me "unmask" suggests that we write 1s to the register when
an interrupt is enabled. This also makes sense because it's the
opposite of what the "mask" register does (write 1s to disable
an interrupt).
But regmap-irq does the opposite: for a disabled interrupt, it
writes 1s to "unmask" and 0s to "mask". This is surprising and
deviates from the usual way mask registers are handled.
Additionally, mask_invert didn't interact with unmask registers
properly -- it caused them to be ignored entirely.
Fix this by making mask and unmask registers orthogonal, using
the following behavior:
* Mask registers are written with 1s for disabled interrupts.
* Unmask registers are written with 1s for enabled interrupts.
This behavior supports both normal or inverted mask registers
and separate set/clear registers via different combinations of
mask_base/unmask_base.
The old unmask register behavior is deprecated. Drivers need to
opt-in to the new behavior by setting mask_unmask_non_inverted.
Warnings are issued if the driver relies on deprecated behavior.
Chips that only set one of mask_base/unmask_base don't have to
use the mask_unmask_non_inverted flag because that use case was
previously not supported.
The mask_invert flag is also deprecated in favor of describing
inverted mask registers as unmask registers.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-11-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Config registers can be used to replace both type and virtual
registers, so mark both features are deprecated and issue a
warning if they're used.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-10-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Config registers provide a more uniform approach to handling irq type
registers. They are essentially an extension of the virtual registers
used by the qcom-pm8008 driver.
Config registers can be represented as a 2D array:
config_base[0] reg0,0 reg0,1 reg0,2 reg0,3
config_base[1] reg1,0 reg1,1 reg1,2 reg1,3
config_base[2] reg2,0 reg2,1 reg2,2 reg2,3
There are 'num_config_bases' base registers, each of which is used to
address 'num_config_regs' registers. The addresses are calculated in
the same way as for other bases. It is assumed that an irq's type is
controlled by one column of registers; that column is identified by
the irq's 'type_reg_offset'.
The set_type_config() callback is responsible for updating the config
register contents. It receives an array of buffers (each represents a
row of registers) and the index of the column to update, along with
the 'struct regmap_irq' description and requested irq type.
Buffered values are written to registers in regmap_irq_sync_unlock().
Note that the entire register contents are overwritten, which is a
minor change in behavior from type registers via 'type_base'.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-9-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There are several conditions that must be satisfied to support
bulk read of status registers. Move the check into a function
to avoid duplicating it in two places.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-8-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Commit a71411dbf6 ("regmap: irq: add chip option mask_writeonly")
introduced the mask_writeonly option, but it isn't used now and it
appears it's never been used by any in-tree drivers. The motivation
for the option is mentioned in the commit message,
Some irq controllers have writeonly/multipurpose register
layouts. In those cases we read invalid data back. [...]
The option causes mask register updates to use regmap_write_bits()
instead of regmap_update_bits().
However, regmap_write_bits() doesn't solve the reading invalid data
problem. It's still a read-modify-write op like regmap_update_bits().
The difference is that 'update bits' will only write the new value
if it is different from the current value, while 'write bits' will
write the new value unconditionally, even if it's the same as the
current value.
This seems like a bit of a specialized use case and probably isn't
that useful for regmap-irq, so let's just remove the option and go
back to using an 'update bits' op for the mask registers. We can
always add the option back if some driver ends up needing it in the
future.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-7-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
regmap_irq_update_bits() is misnamed and should only be used for
updating mask registers, since it checks the mask_writeonly flag.
However, it was also used for updating wake and type registers.
It's safe to replace these uses with regmap_update_bits() because
there are no users of the mask_writeonly flag.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-6-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Check types_supported instead of checking type_rising/falling_val
when using type_in_mask interrupts. This makes the intent clearer
and allows a type_in_mask irq to support level or edge triggers,
rather than only edge triggers.
Update the documentation and comments to reflect the new behavior.
This shouldn't affect existing drivers, because if they didn't
set types_supported properly the type buffer wouldn't be updated.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-5-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Instead of mentioning unsigned int directly, use a sizeof(...)
involving the buffer we're allocating to ensure the types don't
get out of sync.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-4-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
It appears that no chip ever required a nonzero type_reg_stride
and commit 1066cfbdfa ("regmap-irq: Extend sub-irq to support
non-fixed reg strides") broke support. Just remove the field.
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220623211420.918875-3-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When firmware sets the FWNODE_FLAG_BEST_EFFORT flag for a fwnode,
fw_devlink will do a best effort ordering for that device where it'll
only enforce the probe/suspend/resume ordering of that device with
suppliers that have drivers. The driver of that device can then decide
if it wants to defer probe or probe without the suppliers.
This will be useful for avoid probe delays of the console device that
were caused by commit 71066545b4 ("driver core: Set
fw_devlink.strict=1 by default").
Fixes: 71066545b4 ("driver core: Set fw_devlink.strict=1 by default")
Reported-by: Sascha Hauer <sha@pengutronix.de>
Reported-by: Peng Fan <peng.fan@nxp.com>
Tested-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220623080344.783549-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
stack like commit b232b02bf3 ("driver core: fix deadlock in
__device_attach").
list below:
In __driver_attach function, The lock holding logic is as follows:
...
__driver_attach
if (driver_allows_async_probing(drv))
device_lock(dev) // get lock dev
async_schedule_dev(__driver_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__driver_attach_async_helper will get lock dev as
will, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As above show, when it is allowed to do async probes, because of
out of memory or work limit, async work is not be allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__driver_attach_async_helper getting lock dev.
Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&entry_count) > MAX_WORK)) untenable, like
below:
[ 370.785650] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[ 370.787154] task:swapper/0 state:D stack: 0 pid: 1 ppid:
0 flags:0x00004000
[ 370.788865] Call Trace:
[ 370.789374] <TASK>
[ 370.789841] __schedule+0x482/0x1050
[ 370.790613] schedule+0x92/0x1a0
[ 370.791290] schedule_preempt_disabled+0x2c/0x50
[ 370.792256] __mutex_lock.isra.0+0x757/0xec0
[ 370.793158] __mutex_lock_slowpath+0x1f/0x30
[ 370.794079] mutex_lock+0x50/0x60
[ 370.794795] __device_driver_lock+0x2f/0x70
[ 370.795677] ? driver_probe_device+0xd0/0xd0
[ 370.796576] __driver_attach_async_helper+0x1d/0xd0
[ 370.797318] ? driver_probe_device+0xd0/0xd0
[ 370.797957] async_schedule_node_domain+0xa5/0xc0
[ 370.798652] async_schedule_node+0x19/0x30
[ 370.799243] __driver_attach+0x246/0x290
[ 370.799828] ? driver_allows_async_probing+0xa0/0xa0
[ 370.800548] bus_for_each_dev+0x9d/0x130
[ 370.801132] driver_attach+0x22/0x30
[ 370.801666] bus_add_driver+0x290/0x340
[ 370.802246] driver_register+0x88/0x140
[ 370.802817] ? virtio_scsi_init+0x116/0x116
[ 370.803425] scsi_register_driver+0x1a/0x30
[ 370.804057] init_sd+0x184/0x226
[ 370.804533] do_one_initcall+0x71/0x3a0
[ 370.805107] kernel_init_freeable+0x39a/0x43a
[ 370.805759] ? rest_init+0x150/0x150
[ 370.806283] kernel_init+0x26/0x230
[ 370.806799] ret_from_fork+0x1f/0x30
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
Fixes: ef0ff68351 ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the devtmpfs fails to mount, a dangling pointer still remains in
global. Specifically, the err variable is passed by a pointer to the
devtmpfsd. When the devtmpfsd exits, it sets the error and completes the
setup_done. In this situation, the thread pointer is not set to null.
After the devtmpfsd exited, the devtmpfs can wakes up the destroyed
devtmpfsd thread by wake_up_process if a device change event comes.
Signed-off-by: Yangxi Xiang <xyangxi5@gmail.com>
Link: https://lore.kernel.org/r/20220627120409.11174-1-xyangxi5@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit 77515ebaf0 as it
causes build problems in linux-next. It needs to be reintroduced in a
way that can allow the api to evolve and not require a "flag day" to
catch all users.
Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au
Cc: Duoming Zhou <duoming@zju.edu.cn>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For devices with no cache it can make sense to use cache only mode as a
mechanism for trapping writes to hardware which is inaccessible but since
no cache is equivalent to cache bypass we force such devices into bypass
mode. This means that our check that bypass and cache only mode aren't both
enabled simultanously is less sensible for devices without a cache so relax
it.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220622171723.1235749-1-broonie@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Fixes for post-5.18 changes:
- fix for a damon boot hang, from SeongJae
- fix for a kfence warning splat, from Jason Donenfeld
- fix for zero-pfn pinning, from Alex Williamson
- fix for fallocate hole punch clearing, from Mike Kravetz
Fixes pre-5.18 material:
- fix for a performance regression, from Marcelo
- fix for a hwpoisining BUG from zhenwei pi
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYri4RgAKCRDdBJ7gKXxA
jmhsAQDCvGqtIUhgkTwid8KBRNbowsg0LXd6k+gUjcxBhH403wEA0r0cxxkDAmgr
QNXn/qZRzQP2ji+pdjH9NBOsd2g2XQA=
=UGJ7
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"Minor things, mainly - mailmap updates, MAINTAINERS updates, etc.
Fixes for this merge window:
- fix for a damon boot hang, from SeongJae
- fix for a kfence warning splat, from Jason Donenfeld
- fix for zero-pfn pinning, from Alex Williamson
- fix for fallocate hole punch clearing, from Mike Kravetz
Fixes for previous releases:
- fix for a performance regression, from Marcelo
- fix for a hwpoisining BUG from zhenwei pi"
* tag 'mm-hotfixes-stable-2022-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mailmap: add entry for Christian Marangi
mm/memory-failure: disable unpoison once hw error happens
hugetlbfs: zero partial pages during fallocate hole punch
mm: memcontrol: reference to tools/cgroup/memcg_slabinfo.py
mm: re-allow pinning of zero pfns
mm/kfence: select random number before taking raw lock
MAINTAINERS: add maillist information for LoongArch
MAINTAINERS: update MM tree references
MAINTAINERS: update Abel Vesa's email
MAINTAINERS: add MEMORY HOT(UN)PLUG section and add David as reviewer
MAINTAINERS: add Miaohe Lin as a memory-failure reviewer
mailmap: add alias for jarkko@profian.com
mm/damon/reclaim: schedule 'damon_reclaim_timer' only after 'system_wq' is initialized
kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal
mm: lru_cache_disable: use synchronize_rcu_expedited
mm/page_isolation.c: fix one kernel-doc comment
Two sets of fixes - one for things that were missed with the support for
custom bulk I/O operations introduced in the last merge window, and
another for some long standing issues with regmap-irq which affect a
fairly small subset of devices.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmK16ucACgkQJNaLcl1U
h9CXmAf/Rpq3pky1EEgc/BAx+Vi6K1PD/5VvUX1Wb0xHYK52yk+XBuyqkpzhTptm
BhY5TIZWriai6UdFnuZtxRYiDy65IBZo1SGtNMlwcnjBWzD1weIWukGmkPzRGr9y
Vdr6PK1SOsP+qVrxrnNsbdEIICev2KDzmVxXa81zsKeyj9Dx4W4ETbpbGDvwyaHk
y+cBucVrb3RSoruoJUL3iV7XeobroLNO1ZcD9Wim1Kx2QFtdgbOEF5oBmqWmfVlu
Au59ug97u3AGVJpeAFwvM6EFlG7ft2wn4gsP92QCJ58zUPLdxHaS/+mWAfUButCk
km/yP4Su9KyRFFecMLwIYa4+UvH9kA==
=vT5N
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"Two sets of fixes - one for things that were missed with the support
for custom bulk I/O operations introduced in the last merge window,
and another for some long standing issues with regmap-irq which affect
a fairly small subset of devices"
* tag 'regmap-fix-v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Fix offset/index mismatch in read_sub_irq_data()
regmap-irq: Fix a bug in regmap_irq_enable() for type_in_mask chips
regmap: Wire up regmap_config provided bulk write in missed functions
regmap: Make regmap_noinc_read() return -ENOTSUPP if map->read isn't set
regmap: Re-introduce bulk read support check in regmap_bulk_read()
We need to divide the sub-irq status register offset by register
stride to get an index for the status buffer to avoid an out of
bounds write when the register stride is greater than 1.
Fixes: a2d21848d9 ("regmap: regmap-irq: Add main status register support")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-3-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
When enabling a type_in_mask irq, the type_buf contents must be
AND'd with the mask of the IRQ we're enabling to avoid enabling
other IRQs by accident, which can happen if several type_in_mask
irqs share a mask register.
Fixes: bc998a7303 ("regmap: irq: handle HW using separate rising/falling edge interrupts")
Signed-off-by: Aidan MacDonald <aidanmacdonald.0x0@gmail.com>
Link: https://lore.kernel.org/r/20220620200644.1961936-2-aidanmacdonald.0x0@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
The dev_coredumpv() and dev_coredumpm() could not be used in atomic
context, because they call kvasprintf_const() and kstrdup() with
GFP_KERNEL parameter. The process is shown below:
dev_coredumpv(.., gfp_t gfp)
dev_coredumpm(.., gfp_t gfp)
dev_set_name
kobject_set_name_vargs
kvasprintf_const(GFP_KERNEL, ...); //may sleep
kstrdup(s, GFP_KERNEL); //may sleep
This patch removes gfp_t parameter of dev_coredumpv() and dev_coredumpm()
and changes the gfp_t parameter of kzalloc() in dev_coredumpm() to
GFP_KERNEL in order to show they could not be used in atomic context.
Fixes: 833c95456a ("device coredump: add new device coredump class")
Reviewed-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/df72af3b1862bac7d8e793d1f3931857d3779dfd.1654569290.git.duoming@zju.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are some functions that were missed by commit d77e745613 ("regmap:
Add bulk read/write callbacks into regmap_config") when support to define
bulk read/write callbacks in regmap_config was introduced.
The regmap_bulk_write() and regmap_noinc_write() functions weren't changed
to use the added map->write instead of the map->bus->write handler.
Also, the regmap_can_raw_write() was not modified to take map->write into
account. So will only return true if a bus with a .write callback is set.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-4-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Before adding support to define bulk read/write callbacks in regmap_config
by the commit d77e745613 ("regmap: Add bulk read/write callbacks into
regmap_config"), the regmap_noinc_read() function returned an errno early
a map->bus->read callback wasn't set.
But that commit dropped the check and now a call to _regmap_raw_read() is
attempted even when bulk read operations are not supported. That function
checks for map->read anyways but there's no point to continue if the read
can't succeed.
Also is a fragile assumption to make so is better to make it fail earlier.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-3-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Support for drivers to define bulk read/write callbacks in regmap_config
was introduced by the commit d77e745613 ("regmap: Add bulk read/write
callbacks into regmap_config"), but this commit wrongly dropped a check
in regmap_bulk_read() to determine whether bulk reads can be done or not.
Before that commit, it was checked if map->bus was set. Now has to check
if a map->read callback has been set.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://lore.kernel.org/r/20220616073435.1988219-2-javierm@redhat.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Pull writeback and ext2 fixes from Jan Kara:
"A fix for writeback bug which prevented machines with kdevtmpfs from
booting and also one small ext2 bugfix in IO error handling"
* tag 'fs_for_v5.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
init: Initialize noop_backing_dev_info early
ext2: fix fs corruption when trying to remove a non-empty directory with IO error
noop_backing_dev_info is used by superblocks of various
pseudofilesystems such as kdevtmpfs. After commit 10e1407310
("writeback: Fix inode->i_io_list not be protected by inode->i_lock
error") this broke because __mark_inode_dirty() started to access more
fields from noop_backing_dev_info and this led to crashes inside
locked_inode_to_wb_and_lock_list() called from __mark_inode_dirty().
Fix the problem by initializing noop_backing_dev_info before the
filesystems get mounted.
Fixes: 10e1407310 ("writeback: Fix inode->i_io_list not be protected by inode->i_lock error")
Reported-and-tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Reported-and-tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Add simple bit operations for setting, clearing and testing specific
bits with regmap_field.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKpskgACgkQJNaLcl1U
h9BF7Af7Bg1qhqvTJQKvfc1TU7pRnO2Mt90MJTTAFMgiMP/+FZhiaDrKm12SBTPh
w8Gomg8X+EYVpiqtywFoc0mZIUe03g9+8RpuKyK2EOReaguhU2qHQPDo2Dhx6x5p
VnNY9w/rhQsrX8153c6ICZ8YqrxNqCbgCh7eulMeXz74wbxwbrNdZbbCaM9+WCS/
fTWtNUwZ+3JtaSP55UCX/ITNrLct0gPdWqkRpLExv9HdOf1HaBRUTZH69ftQg82S
PHsxEBpPWKbViflSU7V5/UgeBCTS7FjRpwB6OEX69V+VGhRIey9IgZqbSsX9KoXX
RYPKj5DZD9JtPNA0W5Dzs0KyUIlDtA==
=vniQ
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKqENwACgkQJNaLcl1U
h9D1pQf/cbMWcZeRe6eRxvDueIhyMYXwhahxEHMCL6EOR5SLLARB40klJq5DDPlM
D7SZlPES1oXXQ8zm8rpBahfQf3AAeiUMZ4+bOMMU45xaNTKeOGPuihiP69PQxzp1
RryrB6/AN+qKepHaWsnhOyarVnJ2XUDJ58QBmYwee5UFdxiPPIzjMLC0FVEgLhIc
Ng75dFIur3SknYyoYyd9gsH/VydlaEiYQ9s3GRKtq+D0P/fbsylPp1w5Lup3tXWh
kGYyJroz9doevY5K+TeMsVqSJhES3VJkz9GVxkdwCB6qhrXzGkbdnU7ZF1MZGlnf
bOC6+lE+CxoDWx/apLmdJg6ZmnHxdQ==
=CWeE
-----END PGP SIGNATURE-----
Merge tag 'regmap-field-bit-helpers' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into regmap-5.20
regmap: Add regmap_field helpers for simple bit operations
Add simple bit operations for setting, clearing and testing specific
bits with regmap_field.
We have set/clear/test operations for regmap, but not for regmap_field yet.
So let's introduce regmap_field helpers too.
In many instances regmap_field_update_bits() is used for simple bit setting
and clearing. In these cases the last argument is redundant and we can
hide it with a static inline function.
This adds three new helpers for simple bit operations: set_bits,
clear_bits and test_bits (the last one defined as a regular function).
Signed-off-by: Li Chen <lchen@ambarella.com>
Link: https://lore.kernel.org/r/180eef422c3.deae9cd960729.8518395646822099769@zohomail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Stale Data.
They are a class of MMIO-related weaknesses which can expose stale data
by propagating it into core fill buffers. Data which can then be leaked
using the usual speculative execution methods.
Mitigations include this set along with microcode updates and are
similar to MDS and TAA vulnerabilities: VERW now clears those buffers
too.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmKXMkMTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoWGPD/idalLIhhV5F2+hZIKm0WSnsBxAOh9K
7y8xBxpQQ5FUfW3vm7Pg3ro6VJp7w2CzKoD4lGXzGHriusn3qst3vkza9Ay8xu8g
RDwKe6hI+p+Il9BV9op3f8FiRLP9bcPMMReW/mRyYsOnJe59hVNwRAL8OG40PY4k
hZgg4Psfvfx8bwiye5efjMSe4fXV7BUCkr601+8kVJoiaoszkux9mqP+cnnB5P3H
zW1d1jx7d6eV1Y063h7WgiNqQRYv0bROZP5BJkufIoOHUXDpd65IRF3bDnCIvSEz
KkMYJNXb3qh7EQeHS53NL+gz2EBQt+Tq1VH256qn6i3mcHs85HvC68gVrAkfVHJE
QLJE3MoXWOqw+mhwzCRrEXN9O1lT/PqDWw8I4M/5KtGG/KnJs+bygmfKBbKjIVg4
2yQWfMmOgQsw3GWCRjgEli7aYbDJQjany0K/qZTq54I41gu+TV8YMccaWcXgDKrm
cXFGUfOg4gBm4IRjJ/RJn+mUv6u+/3sLVqsaFTs9aiib1dpBSSUuMGBh548Ft7g2
5VbFVSDaLjB2BdlcG7enlsmtzw0ltNssmqg7jTK/L7XNVnvxwUoXw+zP7RmCLEYt
UV4FHXraMKNt2ZketlomC8ui2hg73ylUp4pPdMXCp7PIXp9sVamRTbpz12h689VJ
/s55bWxHkR6S
=LBxT
-----END PGP SIGNATURE-----
Merge tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 MMIO stale data fixes from Thomas Gleixner:
"Yet another hw vulnerability with a software mitigation: Processor
MMIO Stale Data.
They are a class of MMIO-related weaknesses which can expose stale
data by propagating it into core fill buffers. Data which can then be
leaked using the usual speculative execution methods.
Mitigations include this set along with microcode updates and are
similar to MDS and TAA vulnerabilities: VERW now clears those buffers
too"
* tag 'x86-bugs-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation/mmio: Print SMT warning
KVM: x86/speculation: Disable Fill buffer clear within guests
x86/speculation/mmio: Reuse SRBDS mitigation for SBDS
x86/speculation/srbds: Update SRBDS mitigation selection
x86/speculation/mmio: Add sysfs reporting for Processor MMIO Stale Data
x86/speculation/mmio: Enable CPU Fill buffer clearing on idle
x86/bugs: Group MDS, TAA & Processor MMIO Stale Data mitigations
x86/speculation/mmio: Add mitigation for Processor MMIO Stale Data
x86/speculation: Add a common function for MD_CLEAR mitigation update
x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug
Documentation: Add documentation for Processor MMIO Stale Data
There are several places in the kernel where this kind of functionality is
being used. Provide a generic helper for such cases.
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20220610120219.18988-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function is no longer used. So delete it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-10-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that deferred_probe_timeout is non-zero by default, fw_devlink will
never permanently block the probing of devices. It'll try its best to
probe the devices in the right order and then finally let devices probe
even if their suppliers don't have any drivers.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-8-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some devices might need to be probed and bound successfully before the
kernel boot sequence can finish and move on to init/userspace. For
example, a network interface might need to be bound to be able to mount
a NFS rootfs.
With fw_devlink=on by default, some of these devices might be blocked
from probing because they are waiting on a optional supplier that
doesn't have a driver. While fw_devlink will eventually identify such
devices and unblock the probing automatically, it might be too late by
the time it unblocks the probing of devices. For example, the IP4
autoconfig might timeout before fw_devlink unblocks probing of the
network interface.
This function is available to temporarily try and probe all devices that
have a driver even if some of their suppliers haven't been added or
don't have drivers.
The drivers can then decide which of the suppliers are optional vs
mandatory and probe the device if possible. By the time this function
returns, all such "best effort" probes are guaranteed to be completed.
If a device successfully probes in this mode, we delete all fw_devlink
discovered dependencies of that device where the supplier hasn't yet
probed successfully because they have to be optional dependencies.
This also means that some devices that aren't needed for init and could
have waited for their optional supplier to probe (when the supplier's
module is loaded later on) would end up probing prematurely with limited
functionality. So call this function only when boot would fail without
it.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-5-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Now that fw_devlink=on by default and fw_devlink supports
"power-domains" property, the execution will never get to the point
where driver_deferred_probe_check_state() is called before the supplier
has probed successfully or before deferred probe timeout has expired.
So, delete the call and replace it with -ENODEV.
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220601070707.3946847-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 23cfbc6ec4 ("firmware: Add the support for ZSTD-compressed
firmware files") added support for ZSTD compression, but in the process
also made the previously default XZ compression a config option.
That means that anybody who upgrades their kernel and does a
make oldconfig
to update their configuration, will end up without the XZ compression
that the configuration used to have.
Add the 'default y' to make sure this doesn't happen.
The whole compression question should probably be improved upon, since
it is now possible to "enable" compression in the kernel config but not
enable any actual compression algorithm, which makes it all very
useless. It makes no sense to ask Kconfig questions that enable
situations that are nonsensical like that.
This at least fixes the immediate problem of a kernel update resulting
in a nonbootable machine because of a missed option.
Fixes: 23cfbc6ec4 ("firmware: Add the support for ZSTD-compressed firmware files")
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since we had to effectively reverted
commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") in an earlier patch, a non-zero
deferred_probe_timeout will break NFS rootfs mounting [1] again. So, set
the default back to zero until we can fix that.
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
Fixes: 2b28a1a84a ("driver core: Extend deferred probe timeout on driver registration")
Cc: Mark Brown <broonie@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1]. This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.
Commit 35a672363a ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.
However, if wait_for_device_probe() is called from the kernel_init()
context:
- Before deferred_probe_initcall() [2], it causes the boot process to
hang due to a deadlock.
- After deferred_probe_initcall() [3], it blocks kernel_init() from
continuing till deferred_probe_timeout expires and beats the point of
deferred_probe_timeout that's trying to wait for userspace to load
modules.
Neither of this is good. So revert the changes to
wait_for_device_probe().
[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/
Fixes: 35a672363a ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz <jstultz@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Basil Eljuse <Basil.Eljuse@arm.com>
Cc: Ferry Toth <fntoth@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: linux-pm@vger.kernel.org
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: John Stultz <jstultz@google.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the set of driver core changes for 5.19-rc1.
Note, I'm not really happy with this pull request as-is, see below for
details, but overall this is all good for everything but a small set of
systems, which we have a fix for already.
Lots of tiny driver core changes and cleanups happened this cycle,
but the two major things were:
- firmware_loader reorganization and additions including the
ability to have XZ compressed firmware images and the ability
for userspace to initiate the firmware load when it needs to,
instead of being always initiated by the kernel. FPGA devices
specifically want this ability to have their firmware changed
over the lifetime of the system boot, and this allows them to
work without having to come up with yet-another-custom-uapi
interface for loading firmware for them.
- physical location support added to sysfs so that devices that
know this information, can tell userspace where they are
located in a common way. Some ACPI devices already support
this today, and more bus types should support this in the
future.
Smaller changes included:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten any
linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request if you want to take them directly, _OR_ I can just revert
the probe timeout changes and they can wait for the next -rc1 merge
cycle. Given that the fixes are tested, and pretty simple, I'm leaning
toward that choice. Sorry this all came at the end of the merge window,
I should have resolved this all 2 weeks ago, that's my fault as it was
in the middle of some travel for me.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time outs.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai
GkjLXBGNWOPBa5cU52qf
=yEi/
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core changes for 5.19-rc1.
Lots of tiny driver core changes and cleanups happened this cycle, but
the two major things are:
- firmware_loader reorganization and additions including the ability
to have XZ compressed firmware images and the ability for userspace
to initiate the firmware load when it needs to, instead of being
always initiated by the kernel. FPGA devices specifically want this
ability to have their firmware changed over the lifetime of the
system boot, and this allows them to work without having to come up
with yet-another-custom-uapi interface for loading firmware for
them.
- physical location support added to sysfs so that devices that know
this information, can tell userspace where they are located in a
common way. Some ACPI devices already support this today, and more
bus types should support this in the future.
Smaller changes include:
- driver_override api cleanups and fixes
- error path cleanups and fixes
- get_abi script fixes
- deferred probe timeout changes.
It's that last change that I'm the most worried about. It has been
reported to cause boot problems for a number of systems, and I have a
tested patch series that resolves this issue. But I didn't get it
merged into my tree before 5.18-final came out, so it has not gotten
any linux-next testing.
I'll send the fixup patches (there are 2) as a follow-on series to this
pull request.
All have been tested in linux-next for weeks, with no reported issues
other than the above-mentioned boot time-outs"
* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
driver core: fix deadlock in __device_attach
kernfs: Separate kernfs_pr_cont_buf and rename_lock.
topology: Remove unused cpu_cluster_mask()
driver core: Extend deferred probe timeout on driver registration
MAINTAINERS: add Russ Weight as a firmware loader maintainer
driver: base: fix UAF when driver_attach failed
test_firmware: fix end of loop test in upload_read_show()
driver core: location: Add "back" as a possible output for panel
driver core: location: Free struct acpi_pld_info *pld
driver core: Add "*" wildcard support to driver_async_probe cmdline param
driver core: location: Check for allocations failure
arch_topology: Trace the update thermal pressure
kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
export: fix string handling of namespace in EXPORT_SYMBOL_NS
rpmsg: use local 'dev' variable
rpmsg: Fix calling device_lock() on non-initialized device
firmware_loader: describe 'module' parameter of firmware_upload_register()
firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
firmware_loader: Fix configs for sysfs split
selftests: firmware: Add firmware upload selftests
...
Here is the "big" set of USB and Thunderbolt driver changes for
5.18-rc1. For the most part it's been a quiet development cycle for the
USB core, but there are the usual "hot spots" of development activity.
Included in here are:
- Thunderbolt driver updates:
- fixes for devices without displayport adapters
- lane bonding support and improvements
- other minor changes based on device testing
- dwc3 gadget driver changes. It seems this driver will never
be finished given that the IP core is showing up in zillions
of new devices and each implementation decides to do something
different with it...
- uvc gadget driver updates as more devices start to use and
rely on this hardware as well
- usb_maxpacket() api changes to remove an unneeded and unused
parameter.
- usb-serial driver device id updates and small cleanups
- typec cleanups and fixes based on device testing
- device tree updates for usb properties
- lots of other small fixes and driver updates.
All of these have been in linux-next for weeks with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnZGw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymQhwCeLVANsQjBcL4ys4skl+1In17y28gAn3rEZ7rQ
Yv4uP9zadUqg3Cx0vjgf
=3s5s
-----END PGP SIGNATURE-----
Merge tag 'usb-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the "big" set of USB and Thunderbolt driver changes for
5.18-rc1. For the most part it's been a quiet development cycle for
the USB core, but there are the usual "hot spots" of development
activity.
Included in here are:
- Thunderbolt driver updates:
- fixes for devices without displayport adapters
- lane bonding support and improvements
- other minor changes based on device testing
- dwc3 gadget driver changes.
It seems this driver will never be finished given that the IP core
is showing up in zillions of new devices and each implementation
decides to do something different with it...
- uvc gadget driver updates as more devices start to use and rely on
this hardware as well
- usb_maxpacket() api changes to remove an unneeded and unused
parameter.
- usb-serial driver device id updates and small cleanups
- typec cleanups and fixes based on device testing
- device tree updates for usb properties
- lots of other small fixes and driver updates.
All of these have been in linux-next for weeks with no reported
problems"
* tag 'usb-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
USB: new quirk for Dell Gen 2 devices
usb: dwc3: core: Add error log when core soft reset failed
usb: dwc3: gadget: Move null pinter check to proper place
usb: hub: Simplify error and success path in port_over_current_notify
usb: cdns3: allocate TX FIFO size according to composite EP number
usb: dwc3: Fix ep0 handling when getting reset while doing control transfer
usb: Probe EHCI, OHCI controllers asynchronously
usb: isp1760: Fix out-of-bounds array access
xhci: Don't defer primary roothub registration if there is only one roothub
USB: serial: option: add Quectel BG95 modem
USB: serial: pl2303: fix type detection for odd device
xhci: Allow host runtime PM as default for Intel Alder Lake N xHCI
xhci: Remove quirk for over 10 year old evaluation hardware
xhci: prevent U2 link power state if Intel tier policy prevented U1
xhci: use generic command timer for stop endpoint commands.
usb: host: xhci-plat: omit shared hcd if either root hub has no ports
usb: host: xhci-plat: prepare operation w/o shared hcd
usb: host: xhci-plat: create shared hcd after having added main hcd
xhci: prepare for operation w/o shared hcd
xhci: factor out parts of xhci_gen_setup()
...
Including:
- Intel VT-d driver updates
- Domain force snooping improvement.
- Cleanups, no intentional functional changes.
- ARM SMMU driver updates
- Add new Qualcomm device-tree compatible strings
- Add new Nvidia device-tree compatible string for Tegra234
- Fix UAF in SMMUv3 shared virtual addressing code
- Force identity-mapped domains for users of ye olde SMMU
legacy binding
- Minor cleanups
- Patches to fix a BUG_ON in the vfio_iommu_group_notifier
- Groundwork for upcoming iommufd framework
- Introduction of DMA ownership so that an entire IOMMU group
is either controlled by the kernel or by user-space
- MT8195 and MT8186 support in the Mediatek IOMMU driver
- Patches to make forcing of cache-coherent DMA more coherent
between IOMMU drivers
- Fixes for thunderbolt device DMA protection
- Various smaller fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmKWCbUACgkQK/BELZcB
GuPHmRAAuoH9iK/jrC3SgrqpBfH2iRN7ovIX8dFvgbQWX27lhXF4gvj2/nYdIvPK
75j/LmdibuzV3Iez4kjbGKNG1AikwK3dKIH21a84f3ctnoamQyL6nMfCVBFaVD/D
kvPpTHyjbGPNf6KZyWQdkJ5DXD1aoG1DKkBnslH5pTNPqGuNqbcnRTg0YxiJFLBv
5w2B6jL06XRzunh+Sp1Dbj+po8ROjLRCEU+tdrndO8W/Dyp6+ZNNuxL9/3BM9zMj
py0M4piFtGnhmJSdym1eeHm7r1YRjkZw+MN+e8NcrcSihmDutEWo7nRRxA5uVaa+
3O2DNERqCvQUYxfNRUOKwzV8v51GYQHEPhvOe/MLgaEQDmDmlF2dHNGm93eCMdrv
m1cT011oU7pa4qHomwLyTJxSsR7FzJ37igq/WeY++MBhl+frqfzEQPVxF+W7GLb8
QvT/+woCPzLVpJbE7s0FUD4nbPd8c1dAz4+HO1DajxILIOTq1bnPIorSjgXODRjq
yzsiP1rAg0L0PsL7pXn3cPMzNCE//xtOsRsAGmaVv6wBoMLyWVFCU/wjPEdjrSWA
nXpAuCL84uxCEl/KLYMsg9UhjT6ko7CuKdsybIG9zNIiUau43uSqgTen0xCpYt0i
m//O/X3tPyxmoLKRW+XVehGOrBZW+qrQny6hk/Zex+6UJQqVMTA=
=W0hj
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Intel VT-d driver updates:
- Domain force snooping improvement.
- Cleanups, no intentional functional changes.
- ARM SMMU driver updates:
- Add new Qualcomm device-tree compatible strings
- Add new Nvidia device-tree compatible string for Tegra234
- Fix UAF in SMMUv3 shared virtual addressing code
- Force identity-mapped domains for users of ye olde SMMU legacy
binding
- Minor cleanups
- Fix a BUG_ON in the vfio_iommu_group_notifier:
- Groundwork for upcoming iommufd framework
- Introduction of DMA ownership so that an entire IOMMU group is
either controlled by the kernel or by user-space
- MT8195 and MT8186 support in the Mediatek IOMMU driver
- Make forcing of cache-coherent DMA more coherent between IOMMU
drivers
- Fixes for thunderbolt device DMA protection
- Various smaller fixes and cleanups
* tag 'iommu-updates-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu/amd: Increase timeout waiting for GA log enablement
iommu/s390: Tolerate repeat attach_dev calls
iommu/vt-d: Remove hard coding PGSNP bit in PASID entries
iommu/vt-d: Remove domain_update_iommu_snooping()
iommu/vt-d: Check domain force_snooping against attached devices
iommu/vt-d: Block force-snoop domain attaching if no SC support
iommu/vt-d: Size Page Request Queue to avoid overflow condition
iommu/vt-d: Fold dmar_insert_one_dev_info() into its caller
iommu/vt-d: Change return type of dmar_insert_one_dev_info()
iommu/vt-d: Remove unneeded validity check on dev
iommu/dma: Explicitly sort PCI DMA windows
iommu/dma: Fix iova map result check bug
iommu/mediatek: Fix NULL pointer dereference when printing dev_name
iommu: iommu_group_claim_dma_owner() must always assign a domain
iommu/arm-smmu: Force identity domains for legacy binding
iommu/arm-smmu: Support Tegra234 SMMU
dt-bindings: arm-smmu: Add compatible for Tegra234 SOC
dt-bindings: arm-smmu: Document nvidia,memory-controller property
iommu/arm-smmu-qcom: Add SC8280XP support
dt-bindings: arm-smmu: Add compatible for Qualcomm SC8280XP
...
- Add driver-core infrastructure for lockdep validation of
device_lock(), and fixup a deadlock report that was previously hidden
behind the 'lockdep no validate' policy.
- Add CXL _OSC support for claiming native control of CXL hotplug and
error handling.
- Disable suspend in the presence of CXL memory unless and until a
protocol is identified for restoring PCI device context from memory
hosted on CXL PCI devices.
- Add support for snooping CXL mailbox commands to protect against
inopportune changes, like set-partition with the 'immediate' flag set.
- Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC
/ 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL
HDM Capability).
- Miscellaneous cleanups and fixes from -next exposure.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYpFUogAKCRDfioYZHlFs
Zz+VAP9o/NkYhbaM2Ne9ImgsdJii96gA8nN7q/q/ZoXjsSx2WQD+NRC5d3ZwZDCa
9YKEkntnvbnAZOCs+ZUuyZBgNh6vsgU=
=p92w
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
"Compute Express Link (CXL) updates for this cycle.
The highlight is new driver-core infrastructure and CXL subsystem
changes for allowing lockdep to validate device_lock() usage. Thanks
to PeterZ for setting me straight on the current capabilities of the
lockdep API, and Greg acked it as well.
On the CXL ACPI side this update adds support for CXL _OSC so that
platform firmware knows that it is safe to still grant Linux native
control of PCIe hotplug and error handling in the presence of CXL
devices. A circular dependency problem was discovered between suspend
and CXL memory for cases where the suspend image might be stored in
CXL memory where that image also contains the PCI register state to
restore to re-enable the device. Disable suspend for now until an
architecture is defined to clarify that conflict.
Lastly a collection of reworks, fixes, and cleanups to the CXL
subsystem where support for snooping mailbox commands and properly
handling the "mem_enable" flow are the highlights.
Summary:
- Add driver-core infrastructure for lockdep validation of
device_lock(), and fixup a deadlock report that was previously
hidden behind the 'lockdep no validate' policy.
- Add CXL _OSC support for claiming native control of CXL hotplug and
error handling.
- Disable suspend in the presence of CXL memory unless and until a
protocol is identified for restoring PCI device context from memory
hosted on CXL PCI devices.
- Add support for snooping CXL mailbox commands to protect against
inopportune changes, like set-partition with the 'immediate' flag
set.
- Rework how the driver detects legacy CXL 1.1 configurations (CXL
DVSEC / 'mem_enable') before enabling new CXL 2.0 decode
configurations (CXL HDM Capability).
- Miscellaneous cleanups and fixes from -next exposure"
* tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits)
cxl/port: Enable HDM Capability after validating DVSEC Ranges
cxl/port: Reuse 'struct cxl_hdm' context for hdm init
cxl/port: Move endpoint HDM Decoder Capability init to port driver
cxl/pci: Drop @info argument to cxl_hdm_decode_init()
cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
cxl/mem: Skip range enumeration if mem_enable clear
cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
cxl/pci: Move cxl_await_media_ready() to the core
cxl/mem: Validate port connectivity before dvsec ranges
cxl/mem: Fix cxl_mem_probe() error exit
cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
cxl/mem: Drop mem_enabled check from wait_for_media()
nvdimm: Fix firmware activation deadlock scenarios
device-core: Kill the lockdep_mutex
nvdimm: Drop nd_device_lock()
ACPI: NFIT: Drop nfit_device_lock()
nvdimm: Replace lockdep_mutex with local lock classes
cxl: Drop cxl_device_lock()
cxl/acpi: Add root device lockdep validation
...
file-backed transparent hugepages.
Johannes Weiner has arranged for zswap memory use to be tracked and
managed on a per-cgroup basis.
Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for runtime
enablement of the recent huge page vmemmap optimization feature.
Baolin Wang contributes a series to fix some issues around hugetlb
pagetable invalidation.
Zhenwei Pi has fixed some interactions between hwpoisoned pages and
virtualization.
Tong Tiangen has enabled the use of the presently x86-only
page_table_check debugging feature on arm64 and riscv.
David Vernet has done some fixup work on the memcg selftests.
Peter Xu has taught userfaultfd to handle write protection faults against
shmem- and hugetlbfs-backed files.
More DAMON development from SeongJae Park - adding online tuning of the
feature and support for monitoring of fixed virtual address ranges. Also
easier discovery of which monitoring operations are available.
Nadav Amit has done some optimization of TLB flushing during mprotect().
Neil Brown continues to labor away at improving our swap-over-NFS support.
David Hildenbrand has some fixes to anon page COWing versus
get_user_pages().
Peng Liu fixed some errors in the core hugetlb code.
Joao Martins has reduced the amount of memory consumed by device-dax's
compound devmaps.
Some cleanups of the arch-specific pagemap code from Anshuman Khandual.
Muchun Song has found and fixed some errors in the TLB flushing of
transparent hugepages.
Roman Gushchin has done more work on the memcg selftests.
And, of course, many smaller fixes and cleanups. Notably, the customary
million cleanup serieses from Miaohe Lin.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCYo52xQAKCRDdBJ7gKXxA
jtJFAQD238KoeI9z5SkPMaeBRYSRQmNll85mxs25KapcEgWgGQD9FAb7DJkqsIVk
PzE+d9hEfirUGdL6cujatwJ6ejYR8Q8=
=nFe6
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"Almost all of MM here. A few things are still getting finished off,
reviewed, etc.
- Yang Shi has improved the behaviour of khugepaged collapsing of
readonly file-backed transparent hugepages.
- Johannes Weiner has arranged for zswap memory use to be tracked and
managed on a per-cgroup basis.
- Munchun Song adds a /proc knob ("hugetlb_optimize_vmemmap") for
runtime enablement of the recent huge page vmemmap optimization
feature.
- Baolin Wang contributes a series to fix some issues around hugetlb
pagetable invalidation.
- Zhenwei Pi has fixed some interactions between hwpoisoned pages and
virtualization.
- Tong Tiangen has enabled the use of the presently x86-only
page_table_check debugging feature on arm64 and riscv.
- David Vernet has done some fixup work on the memcg selftests.
- Peter Xu has taught userfaultfd to handle write protection faults
against shmem- and hugetlbfs-backed files.
- More DAMON development from SeongJae Park - adding online tuning of
the feature and support for monitoring of fixed virtual address
ranges. Also easier discovery of which monitoring operations are
available.
- Nadav Amit has done some optimization of TLB flushing during
mprotect().
- Neil Brown continues to labor away at improving our swap-over-NFS
support.
- David Hildenbrand has some fixes to anon page COWing versus
get_user_pages().
- Peng Liu fixed some errors in the core hugetlb code.
- Joao Martins has reduced the amount of memory consumed by
device-dax's compound devmaps.
- Some cleanups of the arch-specific pagemap code from Anshuman
Khandual.
- Muchun Song has found and fixed some errors in the TLB flushing of
transparent hugepages.
- Roman Gushchin has done more work on the memcg selftests.
... and, of course, many smaller fixes and cleanups. Notably, the
customary million cleanup serieses from Miaohe Lin"
* tag 'mm-stable-2022-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (381 commits)
mm: kfence: use PAGE_ALIGNED helper
selftests: vm: add the "settings" file with timeout variable
selftests: vm: add "test_hmm.sh" to TEST_FILES
selftests: vm: check numa_available() before operating "merge_across_nodes" in ksm_tests
selftests: vm: add migration to the .gitignore
selftests/vm/pkeys: fix typo in comment
ksm: fix typo in comment
selftests: vm: add process_mrelease tests
Revert "mm/vmscan: never demote for memcg reclaim"
mm/kfence: print disabling or re-enabling message
include/trace/events/percpu.h: cleanup for "percpu: improve percpu_alloc_percpu event trace"
include/trace/events/mmflags.h: cleanup for "tracing: incorrect gfp_t conversion"
mm: fix a potential infinite loop in start_isolate_page_range()
MAINTAINERS: add Muchun as co-maintainer for HugeTLB
zram: fix Kconfig dependency warning
mm/shmem: fix shmem folio swapoff hang
cgroup: fix an error handling path in alloc_pagecache_max_30M()
mm: damon: use HPAGE_PMD_SIZE
tracing: incorrect isolate_mote_t cast in mm_vmscan_lru_isolate
nodemask.h: fix compilation error with GCC12
...
- Allow error pointer to be passed to fwnode APIs (Andy Shevchenko).
- Introduce fwnode_for_each_parent_node() (Andy Shevchenko, Douglas
Anderson).
- Advertise fwnode and device property count API calls (Andy
Shevchenko).
- Clean up fwnode_is_ancestor_of() (Andy Shevchenko).
- Convert device_{dma_supported,get_dma_attr} to fwnode (Sakari Ailus).
- Release subnode properties with data nodes (Sakari Ailus).
- Add ->iomap() and ->irq_get() to fwnode operations (Sakari Ailus).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL4qwSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxO+AP/ir+1VKydUioBbH9vh8grCF1vAoJhknv
jb0STEBq+SH7+EwWbpj/J9+ldACvIJ0wnrsZ83vbl9k0z5+mrddw3HiQrrCc2PSc
hRfWCi4ihnTlDK2Sctm/suBSNivh8kcGwJUcYOwYaWEVdGoUXrqldWzzRo48DYgo
KnHm4e2V5Gob2u4edbdgtOl4BGcBcOPNDdOe15Ra6pjcp+0DUWP55j51kmhpjLSk
PuAEtbNLPOzOVIVFpANU4Q7/3G3gjKIwPjKwAsWaa2UKYdsmcLhr8gaBo05UBV6g
M6VPq14OHLGBlwxcxrsFgsNy9uBHnxEAHE0NRg7hrA2eHAqiuA7R+itBvzcgx8Wj
HRTI9ZFjZabx82rWkb7iKI/K/ok5igzX3/zrbr7Bf3+7QY0+UEMqpmPVak01xqe7
nFk7x3rHgZOsKPs+0ZN8pG16vEyuPFEQLJV0o25fiyhK2ob8tbAZxy3nVBRm8H/o
yzuD9rYmZKHHwp3Ff4Bn+S8/T6vpYhsQkEz+NjjGMcXZXMBhbH+wsu9aBtVY/Jdp
wVrmkxm/RC3b97oVoisqY05Wv/tgUwybSkPbARZWQjmshuazrvaJAi+N/LWwBuMK
gsBVLTivWthjlCiPwRZlADTKbhlpdwSikWaC0xktsyqaO3JDtJFn305V7L9oWpQz
tShfSJ2GRBOT
=ly4P
-----END PGP SIGNATURE-----
Merge tag 'devprop-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework updates from Rafael Wysocki:
"These mostly extend the device property API and make it easier to use
in some cases.
Specifics:
- Allow error pointer to be passed to fwnode APIs (Andy Shevchenko).
- Introduce fwnode_for_each_parent_node() (Andy Shevchenko, Douglas
Anderson).
- Advertise fwnode and device property count API calls (Andy
Shevchenko).
- Clean up fwnode_is_ancestor_of() (Andy Shevchenko).
- Convert device_{dma_supported,get_dma_attr} to fwnode (Sakari
Ailus).
- Release subnode properties with data nodes (Sakari Ailus).
- Add ->iomap() and ->irq_get() to fwnode operations (Sakari Ailus)"
* tag 'devprop-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
device property: Advertise fwnode and device property count API calls
device property: Fix recent breakage of fwnode_get_next_parent_dev()
device property: Drop 'test' prefix in parameters of fwnode_is_ancestor_of()
device property: Introduce fwnode_for_each_parent_node()
device property: Allow error pointer to be passed to fwnode APIs
ACPI: property: Release subnode properties with data nodes
device property: Add irq_get to fwnode operation
device property: Add iomap to fwnode operations
ACPI: property: Move acpi_fwnode_device_get_match_data() up
device property: Convert device_{dma_supported,get_dma_attr} to fwnode
- Add thermal library and thermal tools to encapsulate the netlink
into event based callbacks (Daniel Lezcano, Jiapeng Chong).
- Improve overheat condition handling during suspend-to-idle in the
Intel PCH thermal driver (Zhang Rui).
- Use local ops instead of global ops in devfreq_cooling (Kant Fan).
- Clean up _OSC handling in int340x (Davidlohr Bueso).
- Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
(Hesham Almatary).
- Add new k3 j72xx bangdap driver and the corresponding bindings
(Keerthy).
- Fix missing of_node_put() in the SC iMX driver at probe time
(Miaoqian Lin).
- Fix memory leak in __thermal_cooling_device_register() when
device_register() fails by calling thermal_cooling_device_destroy_sysfs()
(Yang Yingliang).
- Add sc8180x and sc8280xp compatible string in the DT bindings and
lMH support for QCom tsens driver (Bjorn Andersson).
- Fix OTP Calibration Register values conforming to the documentation
on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
- Fix type in kerneldoc description for __thermal_bind_params (Corentin
Labbe).
- Fix potential NULL dereference in sr_thermal_probe() on Broadcom
platform (Zheng Yongjun).
- Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
Pallikunhi).
- Fix non-negative value support by preventing the value to be clamp
to zero (Stefan Wahren).
- Add compatible string and DT bindings for MSM8960 tsens driver
(Dmitry Baryshkov).
- Add hwmon support for K3 driver (Massimiliano Minella).
- Refactor and add multiple generations support for QCom ADC driver
(Jishnu Prakash).
- Use platform_get_irq_optional() to get the interrupt on RCar driver and
document Document RZ/V2L bindings (Lad Prabhakar).
- Remove NULL check after container_of() call from the Intel HFI
thermal driver (Haowen Bai).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL3MESHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxgzsQAKj4D0ROHLdqReYeIutS7IU6YQ+ejh3N
krcAZvxQd8B3sVwSHHABPdsHwwfYjP4BwoHHVtha5q+5zFGBq5F3fJHsDeRxN23T
NEOW81HBaLu81N5gNuAYg+NQcdqBR4U4IGKZfxWRyx13OMECXGtJRb/SIW/TuDYk
AQIrqOivVEsVRtn+qp+x0rKHsj6ge+2lU1OyJagr2BDhgLrwzxl6cmNBfali1C3D
sw4SqGwGVjEB0QDDpMbty9BsLEjNRE86kwoPh2u0K9dT9/b/P9pk1lp+pOnltspZ
eOdwr0CtoEHw2x5tuhbO7fmvNIuf5jgSDWsCrP/xYnsozcRiwWsgyeJj4f5soreN
kTJB8S/FH+fd7UuYqdmDIeJpDnkkGt1jIqjkDvfJNL7ffBWIe/meXKF4vBnrdlbd
FMNQwzgc5eug07IFjOE43V1v5Hw1H1leVUwZczOMGNIhqyy0WZyQr7vzwbr16jCO
x0hu3q3Y++5SUAUy9WzsOOrpRHa9JxUwVhZuLG7ajf+6pqPxB8kqs9CJe+sgWKju
T4KqHbljJ2Gnkm9SngT0BdnB+AgparhXf8/8Mj0ZBQvamXw+ylNbYfG7qTZk5Ne/
rUf4YDew6hGDqikyJBe2e0a1lLSoN4zaADC1lLYhC+Zp7m6eK0r3y3mhic0bZe1c
RU/2eeA4kTuZ
=3ltV
-----END PGP SIGNATURE-----
Merge tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These add a thermal library and thermal tools to wrap the netlink
interface into event-based callbacks, improve overheat condition
handling during suspend-to-idle on Intel SoCs, add some new hardware
support, fix bugs and clean up code.
Specifics:
- Add thermal library and thermal tools to encapsulate the netlink
into event based callbacks (Daniel Lezcano, Jiapeng Chong).
- Improve overheat condition handling during suspend-to-idle in the
Intel PCH thermal driver (Zhang Rui).
- Use local ops instead of global ops in devfreq_cooling (Kant Fan).
- Clean up _OSC handling in int340x (Davidlohr Bueso).
- Switch hisi_termal from CONFIG_PM_SLEEP guards to pm_sleep_ptr()
(Hesham Almatary).
- Add new k3 j72xx bangdap driver and the corresponding bindings
(Keerthy).
- Fix missing of_node_put() in the SC iMX driver at probe time
(Miaoqian Lin).
- Fix memory leak in __thermal_cooling_device_register()
when device_register() fails by calling
thermal_cooling_device_destroy_sysfs() (Yang Yingliang).
- Add sc8180x and sc8280xp compatible string in the DT bindings and
lMH support for QCom tsens driver (Bjorn Andersson).
- Fix OTP Calibration Register values conforming to the documentation
on RZ/G2L and bindings documentation for RZ/G2UL (Biju Das).
- Fix type in kerneldoc description for __thermal_bind_params
(Corentin Labbe).
- Fix potential NULL dereference in sr_thermal_probe() on Broadcom
platform (Zheng Yongjun).
- Add change mode ops to the thermal-of sensor (Manaf Meethalavalappu
Pallikunhi).
- Fix non-negative value support by preventing the value to be clamp
to zero (Stefan Wahren).
- Add compatible string and DT bindings for MSM8960 tsens driver
(Dmitry Baryshkov).
- Add hwmon support for K3 driver (Massimiliano Minella).
- Refactor and add multiple generations support for QCom ADC driver
(Jishnu Prakash).
- Use platform_get_irq_optional() to get the interrupt on RCar driver
and document Document RZ/V2L bindings (Lad Prabhakar).
- Remove NULL check after container_of() call from the Intel HFI
thermal driver (Haowen Bai)"
* tag 'thermal-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (38 commits)
thermal: intel: pch: improve the cooling delay log
thermal: intel: pch: enhance overheat handling
thermal: intel: pch: move cooling delay to suspend_noirq phase
PM: wakeup: expose pm_wakeup_pending to modules
thermal: k3_j72xx_bandgap: Add the bandgap driver support
dt-bindings: thermal: k3-j72xx: Add VTM bindings documentation
thermal/drivers/imx_sc_thermal: Fix refcount leak in imx_sc_thermal_probe
thermal/core: Fix memory leak in __thermal_cooling_device_register()
dt-bindings: thermal: tsens: Add sc8280xp compatible
dt-bindings: thermal: lmh: Add Qualcomm sc8180x compatible
thermal/drivers/qcom/lmh: Add sc8180x compatible
thermal/drivers/rz2gl: Fix OTP Calibration Register values
dt-bindings: thermal: rzg2l-thermal: Document RZ/G2UL bindings
thermal: thermal_of: fix typo on __thermal_bind_params
tools/thermal: remove unneeded semicolon
tools/lib/thermal: remove unneeded semicolon
thermal/drivers/broadcom: Fix potential NULL dereference in sr_thermal_probe
tools/thermal: Add thermal daemon skeleton
tools/thermal: Add a temperature capture tool
tools/thermal: Add util library
...
- Update the Energy Model support code to allow the Energy Model to be
artificial, which means that the power values may not be on a uniform
scale with other devices providing power information, and update the
cpufreq_cooling and devfreq_cooling thermal drivers to support
artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add CPU-based scaling support to passive devfreq governor (Saravana
Kannan, Chanwoo Choi).
- Update the rk3399_dmc devfreq driver (Brian Norris).
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Avoid device PM-runtime usage count underflows (Rafael Wysocki).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
- Add AlderLake processor support to the intel_idle driver (Zhang Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
- Fix cpufreq governor clean up code to avoid using kfree() directly
to free kobject-based items (Kevin Hao).
- Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe Leroy).
- Make intel_pstate notify frequency invariance code when no_turbo is
turned on and off (Chen Yu).
- Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
Pandruvada).
- Make cpufreq avoid unnecessary frequency updates due to mismatch
between hardware and the frequency table (Viresh Kumar).
- Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
code (Viresh Kumar).
- Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
calling convention for some driver callbacks consistent (Rafael
Wysocki).
- Avoid accessing half-initialized cpufreq policies from the show()
and store() sysfs functions (Schspa Shi).
- Rearrange cpufreq_offline() to make the calling convention for some
driver callbacks consistent (Schspa Shi).
- Update CPPC handling in cpufreq (Pierre Gondois).
- Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
- Update the turbostat utility to version 2022.04.16 (Len Brown,
Dan Merillat, Sumeet Pawnikar, Zephaniah E. Loss-Cutler-Hull, Chen
Yu).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmKL3hsSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxW4oP/RzMh6dclWXs3J/gUCKTqRepq6cb80tq
Q2r9xRRHwy6ZH/PVddGDHmhQ7d3NAv13s4srA9kznZognF3hzuxnGau226ilDqHh
qxVSBRjWY9ijxRBvkcCaa6HZm4Chb91pUX0CLpdYSl9BTgIdk66HZYaMsKhHU/di
j7KKHPdKyyQkssWnMjGEyuaF+UebiEgISCF3+X0eb6c1m7GHXpgLJVxNy0pKkUdK
j+n6+ms12OlVLtg1eIl0J5824w/rkK3ZdqfEXJSq++mNMqSj/KCI3yWpzsLKp9AB
xxhox/tPgJVyON8Vtbb2IkWkiQUKeSrAGIUYXWmnwIZYLPSGD7BPzr82Cxr7S/ez
imMB+1Qd3SsOQ9EdI9rGYgNsEF2vOs1xjMehSdUdmTz148IzBOBt4YyQeb/mfXqH
nh9eVuFCzqH1lAayYt6iP1+V5gQn9as/+rR91k4k4A6OKXomuQUGORLeHfuKMfNH
eBZ72tdXqiq6z+ag3lY3pBAMSm11epCOa3VR6QNaC7hrlY3AZP+o3tIUL6W813b+
V3l1gWApGHZE1hiDM95dll/dIt9IZpTRd3dlqF/YnFW7fPDrz71EGvhrZpO7vdO0
/G6eJcCDjqJVcbCE8Y77I6/AXjpVQ7PRPeNx6aW7jPcQhpVIgcsF2BGjk9anjXDs
3yHJs9R/HMmA
=Hewm
-----END PGP SIGNATURE-----
Merge tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add support for 'artificial' Energy Models in which power
numbers for different entities may be in different scales, add support
for some new hardware, fix bugs and clean up code in multiple places.
Specifics:
- Update the Energy Model support code to allow the Energy Model to
be artificial, which means that the power values may not be on a
uniform scale with other devices providing power information, and
update the cpufreq_cooling and devfreq_cooling thermal drivers to
support artificial Energy Models (Lukasz Luba).
- Make DTPM check the Energy Model type (Lukasz Luba).
- Fix policy counter decrementation in cpufreq if Energy Model is in
use (Pierre Gondois).
- Add CPU-based scaling support to passive devfreq governor (Saravana
Kannan, Chanwoo Choi).
- Update the rk3399_dmc devfreq driver (Brian Norris).
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Avoid device PM-runtime usage count underflows (Rafael Wysocki).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz
Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
- Add AlderLake processor support to the intel_idle driver (Zhang
Rui).
- Fix regression leading to no genpd governor in the PSCI cpuidle
driver and fix the riscv-sbi cpuidle driver to allow a genpd
governor to be used (Ulf Hansson).
- Fix cpufreq governor clean up code to avoid using kfree() directly
to free kobject-based items (Kevin Hao).
- Prepare cpufreq for powerpc's asm/prom.h cleanup (Christophe
Leroy).
- Make intel_pstate notify frequency invariance code when no_turbo is
turned on and off (Chen Yu).
- Add Sapphire Rapids OOB mode support to intel_pstate (Srinivas
Pandruvada).
- Make cpufreq avoid unnecessary frequency updates due to mismatch
between hardware and the frequency table (Viresh Kumar).
- Make remove_cpu_dev_symlink() clear the real_cpus mask to simplify
code (Viresh Kumar).
- Rearrange cpufreq_offline() and cpufreq_remove_dev() to make the
calling convention for some driver callbacks consistent (Rafael
Wysocki).
- Avoid accessing half-initialized cpufreq policies from the show()
and store() sysfs functions (Schspa Shi).
- Rearrange cpufreq_offline() to make the calling convention for some
driver callbacks consistent (Schspa Shi).
- Update CPPC handling in cpufreq (Pierre Gondois).
- Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
- Update the turbostat utility to version 2022.04.16 (Len Brown, Dan
Merillat, Sumeet Pawnikar, Zephaniah E. Loss-Cutler-Hull, Chen Yu)"
* tag 'pm-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (94 commits)
PM: domains: Trust domain-idle-states from DT to be correct by genpd
PM: domains: Measure power-on/off latencies in genpd based on a governor
PM: domains: Allocate governor data dynamically based on a genpd governor
PM: domains: Clean up some code in pm_genpd_init() and genpd_remove()
PM: domains: Fix initialization of genpd's next_wakeup
PM: domains: Fixup QoS latency measurements for IRQ safe devices in genpd
PM: domains: Measure suspend/resume latencies in genpd based on governor
PM: domains: Move the next_wakeup variable into the struct gpd_timing_data
PM: domains: Allocate gpd_timing_data dynamically based on governor
PM: domains: Skip another warning in irq_safe_dev_in_sleep_domain()
PM: domains: Rename irq_safe_dev_in_no_sleep_domain() in genpd
PM: domains: Don't check PM_QOS_FLAG_NO_POWER_OFF in genpd
PM: domains: Drop redundant code for genpd always-on governor
PM: domains: Add GENPD_FLAG_RPM_ALWAYS_ON for the always-on governor
powercap: intel_rapl: remove redundant store to value after multiply
cpufreq: CPPC: Enable dvfs_possible_from_any_cpu
cpufreq: CPPC: Enable fast_switch
ACPI: CPPC: Assume no transition latency if no PCCT
ACPI: bus: Set CPPC _OSC bits for all and when CPPC_LIB is supported
ACPI: CPPC: Check _OSC for flexible address space
...
The main change here is Marek's addition of bulk read/write callbacks
for individual regmaps, we've supported single register operations for a
while but there's enough hardware out there which can use bulk equivalents
to make it worthwhile.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmKLgdIACgkQJNaLcl1U
h9BkMQf9GvA9ZtjOzR/K/R9IscyDqnJ/Svwp0nhnbrFzeCDMQyWEUVen+1tE45gl
ACIC2RPDYUfuYeoHC6yxOYpJNwI5WhlLfO9uH14ojRl4JVjnjarFvbRb0xfgCCeg
pp6QRb1DeZzUWQBMcW3ruo76gPOjEI94CVUNbXzcDOUS+ZUDPtbcyeB7wmG2i3zU
qpbG6DN0av/tuFSiGZgE2OFA9pGFDQwsZiu5DYxCTk/vI6hH8lddbjnqtxZGi/hb
g3Ho9XewkI2WTnTaSbbX8Rfl6n/rccrsXy9u7E1usK1vEmzU0mZHIpo1T91IFd5j
ksSJcXWHb7d54tWCwV6AXwbC8ZS4NA==
=Km5j
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"The main change here is Marek's addition of bulk read/write callbacks
for individual regmaps, we've supported single register operations for
a while but there's enough hardware out there which can use bulk
equivalents to make it worthwhile"
* tag 'regmap-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Add missing map->bus check
regmap: Add bulk read/write callbacks into regmap_config
regmap: cache: set max_register with reg_stride
regmap: Constify static regmap_bus structs
Merge generlic power domains update for 5.19-rc1:
- Extend dev_pm_domain_detach() doc (Krzysztof Kozlowski).
- Move genpd's time-accounting to ktime_get_mono_fast_ns() (Ulf
Hansson).
- Improve the way genpd deals with its governors (Ulf Hansson).
* pm-domains:
PM: domains: Trust domain-idle-states from DT to be correct by genpd
PM: domains: Measure power-on/off latencies in genpd based on a governor
PM: domains: Allocate governor data dynamically based on a genpd governor
PM: domains: Clean up some code in pm_genpd_init() and genpd_remove()
PM: domains: Fix initialization of genpd's next_wakeup
PM: domains: Fixup QoS latency measurements for IRQ safe devices in genpd
PM: domains: Measure suspend/resume latencies in genpd based on governor
PM: domains: Move the next_wakeup variable into the struct gpd_timing_data
PM: domains: Allocate gpd_timing_data dynamically based on governor
PM: domains: Skip another warning in irq_safe_dev_in_sleep_domain()
PM: domains: Rename irq_safe_dev_in_no_sleep_domain() in genpd
PM: domains: Don't check PM_QOS_FLAG_NO_POWER_OFF in genpd
PM: domains: Drop redundant code for genpd always-on governor
PM: domains: Add GENPD_FLAG_RPM_ALWAYS_ON for the always-on governor
PM: domains: Move genpd's time-accounting to ktime_get_mono_fast_ns()
PM: domains: Extend dev_pm_domain_detach() doc
Merge PM core changes, updates related to system sleep and power capping
updates for 5.19-rc1:
- Export dev_pm_ops instead of suspend() and resume() in the IIO
chemical scd30 driver (Jonathan Cameron).
- Add namespace variants of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and
PM-runtime counterparts (Jonathan Cameron).
- Move symbol exports in the IIO chemical scd30 driver into the
IIO_SCD30 namespace (Jonathan Cameron).
- Avoid device PM-runtime usage count underflows (Rafael Wysocki).
- Allow dynamic debug to control printing of PM messages (David
Cohen).
- Fix some kernel-doc comments in hibernation code (Yang Li, Haowen
Bai).
- Preserve ACPI-table override during hibernation (Amadeusz Sławiński).
- Improve support for suspend-to-RAM for PSCI OSI mode (Ulf Hansson).
- Make Intel RAPL power capping driver support the RaptorLake and
AlderLake N processors (Zhang Rui, Sumeet Pawnikar).
- Remove redundant store to value after multiply in the RAPL power
capping driver (Colin Ian King).
* pm-core:
PM: runtime: Avoid device usage count underflows
iio: chemical: scd30: Move symbol exports into IIO_SCD30 namespace
PM: core: Add NS varients of EXPORT[_GPL]_SIMPLE_DEV_PM_OPS and runtime pm equiv
iio: chemical: scd30: Export dev_pm_ops instead of suspend() and resume()
* pm-sleep:
cpuidle: PSCI: Improve support for suspend-to-RAM for PSCI OSI mode
PM: runtime: Allow to call __pm_runtime_set_status() from atomic context
PM: hibernate: Don't mark comment as kernel-doc
x86/ACPI: Preserve ACPI-table override during hibernation
PM: hibernate: Fix some kernel-doc comments
PM: sleep: enable dynamic debug support within pm_pr_dbg()
PM: sleep: Narrow down -DDEBUG on kernel/power/ files
* powercap:
powercap: intel_rapl: remove redundant store to value after multiply
powercap: intel_rapl: add support for ALDERLAKE_N
powercap: RAPL: Add Power Limit4 support for RaptorLake
powercap: intel_rapl: add support for RaptorLake
Add the sysfs reporting file for Processor MMIO Stale Data
vulnerability. It exposes the vulnerability and mitigation state similar
to the existing files for the other hardware vulnerabilities.
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
If genpd has parsed the domain-idle-states from DT, it's reasonable to
believe that the parsed data should be correct for the HW in question.
Based upon this, it seem superfluous to let genpd measure the corresponding
power-on/off latencies for these states.
Therefore, let's improve the behaviour in genpd by avoiding the
measurements for the domain-idle-states that have been parsed from DT.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The measurements of the power-on|off latencies in genpd for a PM domain are
superfluous, unless the corresponding genpd has a governor assigned to it,
which would make use of the data.
Therefore, let's improve the behaviour in genpd by making the measurements
conditional, based upon if there's a governor assigned.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If a genpd doesn't have an associated governor assigned, several variables
in the struct generic_pm_domain becomes superfluous.
Rather than wasting memory in allocated genpds, let's move the variables
from the struct generic_pm_domain into a new separate struct. In this way,
we can instead dynamically decide when we need to allocate the
corresponding data for it.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
To improve the readability of the code, let's move the parts that deals
with allocation/freeing of data, into two separate functions.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In the genpd governor we walk the list of child-domains to take into
account their next_wakeup. If the child-domain itself, doesn't have a
governor assigned to it, we can end up using the next_wakeup value before
it has been properly initialized. To prevent a possible incorrect behaviour
in the governor, let's initialize next_wakeup to KTIME_MAX.
Fixes: c79aa080fb ("PM: domains: use device's next wakeup to determine domain idle state")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When an IRQ safe device is attached to a non-IRQ safe PM domain, genpd
needs to prevent the PM domain from being powered off. However, genpd still
allows the device to be runtime suspended/resumed, hence it's also
reasonable to think that a governor may be used to validate the QoS latency
constraints.
Unfortunately, genpd_runtime_resume() treats the configuration above, as a
reason to skip measuring the QoS resume latency for the device. This is a
legacy behaviour that was earlier correct, but should have been changed
when genpd was transformed into its current behaviour around how it manages
IRQ safe devices. Luckily, there's no report about problems, so let's just
fixup the behaviour.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The QoS latency measurements for devices in genpd_runtime_suspend|resume()
are superfluous, unless the corresponding genpd has a governor assigned to
it, which would make use of the data.
Therefore, let's improve the behaviour in genpd by making the measurements
conditional, based upon if there's a governor assigned.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If the corresponding genpd for the device doesn't use a governor, the
variable next_wakeup within the struct generic_pm_domain_data becomes
superfluous.
To avoid wasting memory, let's move it into the struct gpd_timing_data,
which is already being allocated based upon if there is governor assigned.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
If a genpd doesn't have an associated governor assigned, there's really no
point to allocate the per device gpd_timing_data, as the data isn't being
used by a governor anyway.
To avoid wasting memory, let's therefore convert the corresponding td
variable in the struct generic_pm_domain_data into a pointer and manage the
allocation of its data dynamically.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In irq_safe_dev_in_sleep_domain() we correctly skip the dev_warn_once() if
the corresponding genpd for the device, has the GENPD_FLAG_ALWAYS_ON flag
being set. For the same reason (the genpd is always-on in runtime), let's
also skip the warning if the GENPD_FLAG_RPM_ALWAYS_ON flag is set for the
genpd.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The name "irq_safe_dev_in_no_sleep_domain", doesn't really match the
conditions that are being checked in the function, hence the code becomes a
bit confusing to read.
Let's clarify this by renaming it into "irq_safe_dev_in_sleep_domain" and
let's also take the opportunity to clarify a corresponding comment in the
code.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Back in the days when genpd supported intermediate power states of its
devices, it made sense to check the PM_QOS_FLAG_NO_POWER_OFF in
genpd_power_off(). This because the attached devices were all being put
into low power state together when the PM domain was also being powered
off.
At this point, the flag PM_QOS_FLAG_NO_POWER_OFF is better checked by
drivers from their ->runtime_suspend() callbacks, like in the
usb_port_runtime_suspend(), for example. Or perhaps an even better option
is to set the QoS resume latency constraint for the device to zero, which
informs the runtime PM core to prevent the device from being runtime
suspended.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Due to recent changes, the always-on governor is always used with a genpd
that has the GENPD_FLAG_RPM_ALWAYS_ON flag being set. This means genpd,
doesn't invoke the governor's ->power_down_ok() callback, which makes the
code in the governor redundant, so let's drop it.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Rather than relying on the genpd provider to set the corresponding flag,
GENPD_FLAG_RPM_ALWAYS_ON, when the always-on governor is being used, let's
add it in pm_genpd_init(). In this way, it starts to benefits all genpd
providers immediately.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
intel_pch_thermal driver needs a long delay to cool itself (60 seconds
in maximum) during suspend. When a wakeup event occures during the
delay, it is better for the intel_pch_thermal driver to detect this and
quit cooling because the suspend is likely to abort anyway.
Thus expose pm_wakeup_pending to modules so that intel_pch_thermal
driver can be aware of the wakeup events.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In __device_attach function, The lock holding logic is as follows:
...
__device_attach
device_lock(dev) // get lock dev
async_schedule_dev(__device_attach_async_helper, dev); // func
async_schedule_node
async_schedule_node_domain(func)
entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
/* when fail or work limit, sync to execute func, but
__device_attach_async_helper will get lock dev as
well, which will lead to A-A deadlock. */
if (!entry || atomic_read(&entry_count) > MAX_WORK) {
func;
else
queue_work_node(node, system_unbound_wq, &entry->work)
device_unlock(dev)
As shown above, when it is allowed to do async probes, because of
out of memory or work limit, async work is not allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__device_attach_async_helper getting lock dev.
To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.
Fixes: 765230b5f0 ("driver-core: add asynchronous probing support for drivers")
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The deferred probe timer that's used for this currently starts at
late_initcall and runs for driver_deferred_probe_timeout seconds. The
assumption being that all available drivers would be loaded and
registered before the timer expires. This means, the
driver_deferred_probe_timeout has to be pretty large for it to cover the
worst case. But if we set the default value for it to cover the worst
case, it would significantly slow down the average case. For this
reason, the default value is set to 0.
Also, with CONFIG_MODULES=y and the current default values of
driver_deferred_probe_timeout=0 and fw_devlink=on, devices with missing
drivers will cause their consumer devices to always defer their probes.
This is because device links created by fw_devlink defer the probe even
before the consumer driver's probe() is called.
Instead of a fixed timeout, if we extend an unexpired deferred probe
timer on every successful driver registration, with the expectation more
modules would be loaded in the near future, then the default value of
driver_deferred_probe_timeout only needs to be as long as the worst case
time difference between two consecutive module loads.
So let's implement that and set the default value to 10 seconds when
CONFIG_MODULES=y.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Rob Herring <robh@kernel.org>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Kevin Hilman <khilman@kernel.org>
Cc: Thierry Reding <treding@nvidia.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Cc: linux-gpio@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: iommu@lists.linux-foundation.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220429220933.1350374-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When driver_attach(drv); failed, the driver_private will be freed.
But it has been added to the bus, which caused a UAF.
To fix it, we need to delete it from the bus when failed.
Fixes: 190888ac01 ("driver core: fix possible missing of device probe")
Signed-off-by: Schspa Shi <schspa@gmail.com>
Link: https://lore.kernel.org/r/20220513112444.45112-1-schspa@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add "back" as a possible panel output when _PLD output from ACPI
indicates back panel.
Fixes: 6423d29510 ("driver core: Add sysfs support for physical location of a device")
Signed-off-by: Won Chung <wonchung@google.com>
Link: https://lore.kernel.org/r/20220509214930.3573518-1-wonchung@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After struct acpi_pld_info *pld is used to fill in physical location
values, it should be freed to prevent memleak.
Suggested-by: Yu Watanabe <watanabe.yu@gmail.com>
Signed-off-by: Won Chung <wonchung@google.com>
Link: https://lore.kernel.org/r/20220509173135.3515126-1-wonchung@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's currently no way to use driver_async_probe kernel cmdline param
to enable default async probe for all drivers. So, add support for "*"
to match with all driver names. When "*" is used, all other drivers
listed in driver_async_probe are drivers that will NOT match the "*".
For example:
* driver_async_probe=drvA,drvB,drvC
drvA, drvB and drvC do asynchronous probing.
* driver_async_probe=*
All drivers do asynchronous probing except those that have set
PROBE_FORCE_SYNCHRONOUS flag.
* driver_async_probe=*,drvA,drvB,drvC
All drivers do asynchronous probing except drvA, drvB, drvC and those
that have set PROBE_FORCE_SYNCHRONOUS flag.
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Feng Tang <feng.tang@intel.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20220504005344.117803-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The map->bus can be NULL here, add the missing NULL pointer check.
Fixes: d77e745613 ("regmap: Add bulk read/write callbacks into regmap_config")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
To: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/20220509003035.225272-1-marex@denx.de
Signed-off-by: Mark Brown <broonie@kernel.org>
The documentation of fwnode and device property array API calls isn't
pointing out to the shortcuts to count the number of elements of size
in an array. Amend the documentation to advertise fwnode and device
property count API calls.
Reported-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Device drivers may decide to not load firmware when probed to avoid
slowing down the boot process should the firmware filesystem not be
available yet. In this case, the firmware loading request may be done
when a device file associated with the driver is first accessed. The
credentials of the userspace process accessing the device file may be
used to validate access to the firmware files requested by the driver.
Ensure that the kernel assumes the responsibility of reading the
firmware.
This was observed on Android for a graphic driver loading their firmware
when the device file (e.g. /dev/mali0) was first opened by userspace
(i.e. surfaceflinger). The security context of surfaceflinger was used
to validate the access to the firmware file (e.g.
/vendor/firmware/mali.bin).
Previously, Android configurations were not setting up the
firmware_class.path command line argument and were relying on the
userspace fallback mechanism. In this case, the security context of the
userspace daemon (i.e. ueventd) was consistently used to read firmware
files. More Android devices are now found to set firmware_class.path
which gives the kernel the opportunity to read the firmware directly
(via kernel_read_file_from_path_initns). In this scenario, the current
process credentials were used, even if unrelated to the loading of the
firmware file.
Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Cc: <stable@vger.kernel.org> # 5.10
Reviewed-by: Paul Moore <paul@paul-moore.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20220502004952.3970800-1-tweek@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Check whether the kzalloc() succeeds and return false if it fails.
Fixes: 6423d29510 ("driver core: Add sysfs support for physical location of a device")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YnOn28OFBHHd5bQb@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add trace event to capture the moment of the call for updating the thermal
pressure value. It's helpful to investigate how often those events occur
in a system dealing with throttling. This trace event is needed since the
old 'cdev_update' might not be used by some drivers.
The old 'cdev_update' trace event only provides a cooling state
value: [0, n]. That state value then needs additional tools to translate
it: state -> freq -> capacity -> thermal pressure. This new trace event
just stores proper thermal pressure value in the trace buffer, no need
for additional logic. This is helpful for cooperation when someone can
simply sends to the list the trace buffer output from the platform (no
need from additional information from other subsystems).
There are also platforms which due to some design reasons don't use
cooling devices and thus don't trigger old 'cdev_update' trace event.
They are also important and measuring latency for the thermal signal
raising/decaying characteristics is in scope. This new trace event
would cover them as well.
We already have a trace point 'pelt_thermal_tp' which after a change to
trace event can be paired with this new 'thermal_pressure_update' and
derive more insight what is going on in the system under thermal pressure
(and why).
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20220427080806.1906-1-lukasz.luba@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge series from Marek Vasut:
This patch adds an API for custom bulk operations on a simple regmap,
the number of single use bus implementations shows there's a need for
this.
Currently the regmap_config structure only allows the user to implement
single element register read/write using .reg_read/.reg_write callbacks.
The regmap_bus already implements bulk counterparts of both, and is being
misused as a workaround for the missing bulk read/write callbacks in
regmap_config by a couple of drivers. To stop this misuse, add the bulk
read/write callbacks to regmap_config and call them from the regmap core
code.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Jagan Teki <jagan@amarulasolutions.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Maxime Ripard <maxime@cerno.tech>
Cc: Robert Foss <robert.foss@linaro.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
To: dri-devel@lists.freedesktop.org
Link: https://lore.kernel.org/r/20220430025145.640305-1-marex@denx.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Due to a subtle typo, instead of commit 87ffea0947 ("device
property: Introduce fwnode_for_each_parent_node()") being a no-op
change, it ended up causing the display on my sc7180-trogdor-lazor
device from coming up unless I added "fw_devlink=off" to my kernel
command line. Fix the typo.
Fixes: 87ffea0947 ("device property: Introduce fwnode_for_each_parent_node()")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmJu9FYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGAyEH/16xtJSpLmLwrQzG
o+4ToQxSQ+/9UHyu0RTEvHg2THm9/8emtIuYyc/5FgdoWctcSa3AaDcveWmuWmkS
KYcdhfJsaEqjNHS3OPYXN84fmo9Hel7263shu5+IYmP/sN0DfQp6UWTryX1q4B3Q
4Pdutkuq63Uwd8nBZ5LXQBumaBrmkkuMgWEdT4+6FOo1mPzwdIGBxCuz1UsNNl5k
chLWxkQfe2eqgWbYJrgCQfrVdORXVtoU2fGilZUNrHRVGkkldXkkz5clJfapyZD3
odmZCEbrE4GPKgZwCmDERMfD1hzhZDtYKiHfOQ506szH5ykJjPBcOjHed7dA60eB
J3+wdek=
=39Ca
-----END PGP SIGNATURE-----
Merge 5.18-rc5 into usb-next
We need the USB fixes in here, and this resolves a merge issue in
drivers/usb/dwc3/drd.c
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stephen Rothwell reported kernel-doc warning:
drivers/base/firmware_loader/sysfs_upload.c:285: warning: Function parameter or member 'module' not described in 'firmware_upload_register'
Fix the warning by describing the 'module' parameter.
Link: https://lore.kernel.org/linux-next/20220502083658.266d55f8@canb.auug.org.au/
Fixes: 97730bbb24 ("firmware_loader: Add firmware-upload support")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Linux Next Mailing List <linux-next@vger.kernel.org>
Reviewed-by: Russ Weight <russell.h.weight@intel.com>
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220502051456.30741-1-bagasdotme@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here are some small driver core and kernfs fixes for some reported
problems. They include:
- kernfs regression that is causing oopses in 5.17 and newer
releases
- topology sysfs fixes for a few small reported problems.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYm1QrQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykJQACgj3QhUJxgKSQ6Rri+ODHg4KgDSZsAoIuD3rjq
5zRFYAcmogYgmN50HNVa
=2LQM
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some small driver core and kernfs fixes for some reported
problems. They include:
- kernfs regression that is causing oopses in 5.17 and newer releases
- topology sysfs fixes for a few small reported problems.
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kernfs: fix NULL dereferencing in kernfs_remove
topology: Fix up build warning in topology_is_visible()
arch_topology: Do not set llc_sibling if llc_id is invalid
topology: make core_mask include at least cluster_siblings
topology/sysfs: Hide PPIN on systems that do not support it.
Move definitions required by sysfs.c from sysfs_upload.h to sysfs.h so
that sysfs.c does not need to include sysfs_upload.h.
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220426200356.126085-3-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix the CONFIGs around register_sysfs_loader(),
unregister_sysfs_loader(), register_firmware_config_sysctl(), and
unregister_firmware_config_sysctl(). The full definitions of the
register_sysfs_loader() and unregister_sysfs_loader() functions should
be used whenever CONFIG_FW_LOADER_SYSFS is defined. The
register_firmware_config_sysctl() and unregister_firmware_config_sysctl()
functions should be stubbed out unless CONFIG_FW_LOADER_USER_HELPER
CONFIG_SYSCTL are both defined.
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220426200356.126085-2-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__add_memory_block() calls both put_device() and device_unregister() when
storing the memory block into the xarray. This is incorrect because
xarray doesn't take an additional reference and device_unregister()
already calls put_device().
Triggering the issue looks really unlikely and its only effect should be
to log a spurious warning about a ref counted issue.
Link: https://lkml.kernel.org/r/d44c63d78affe844f020dc02ad6af29abc448fc4.1650611702.git.christophe.jaillet@wanadoo.fr
Fixes: 4fb6eabf10 ("drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Scott Cheloha <cheloha@linux.vnet.ibm.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Compaction sysfs file is created via compaction_register_node in
register_node. But we forgot to remove it in unregister_node. Thus
compaction sysfs file is leaked. Using compaction_unregister_node to fix
this issue.
Link: https://lkml.kernel.org/r/20220401070905.43679-1-linmiaohe@huawei.com
Fixes: ed4a6d7f06 ("mm: compaction: add /sys trigger for per-node memory compaction")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The devices on platform/amba/fsl-mc/PCI buses could be bound to drivers
with the device DMA managed by kernel drivers or user-space applications.
Unfortunately, multiple devices may be placed in the same IOMMU group
because they cannot be isolated from each other. The DMA on these devices
must either be entirely under kernel control or userspace control, never
a mixture. Otherwise the driver integrity is not guaranteed because they
could access each other through the peer-to-peer accesses which by-pass
the IOMMU protection.
This checks and sets the default DMA mode during driver binding, and
cleanups during driver unbinding. In the default mode, the device DMA is
managed by the device driver which handles DMA operations through the
kernel DMA APIs (see Documentation/core-api/dma-api.rst).
For cases where the devices are assigned for userspace control through the
userspace driver framework(i.e. VFIO), the drivers(for example, vfio_pci/
vfio_platfrom etc.) may set a new flag (driver_managed_dma) to skip this
default setting in the assumption that the drivers know what they are
doing with the device DMA.
Calling iommu_device_use_default_domain() before {of,acpi}_dma_configure
is currently a problem. As things stand, the IOMMU driver ignored the
initial iommu_probe_device() call when the device was added, since at
that point it had no fwspec yet. In this situation,
{of,acpi}_iommu_configure() are retriggering iommu_probe_device() after
the IOMMU driver has seen the firmware data via .of_xlate to learn that
it actually responsible for the given device. As the result, before
that gets fixed, iommu_use_default_domain() goes at the end, and calls
arch_teardown_dma_ops() if it fails.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220418005000.897664-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stop sharing platform_dma_configure() helper as they are about to have
their own bus dma_configure callbacks.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220418005000.897664-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The bus_type structure defines dma_configure() callback for bus drivers
to configure DMA on the devices. This adds the paired dma_cleanup()
callback and calls it during driver unbinding so that bus drivers can do
some cleanup work.
One use case for this paired DMA callbacks is for the bus driver to check
for DMA ownership conflicts during driver binding, where multiple devices
belonging to a same IOMMU group (the minimum granularity of isolation and
protection) may be assigned to kernel drivers or user space respectively.
Without this change, for example, the vfio driver has to listen to a bus
BOUND_DRIVER event and then BUG_ON() in case of dma ownership conflict.
This leads to bad user experience since careless driver binding operation
may crash the system if the admin overlooks the group restriction. Aside
from bad design, this leads to a security problem as a root user, even with
lockdown=integrity, can force the kernel to BUG.
With this change, the bus driver could check and set the DMA ownership in
driver binding process and fail on ownership conflicts. The DMA ownership
should be released during driver unbinding.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220418005000.897664-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
When ACPI table includes _PLD fields for a device, create a new
directory (physical_location) in sysfs to share _PLD fields.
Currently without PLD information, when there are multiple of same
devices, it is hard to distinguish which device corresponds to which
physical device at which location. For example, when there are two Type
C connectors, it is hard to find out which connector corresponds to the
Type C port on the left panel versus the Type C port on the right panel.
With PLD information provided, we can determine which specific device at
which location is doing what.
_PLD output includes much more fields, but only generic fields are added
and exposed to sysfs, so that non-ACPI devices can also support it in
the future. The minimal generic fields needed for locating a device are
the following.
- panel
- vertical_position
- horizontal_position
- dock
- lid
Signed-off-by: Won Chung <wonchung@google.com>
Link: https://lore.kernel.org/r/20220314195458.271430-1-wonchung@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit a85a6c86c2 ("driver core: platform: Clarify that IRQ 0 is
invalid") only calls WARN() when IRQ0 is about to be returned, however
using IRQ0 is considered invalid (according to Linus) outside the arch/
code where it's used by the i8253 drivers. Many driver subsystems treat
0 specially (e.g. as an indication of the polling mode by libata), so
the users of platform_get_irq[_byname]() in them would have to filter
out IRQ0 explicitly and this (quite obviously) doesn't scale...
Let's finally get this straight and return -EINVAL instead of IRQ0!
Fixes: a85a6c86c2 ("driver core: platform: Clarify that IRQ 0 is invalid")
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/025679e1-1f0a-ae4b-4369-01164f691511@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Compaction sysfs file is created via compaction_register_node in
register_node. But we forgot to remove it in unregister_node. Thus
compaction sysfs file is leaked. Using compaction_unregister_node
to fix this issue.
Fixes: ed4a6d7f06 ("mm: compaction: add /sys trigger for per-node memory compaction")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Link: https://lore.kernel.org/r/20220401070905.43679-1-linmiaohe@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When there are 2 matched drivers for a device using
async probe mechanism, the dev->p->async_driver might
be overridden by the last attached driver.
So just skip the later one if the previous matched driver
was not handled by async thread yet.
Below is my use case which having this problem.
Make both driver mmcblk and mmc_test allow async probe,
the dev->p->async_driver will be overridden by the later driver
mmc_test and bind to the device then claim it for testing.
When it happen, mmcblk will never do probe again.
Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com>
Link: https://lore.kernel.org/r/20220316074328.1801-1-mark-pk.tsai@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The newly introduced helpers for searching for matches in the case of
multiple connections can be resused by the single-connection case, so do
this to save some duplication.
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220422222351.1297276-3-bjorn.andersson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In some cases multiple connections with the same connection id
needs to be resolved from a fwnode graph.
One such example is when separate hardware is used for performing muxing
and/or orientation switching of the SuperSpeed and SBU lines in a USB
Type-C connector. In this case the connector needs to belong to a graph
with multiple matching remote endpoints, and the Type-C controller needs
to be able to resolve them both.
Add a new API that allows this kind of lookup.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220422222351.1297276-2-bjorn.andersson@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add additional sysfs nodes to monitor the transfer of firmware upload data
to the target device:
cancel: Write 1 to cancel the data transfer
error: Display error status for a failed firmware upload
remaining_size: Display the remaining amount of data to be transferred
status: Display the progress of the firmware upload
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-6-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Extend the firmware subsystem to support a persistent sysfs interface that
userspace may use to initiate a firmware update. For example, FPGA based
PCIe cards load firmware and FPGA images from local FLASH when the card
boots. The images in FLASH may be updated with new images provided by the
user at his/her convenience.
A device driver may call firmware_upload_register() to expose persistent
"loading" and "data" sysfs files. These files are used in the same way as
the fallback sysfs "loading" and "data" files. When 0 is written to
"loading" to complete the write of firmware data, the data is transferred
to the lower-level driver using pre-registered call-back functions. The
data transfer is done in the context of a kernel worker thread.
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-5-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In preparation for sharing the "loading" and "data" sysfs nodes with the
new firmware upload support, split out sysfs functionality from fallback.c
and fallback.h into sysfs.c and sysfs.h. This includes the firmware
class driver code that is associated with the sysfs files and the
fw_fallback_config support for the timeout sysfs node.
CONFIG_FW_LOADER_SYSFS is created and is selected by
CONFIG_FW_LOADER_USER_HELPER in order to include sysfs.o in
firmware_class-objs.
This is mostly just a code reorganization. There are a few symbols that
change in scope, and these can be identified by looking at the header
file changes. A few white-space warnings from checkpatch are also
addressed in this patch.
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-4-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Current logic does not consider multi-stride cases,
the max_register have to calculate with reg_stride
because it is a kind of address range.
Signed-off-by: Jeongtae Park <jtp.park@samsung.com>
Link: https://lore.kernel.org/r/20220425114613.15934-1-jtp.park@samsung.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Commit aa63a74d45 ("topology/sysfs: Hide PPIN on systems that do not
support it.") caused a build warning on some configurations:
drivers/base/topology.c: In function 'topology_is_visible':
drivers/base/topology.c:158:24: warning: unused variable 'dev' [-Wunused-variable]
158 | struct device *dev = kobj_to_dev(kobj);
Fix this up by getting rid of the variable entirely.
Fixes: aa63a74d45 ("topology/sysfs: Hide PPIN on systems that do not support it.")
Cc: Tony Luck <tony.luck@intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/r/20220422062653.3899972-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__add_memory_block() calls both put_device() and device_unregister() when
storing the memory block into the xarray. This is incorrect because xarray
doesn't take an additional reference and device_unregister() already calls
put_device().
Triggering the issue looks really unlikely and its only effect should be to
log a spurious warning about a ref counted issue.
Fixes: 4fb6eabf10 ("drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/d44c63d78affe844f020dc02ad6af29abc448fc4.1650611702.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Device drivers may decide to not load firmware when probed to avoid
slowing down the boot process should the firmware filesystem not be
available yet. In this case, the firmware loading request may be done
when a device file associated with the driver is first accessed. The
credentials of the userspace process accessing the device file may be
used to validate access to the firmware files requested by the driver.
Ensure that the kernel assumes the responsibility of reading the
firmware.
This was observed on Android for a graphic driver loading their firmware
when the device file (e.g. /dev/mali0) was first opened by userspace
(i.e. surfaceflinger). The security context of surfaceflinger was used
to validate the access to the firmware file (e.g.
/vendor/firmware/mali.bin).
Because previous configurations were relying on the userspace fallback
mechanism, the security context of the userspace daemon (i.e. ueventd)
was consistently used to read firmware files. More devices are found to
use the command line argument firmware_class.path which gives the kernel
the opportunity to read the firmware directly, hence surfacing this
misattribution.
Signed-off-by: Thiébaud Weksteen <tweek@google.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/r/20220422013215.2301793-1-tweek@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rename fw_sysfs_done() and fw_sysfs_loading() to fw_state_is_done() and
fw_state_is_loading() respectively, and place them along side companion
functions in drivers/base/firmware_loader/firmware.h.
Use the fw_state_is_done() function to exit early from
firmware_loading_store() if the state is already "done". This is being done
in preparation for supporting persistent sysfs nodes to allow userspace to
upload firmware to a device, potentially reusing the sysfs loading and data
files multiple times.
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-3-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fw_free_paged_buf() function resets the paged buffer information in
the fw_priv data structure. Additionally, clear the data and size members
of fw_priv in order to facilitate the reuse of fw_priv. This is being
done in preparation for enabling userspace to initiate multiple firmware
uploads using this sysfs interface.
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Tianfei zhang <tianfei.zhang@intel.com>
Tested-by: Matthew Gerlach <matthew.gerlach@linux.intel.com>
Signed-off-by: Russ Weight <russell.h.weight@intel.com>
Link: https://lore.kernel.org/r/20220421212204.36052-2-russell.h.weight@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Several core drivers and buses expect that driver_override is a
dynamically allocated memory thus later they can kfree() it.
However such assumption is not documented, there were in the past and
there are already users setting it to a string literal. This leads to
kfree() of static memory during device release (e.g. in error paths or
during unbind):
kernel BUG at ../mm/slub.c:3960!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
...
(kfree) from [<c058da50>] (platform_device_release+0x88/0xb4)
(platform_device_release) from [<c0585be0>] (device_release+0x2c/0x90)
(device_release) from [<c0a69050>] (kobject_put+0xec/0x20c)
(kobject_put) from [<c0f2f120>] (exynos5_clk_probe+0x154/0x18c)
(exynos5_clk_probe) from [<c058de70>] (platform_drv_probe+0x6c/0xa4)
(platform_drv_probe) from [<c058b7ac>] (really_probe+0x280/0x414)
(really_probe) from [<c058baf4>] (driver_probe_device+0x78/0x1c4)
(driver_probe_device) from [<c0589854>] (bus_for_each_drv+0x74/0xb8)
(bus_for_each_drv) from [<c058b48c>] (__device_attach+0xd4/0x16c)
(__device_attach) from [<c058a638>] (bus_probe_device+0x88/0x90)
(bus_probe_device) from [<c05871fc>] (device_add+0x3dc/0x62c)
(device_add) from [<c075ff10>] (of_platform_device_create_pdata+0x94/0xbc)
(of_platform_device_create_pdata) from [<c07600ec>] (of_platform_bus_create+0x1a8/0x4fc)
(of_platform_bus_create) from [<c0760150>] (of_platform_bus_create+0x20c/0x4fc)
(of_platform_bus_create) from [<c07605f0>] (of_platform_populate+0x84/0x118)
(of_platform_populate) from [<c0f3c964>] (of_platform_default_populate_init+0xa0/0xb8)
(of_platform_default_populate_init) from [<c01031f8>] (do_one_initcall+0x8c/0x404)
Provide a helper which clearly documents the usage of driver_override.
This will allow later to reuse the helper and reduce the amount of
duplicated code.
Convert the platform driver to use a new helper and make the
driver_override field const char (it is not modified by the core).
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220419113435.246203-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To move towards a more consistent behaviour between genpd and the runtime
PM core, let's start by converting genpd's time-accounting from ktime_get()
into ktime_get_mono_fast_ns().
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
As the growing demand on ZSTD compressions, there have been requests
for the support of ZSTD-compressed firmware files, so here it is:
this patch extends the firmware loader code to allow loading ZSTD
files. The implementation is fairly straightforward, it just adds a
ZSTD decompression routine for the file expander. (And the code is
even simpler than XZ thanks to the ZSTD API that gives the original
decompressed size from the header.)
Link: https://lore.kernel.org/all/20210127154939.13288-1-tiwai@suse.de/
Tested-by: Piotr Gorski <lucjan.lucjanov@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-2-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When ACPI is not enabled, cpuid_topo->llc_id = cpu_topo->llc_id = -1, which
will set llc_sibling 0xff(...), this is misleading.
Don't set llc_sibling(default 0) if we don't know the cache topology.
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Wang Qing <wangqing@vivo.com>
Fixes: 37c3ec2d81 ("arm64: topology: divorce MC scheduling domain from core_siblings")
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/1649644580-54626-1-git-send-email-wangqing@vivo.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ampere Altra defines CPU clusters in the ACPI PPTT. They share a Snoop
Control Unit, but have no shared CPU-side last level cache.
cpu_coregroup_mask() will return a cpumask with weight 1, while
cpu_clustergroup_mask() will return a cpumask with weight 2.
As a result, build_sched_domain() will BUG() once per CPU with:
BUG: arch topology borken
the CLS domain not a subset of the MC domain
The MC level cpumask is then extended to that of the CLS child, and is
later removed entirely as redundant. This sched domain topology is an
improvement over previous topologies, or those built without
SCHED_CLUSTER, particularly for certain latency sensitive workloads.
With the current scheduler model and heuristics, this is a desirable
default topology for Ampere Altra and Altra Max system.
Rather than create a custom sched domains topology structure and
introduce new logic in arch/arm64 to detect these systems, update the
core_mask so coregroup is never a subset of clustergroup, extending it
to cluster_siblings if necessary. Only do this if CONFIG_SCHED_CLUSTER
is enabled to avoid also changing the topology (MC) when
CONFIG_SCHED_CLUSTER is disabled.
This has the added benefit over a custom topology of working for both
symmetric and asymmetric topologies. It does not address systems where
the CLUSTER topology is above a populated MC topology, but these are not
considered today and can be addressed separately if and when they
appear.
The final sched domain topology for a 2 socket Ampere Altra system is
unchanged with or without CONFIG_SCHED_CLUSTER, and the BUG is avoided:
For CPU0:
CONFIG_SCHED_CLUSTER=y
CLS [0-1]
DIE [0-79]
NUMA [0-159]
CONFIG_SCHED_CLUSTER is not set
DIE [0-79]
NUMA [0-159]
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: D. Scott Phillips <scott@os.amperecomputing.com>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: <stable@vger.kernel.org> # 5.16.x
Suggested-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Darren Hart <darren@os.amperecomputing.com>
Link: https://lore.kernel.org/r/c8fe9fce7c86ed56b4c455b8c902982dc2303868.1649696956.git.darren@os.amperecomputing.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Systems that do not support a Protected Processor Identification Number
currently report:
# cat /sys/devices/system/cpu/cpu0/topology/ppin
0x0
which is confusing/wrong.
Add a ".is_visible" function to suppress inclusion of the ppin file.
Fixes: ab28e94419 ("topology/sysfs: Add PPIN in sysfs under cpu topology")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20220406220150.63855-1-tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The only two users of __pm_runtime_set_status() are pm_runtime_set_active()
and pm_runtime_set_suspended(). These are widely used and should be called
from non-atomic context to work as expected. However, it would be
convenient to allow them be called from atomic context too, as shown from a
subsequent change, so let's add support for this.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Maulik Shah <quic_mkshah@quicinc.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The part 'is' in the function name implies the test against something.
Drop unnecessary 'test' prefix in the fwnode_is_ancestor_of() parameters.
No functional change intended.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In a few cases the functionality of fwnode_for_each_parent_node()
is already in use. Introduce a common helper macro for it.
It may be used by others as well in the future.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Some of the fwnode APIs might return an error pointer instead of NULL
or valid fwnode handle. The result of such API call may be considered
optional and hence the test for it is usually done in a form of
fwnode = fwnode_find_reference(...);
if (IS_ERR(fwnode))
...error handling...
Nevertheless the resulting fwnode may have bumped the reference count
and hence caller of the above API is obliged to call fwnode_handle_put().
Since fwnode may be not valid either as NULL or error pointer the check
has to be performed there. This approach uglifies the code and adds
a point of making a mistake, i.e. forgetting about error point case.
To prevent this, allow an error pointer to be passed to the fwnode APIs.
Fixes: 83b34afb6b ("device property: Introduce fwnode_find_reference()")
Reported-by: Nuno Sá <nuno.sa@analog.com>
Tested-by: Nuno Sá <nuno.sa@analog.com>
Acked-by: Nuno Sá <nuno.sa@analog.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A PM-runtime device usage count underflow is potentially critical,
because it may cause a device to be suspended when it is expected to
be operational. It is also a programming problem that would be good
to catch and warn about.
For this reason, (1) make rpm_check_suspend_allowed() return an error
when the device usage count is negative to prevent devices from being
suspended in that case, (2) introduce rpm_drop_usage_count() that will
detect device usage count underflows, warn about them and fix them up,
and (3) use it to drop the usage count in a few places instead of
atomic_dec_and_test().
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Mention all domain attach menthods which dev_pm_domain_detach()
reverses.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When a driver for an interrupt controller is missing, of_irq_get()
returns -EPROBE_DEFER ad infinitum, causing
fwnode_mdiobus_phy_device_register(), and ultimately, the entire
of_mdiobus_register() call, to fail. In turn, any phy_connect() call
towards a PHY on this MDIO bus will also fail.
This is not what is expected to happen, because the PHY library falls
back to poll mode when of_irq_get() returns a hard error code, and the
MDIO bus, PHY and attached Ethernet controller work fine, albeit
suboptimally, when the PHY library polls for link status. However,
-EPROBE_DEFER has special handling given the assumption that at some
point probe deferral will stop, and the driver for the supplier will
kick in and create the IRQ domain.
Reasons for which the interrupt controller may be missing:
- It is not yet written. This may happen if a more recent DT blob (with
an interrupt-parent for the PHY) is used to boot an old kernel where
the driver didn't exist, and that kernel worked with the
vintage-correct DT blob using poll mode.
- It is compiled out. Behavior is the same as above.
- It is compiled as a module. The kernel will wait for a number of
seconds specified in the "deferred_probe_timeout" boot parameter for
user space to load the required module. The current default is 0,
which times out at the end of initcalls. It is possible that this
might cause regressions unless users adjust this boot parameter.
The proposed solution is to use the driver_deferred_probe_check_state()
helper function provided by the driver core, which gives up after some
-EPROBE_DEFER attempts, taking "deferred_probe_timeout" into consideration.
The return code is changed from -EPROBE_DEFER into -ENODEV or
-ETIMEDOUT, depending on whether the kernel is compiled with support for
modules or not.
Fixes: 66bdede495 ("of_mdio: Fix broken PHY IRQ in case of probe deferral")
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220407165538.4084809-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add irq_get() fwnode operation to implement fwnode_irq_get() through
fwnode operations, moving the code in fwnode_irq_get() to OF and ACPI
frameworks.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add iomap() fwnode operation to implement fwnode_iomap() through fwnode
operations, moving the code in fwnode_iomap() to OF framework.
Note that the IS_ENABLED(CONFIG_OF_ADDRESS) && is_of_node(fwnode) check is
needed for Sparc that has its own implementation of of_iomap anyway. Let
the pre-compiler to handle that check.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Make the device_dma_supported and device_get_dma_attr functions to use the
fwnode ops, and move the implementation to ACPI and OF frameworks.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The only usage of these is to pass their address to __regmap_init() or
__devm_regmap_init(), both which takes pointers to const struct
regmap_bus. Make them const to allow the compiler to put them in
read-only memory.
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Link: https://lore.kernel.org/r/20220330214110.36337-1-rikard.falkeborn@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This is based on new i2c material for 5.18-rc1 and simply reorganizes
the code on top of it so as to group similar functions together (Andy
Shevchenko).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmI4pkcSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxF0wQAIbgtC27ABvKjl+FptW2ooo8t/1o5Mtq
LvQ274WtBY0pUo7D9S1ZpAPHoGCM++mZYK+RzL5tpOl2J5t3zv4qaWk2vKILtI0I
5aEQm78uJxnHunG9T8LcmQctk5G6sjjA775sEIGf2NC5ovGa8LmtH/24+roOwM30
y0bP0+2/iTUjnfqUbIYFs8xst2i+2+8euHxahRIqmzkkdJXzUa3q8/qxWDz5nl37
2laGI+ril5p8SIpYTReFB8Vzpd07Lfr+HcOjeKdk3dJw4bwmxv5VFUFsTFRP7/0v
xFEnckVNG8gSt3LATj9MAL2ettAief1+IgeJubtNIWMlmai07J5ujZXRoRc4BbD8
sJauw/BU79C6KDt2CBTbABhHXDfc5dw4ZjKqLSYLTKa2Fw9njhTGs41zn+tAj0vY
ubycJxyOlzHeYoqHmZQEF28AHqiMsFS28xIWoVTJfThQOBJCOYLaohWhxcS31C/Z
4BJQIrQCX97oroLIvNOuW/Wco0RdHGXAN1f0XeWfrEcRyVJQ/XJOsR2gVDNVflpD
ya3je+R6mH+J95lRrtqgxl0AUyzVREQhr6EMw3nYfacdj3rzvy1PDjzSZsAdOsrd
6asNc/y84sDqOh+4kmH1wWUqK/Ooq0j1YFqPnyAfdkUD8D0RZvkdz5ior3knU9lE
O6CDMLTCygi6
=voeP
-----END PGP SIGNATURE-----
Merge tag 'devprop-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties code update from Rafael Wysocki:
"This is based on new i2c material for 5.18-rc1 and simply reorganizes
the code on top of it so as to group similar functions together (Andy
Shevchenko)"
* tag 'devprop-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
device property: Don't split fwnode_get_irq*() APIs in the code
Here is the set of driver core changes for 5.18-rc1.
Not much here, primarily it was a bunch of cleanups and small updates:
- kobj_type cleanups for default_groups
- documentation updates
- firmware loader minor changes
- component common helper added and take advantage of it in many
drivers (the largest part of this pull request).
There will be a merge conflict in drivers/power/supply/ab8500_chargalg.c
with your tree, the merge conflict should be easy (take all the
changes).
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYkG6PA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylMFwCfSIyAU4oLEgj+/Rfmx4o45cAVIWMAnit3zbdU
wUUCGqKcOnTJEcW6dMPh
=1VVi
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of driver core changes for 5.18-rc1.
Not much here, primarily it was a bunch of cleanups and small updates:
- kobj_type cleanups for default_groups
- documentation updates
- firmware loader minor changes
- component common helper added and take advantage of it in many
drivers (the largest part of this pull request).
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (54 commits)
Documentation: update stable review cycle documentation
drivers/base/dd.c : Remove the initial value of the global variable
Documentation: update stable tree link
Documentation: add link to stable release candidate tree
devres: fix typos in comments
Documentation: add note block surrounding security patch note
samples/kobject: Use sysfs_emit instead of sprintf
base: soc: Make soc_device_match() simpler and easier to read
driver core: dd: fix return value of __setup handler
driver core: Refactor sysfs and drv/bus remove hooks
driver core: Refactor multiple copies of device cleanup
scripts: get_abi.pl: Fix typo in help message
kernfs: fix typos in comments
kernfs: remove unneeded #if 0 guard
ALSA: hda/realtek: Make use of the helper component_compare_dev_name
video: omapfb: dss: Make use of the helper component_compare_dev
power: supply: ab8500: Make use of the helper component_compare_dev
ASoC: codecs: wcd938x: Make use of the helper component_compare/release_of
iommu/mediatek: Make use of the helper component_compare/release_of
drm: of: Make use of the helper component_release_of
...
Pull i2c updates from Wolfram Sang:
- tracepoints when Linux acts as an I2C client
- added support for AMD PSP
- whole subsystem now uses generic_handle_irq_safe()
- piix4 driver gained MMIO access enabling so far missed controllers
with AMD chipsets
- a bulk of device driver updates, refactorization, and fixes.
* 'i2c/for-mergewindow' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (61 commits)
i2c: mux: demux-pinctrl: do not deactivate a master that is not active
i2c: meson: Fix wrong speed use from probe
i2c: add tracepoints for I2C slave events
i2c: designware: Remove code duplication
i2c: cros-ec-tunnel: Fix syntax errors in comments
MAINTAINERS: adjust XLP9XX I2C DRIVER after removing the devicetree binding
i2c: designware: Mark dw_i2c_plat_{suspend,resume}() as __maybe_unused
i2c: mediatek: Add i2c compatible for Mediatek MT8168
dt-bindings: i2c: update bindings for MT8168 SoC
i2c: mt65xx: Simplify with clk-bulk
i2c: i801: Drop two outdated comments
i2c: xiic: Make bus names unique
i2c: i801: Add support for the Process Call command
i2c: i801: Drop useless masking in i801_access
i2c: tegra: Add SMBus block read function
i2c: designware: Use the i2c_mark_adapter_suspended/resumed() helpers
i2c: designware: Lock the adapter while setting the suspended flag
i2c: mediatek: remove redundant null check
i2c: mediatek: modify bus speed calculation formula
i2c: designware: Fix improper usage of readl
...
Let's make it clearer at which places we actually add and remove memory
blocks -- streamlining the terminology -- and highlight which memory block
start out online and which start out as offline.
* rename add_memory_block -> add_boot_memory_block
* rename init_memory_block -> add_memory_block
* rename unregister_memory -> remove_memory_block
* rename register_memory -> __add_memory_block
* add add_hotplug_memory_block
* mark add_boot_memory_block with __init (suggested by Oscar)
__add_memory_block() is a pure helper for add_memory_block(), remove
the somewhat obvious comment.
Link: https://lkml.kernel.org/r/20220221154531.11382-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
test_pages_in_a_zone() is just another nasty PFN walker that can easily
stumble over ZONE_DEVICE memory ranges falling into the same memory block
as ordinary system RAM: the memmap of parts of these ranges might possibly
be uninitialized. In fact, we observed (on an older kernel) with UBSAN:
UBSAN: Undefined behaviour in ./include/linux/mm.h:1133:50
index 7 is out of range for type 'zone [5]'
CPU: 121 PID: 35603 Comm: read_all Kdump: loaded Tainted: [...]
Hardware name: Dell Inc. PowerEdge R7425/08V001, BIOS 1.12.2 11/15/2019
Call Trace:
dump_stack+0x9a/0xf0
ubsan_epilogue+0x9/0x7a
__ubsan_handle_out_of_bounds+0x13a/0x181
test_pages_in_a_zone+0x3c4/0x500
show_valid_zones+0x1fa/0x380
dev_attr_show+0x43/0xb0
sysfs_kf_seq_show+0x1c5/0x440
seq_read+0x49d/0x1190
vfs_read+0xff/0x300
ksys_read+0xb8/0x170
do_syscall_64+0xa5/0x4b0
entry_SYSCALL_64_after_hwframe+0x6a/0xdf
RIP: 0033:0x7f01f4439b52
We seem to stumble over a memmap that contains a garbage zone id. While
we could try inserting pfn_to_online_page() calls, it will just make
memory offlining slower, because we use test_pages_in_a_zone() to make
sure we're offlining pages that all belong to the same zone.
Let's just get rid of this PFN walker and determine the single zone of a
memory block -- if any -- for early memory blocks during boot. For memory
onlining, we know the single zone already. Let's avoid any additional
memmap scanning and just rely on the zone information available during
boot.
For memory hot(un)plug, we only really care about memory blocks that:
* span a single zone (and, thereby, a single node)
* are completely System RAM (IOW, no holes, no ZONE_DEVICE)
If one of these conditions is not met, we reject memory offlining.
Hotplugged memory blocks (starting out offline), always meet both
conditions.
There are three scenarios to handle:
(1) Memory hot(un)plug
A memory block with zone == NULL cannot be offlined, corresponding to
our previous test_pages_in_a_zone() check.
After successful memory onlining/offlining, we simply set the zone
accordingly.
* Memory onlining: set the zone we just used for onlining
* Memory offlining: set zone = NULL
So a hotplugged memory block starts with zone = NULL. Once memory
onlining is done, we set the proper zone.
(2) Boot memory with !CONFIG_NUMA
We know that there is just a single pgdat, so we simply scan all zones
of that pgdat for an intersection with our memory block PFN range when
adding the memory block. If more than one zone intersects (e.g., DMA and
DMA32 on x86 for the first memory block) we set zone = NULL and
consequently mimic what test_pages_in_a_zone() used to do.
(3) Boot memory with CONFIG_NUMA
At the point in time we create the memory block devices during boot, we
don't know yet which nodes *actually* span a memory block. While we could
scan all zones of all nodes for intersections, overlapping nodes complicate
the situation and scanning all nodes is possibly expensive. But that
problem has already been solved by the code that sets the node of a memory
block and creates the link in the sysfs --
do_register_memory_block_under_node().
So, we hook into the code that sets the node id for a memory block. If
we already have a different node id set for the memory block, we know
that multiple nodes *actually* have PFNs falling into our memory block:
we set zone = NULL and consequently mimic what test_pages_in_a_zone() used
to do. If there is no node id set, we do the same as (2) for the given
node.
Note that the call order in driver_init() is:
-> memory_dev_init(): create memory block devices
-> node_dev_init(): link memory block devices to the node and set the
node id
So in summary, we detect if there is a single zone responsible for this
memory block and we consequently store the zone in that case in the
memory block, updating it during memory onlining/offlining.
Link: https://lkml.kernel.org/r/20220210184359.235565-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Rafael Parra <rparrazo@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Rafael Parra <rparrazo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "drivers/base/memory: determine and store zone for single-zone memory blocks", v2.
I remember talking to Michal in the past about removing
test_pages_in_a_zone(), which we use for:
* verifying that a memory block we intend to offline is really only managed
by a single zone. We don't support offlining of memory blocks that are
managed by multiple zones (e.g., multiple nodes, DMA and DMA32)
* exposing that zone to user space via
/sys/devices/system/memory/memory*/valid_zones
Now that I identified some more cases where test_pages_in_a_zone() might
go wrong, and we received an UBSAN report (see patch #3), let's get rid of
this PFN walker.
So instead of detecting the zone at runtime with test_pages_in_a_zone() by
scanning the memmap, let's determine and remember for each memory block if
it's managed by a single zone. The stored zone can then be used for the
above two cases, avoiding a manual lookup using test_pages_in_a_zone().
This avoids eventually stumbling over uninitialized memmaps in corner
cases, especially when ZONE_DEVICE ranges partly fall into memory block
(that are responsible for managing System RAM).
Handling memory onlining is easy, because we online to exactly one zone.
Handling boot memory is more tricky, because we want to avoid scanning all
zones of all nodes to detect possible zones that overlap with the physical
memory region of interest. Fortunately, we already have code that
determines the applicable nodes for a memory block, to create sysfs links
-- we'll hook into that.
Patch #1 is a simple cleanup I had laying around for a longer time.
Patch #2 contains the main logic to remove test_pages_in_a_zone() and
further details.
[1] https://lkml.kernel.org/r/20220128144540.153902-1-david@redhat.com
[2] https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com
This patch (of 2):
Let's adjust the stale terminology, making it match
unregister_memory_block_under_nodes() and
do_register_memory_block_under_node(). We're dealing with memory block
devices, which span 1..X memory sections.
Link: https://lkml.kernel.org/r/20220210184359.235565-1-david@redhat.com
Link: https://lkml.kernel.org/r/20220210184359.235565-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rafael Parra <rparrazo@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
... and call node_dev_init() after memory_dev_init() from driver_init(),
so before any of the existing arch/subsys calls. All online nodes should
be known at that point: early during boot, arch code determines node and
zone ranges and sets the relevant nodes online; usually this happens in
setup_arch().
This is in line with memory_dev_init(), which initializes the memory
device subsystem and creates all memory block devices.
Similar to memory_dev_init(), panic() if anything goes wrong, we don't
want to continue with such basic initialization errors.
The important part is that node_dev_init() gets called after
memory_dev_init() and after cpu_dev_init(), but before any of the relevant
archs call register_cpu() to register the new cpu device under the node
device. The latter should be the case for the current users of
topology_init().
Link: https://lkml.kernel.org/r/20220203105212.30385-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Tested-by: Anatoly Pugachev <matorola@gmail.com> (sparc64)
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If register_memory() fails, we freed the memory block but already added
the memory block to the group list, not good. Let's defer adding the
block to the memory group to after registering the memory block device.
We do handle it properly during unregister_memory(), but that's not
called when the registration fails.
Link: https://lkml.kernel.org/r/20220128144540.153902-1-david@redhat.com
Fixes: 028fc57a1c ("drivers/base/memory: introduce "memory groups" to logically group memory blocks")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the hwpoison page meets the filter conditions, it should not be
regarded as successful memory_failure() processing for mce handler, but
should return a distinct value, otherwise mce handler regards the error
page has been identified and isolated, which may lead to calling
set_mce_nospec() to change page attribute, etc.
Here memory_failure() return -EOPNOTSUPP to indicate that the error
event is filtered, mce handler should not take any action for this
situation and hwpoison injector should treat as correct.
Link: https://lkml.kernel.org/r/20220223082135.2769649-1-luofei@unicloud.com
Signed-off-by: luofei <luofei@unicloud.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Cleanups for SCHED_DEADLINE
- Tracing updates/fixes
- CPU Accounting fixes
- First wave of changes to optimize the overhead of the scheduler build,
from the fast-headers tree - including placeholder *_api.h headers for
later header split-ups.
- Preempt-dynamic using static_branch() for ARM64
- Isolation housekeeping mask rework; preperatory for further changes
- NUMA-balancing: deal with CPU-less nodes
- NUMA-balancing: tune systems that have multiple LLC cache domains per node (eg. AMD)
- Updates to RSEQ UAPI in preparation for glibc usage
- Lots of RSEQ/selftests, for same
- Add Suren as PSI co-maintainer
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmI5rg8RHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1hGrw/+M3QOk6fH7G48wjlNnBvcOife6ls+Ni4k
ixOAcF4JKoixO8HieU5vv0A7yf/83tAa6fpeXeMf1hkCGc0NSlmLtuIux+WOmoAL
LzCyDEYfiP8KnVh0A1Tui/lK0+AkGo21O6ADhQE2gh8o2LpslOHQMzvtyekSzeeb
mVxMYQN+QH0m518xdO2D8IQv9ctOYK0eGjmkqdNfntOlytypPZHeNel/tCzwklP/
dElJUjNiSKDlUgTBPtL3DfpoLOI/0mHF2p6NEXvNyULxSOqJTu8pv9Z2ADb2kKo1
0D56iXBDngMi9MHIJLgvzsA8gKzHLFSuPbpODDqkTZCa28vaMB9NYGhJ643NtEie
IXTJEvF1rmNkcLcZlZxo0yjL0fjvPkczjw4Vj27gbrUQeEBfb4mfuI4BRmij63Ep
qEkgQTJhduCqqrQP1rVyhwWZRk1JNcVug+F6N42qWW3fg1xhj0YSrLai2c9nPez6
3Zt98H8YGS1Z/JQomSw48iGXVqfTp/ETI7uU7jqHK8QcjzQ4lFK5H4GZpwuqGBZi
NJJ1l97XMEas+rPHiwMEN7Z1DVhzJLCp8omEj12QU+tGLofxxwAuuOVat3CQWLRk
f80Oya3TLEgd22hGIKDRmHa22vdWnNQyS0S15wJotawBzQf+n3auS9Q3/rh979+t
ES/qvlGxTIs=
=Z8uT
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
- Cleanups for SCHED_DEADLINE
- Tracing updates/fixes
- CPU Accounting fixes
- First wave of changes to optimize the overhead of the scheduler
build, from the fast-headers tree - including placeholder *_api.h
headers for later header split-ups.
- Preempt-dynamic using static_branch() for ARM64
- Isolation housekeeping mask rework; preperatory for further changes
- NUMA-balancing: deal with CPU-less nodes
- NUMA-balancing: tune systems that have multiple LLC cache domains per
node (eg. AMD)
- Updates to RSEQ UAPI in preparation for glibc usage
- Lots of RSEQ/selftests, for same
- Add Suren as PSI co-maintainer
* tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
sched/headers: ARM needs asm/paravirt_api_clock.h too
sched/numa: Fix boot crash on arm64 systems
headers/prep: Fix header to build standalone: <linux/psi.h>
sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y
cgroup: Fix suspicious rcu_dereference_check() usage warning
sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers
sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains
sched/deadline,rt: Remove unused parameter from pick_next_[rt|dl]_entity()
sched/deadline,rt: Remove unused functions for !CONFIG_SMP
sched/deadline: Use __node_2_[pdl|dle]() and rb_first_cached() consistently
sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()
sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file
sched/deadline: Remove unused def_dl_bandwidth
sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE
sched/tracing: Don't re-read p->state when emitting sched_switch event
sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
sched/cpuacct: Remove redundant RCU read lock
sched/cpuacct: Optimize away RCU read lock
sched/cpuacct: Fix charge percpu cpuusage
sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies
...
A couple of small fixes, plus some new features that enable us to handle
devices that reformat register addresses depending on the bus used to
handle the control interface more gracefully.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmI4aVQACgkQJNaLcl1U
h9DH7Qf/UfcjSp9MbQ7vZ+uuSQFjcRSSKri0VzFSETAaXBSyDOF43L/3dk60xDo5
+4Om4LX3LZvA387I3bctPaij8fNp+X1bYr2pWUbh/FnJ+fS7PUfzXbQXkh/tKsMP
qhYCcrWVBnGTj0OZAuazYW79V1LolBGC5+YjDRipbcwZhqNVWZKuWVYUhLOduZ5G
5n5PxUyPnDEYF6ixJQq2QiCA6Pb/BSPgviaEHVxeLWxs8IOtq+9u69ijecfDYiwD
vMpFlOnPoy+Bpu7gnU2LhDZ/Nmxvi22BtRxt4qtmC3lTq9Cf/Xrtm2En3aaRsero
SsLc0ZaHkt6W9V5O0yFNrjWLslTAJg==
=3oe7
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A couple of small fixes, plus some new features that enable us to
handle devices that reformat register addresses depending on the bus
used to handle the control interface more gracefully"
* tag 'regmap-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: allow a defined reg_base to be added to every address
regmap: add configurable downshift for addresses
regmap: irq: cleanup comments
regmap-irq: Fix typo in comment
- Allow device_pm_check_callbacks() to be called from interrupt
context without issues (Dmitry Baryshkov).
- Modify devm_pm_runtime_enable() to automatically handle
pm_runtime_dont_use_autosuspend() at driver exit time (Douglas
Anderson).
- Make the schedutil cpufreq governor use to_gov_attr_set() instead
of open coding it (Kevin Hao).
- Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
cpufreq longhaul driver (Rafael Wysocki).
- Unify show() and store() naming in cpufreq and make it use
__ATTR_XX (Lianjie Zhang).
- Make the intel_pstate driver use the EPP value set by the firmware
by default (Srinivas Pandruvada).
- Re-order the init checks in the powernow-k8 cpufreq driver (Mario
Limonciello).
- Make the ACPI processor idle driver check for architectural
support for LPI to avoid using it on x86 by mistake (Mario
Limonciello).
- Add Sapphire Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add 'preferred_cstates' module argument to the intel_idle driver
to work around C1 and C1E handling issue on Sapphire Rapids (Artem
Bityutskiy).
- Add core C6 optimization on Sapphire Rapids to the intel_idle
driver (Artem Bityutskiy).
- Optimize the haltpoll cpuidle driver a bit (Li RongQing).
- Remove leftover text from intel_idle() kerneldoc comment and fix
up white space in intel_idle (Rafael Wysocki).
- Fix load_image_and_restore() error path (Ye Bin).
- Fix typos in comments in the system wakeup hadling code (Tom Rix).
- Clean up non-kernel-doc comments in hibernation code (Jiapeng
Chong).
- Fix __setup handler error handling in system-wide suspend and
hibernation core code (Randy Dunlap).
- Add device name to suspend_report_result() (Youngjin Jang).
- Make virtual guests honour ACPI S4 hardware signature by
default (David Woodhouse).
- Block power off of a parent PM domain unless child is in deepest
state (Ulf Hansson).
- Use dev_err_probe() to simplify error handling for generic PM
domains (Ahmad Fatoum).
- Fix sleep-in-atomic bug caused by genpd_debug_remove() (Shawn Guo).
- Document Intel uncore frequency scaling (Srinivas Pandruvada).
- Add DTPM hierarchy description (Daniel Lezcano).
- Change the locking scheme in DTPM (Daniel Lezcano).
- Fix dtpm_cpu cleanup at exit time and missing virtual DTPM pointer
release (Daniel Lezcano).
- Make dtpm_node_callback[] static (kernel test robot).
- Fix spelling mistake "initialze" -> "initialize" in
dtpm_create_hierarchy() (Colin Ian King).
- Add tracer tool for the amd-pstate driver (Jinzhou Su).
- Fix PC6 displaying in turbostat on some systems (Artem Bityutskiy).
- Add AMD P-State support to the cpupower utility (Huang Rui).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmI4pM4SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxh5wQAJEz3u55wIHzeov30obtXaD3SxxnvRzR
p96gRcmNoR2so/Q9D+h+JHZKQkVklbnbqExMXQn1qarceAUN7KPjVMRvagjZsC/f
J3LtQmx96yqGTCzOTu5n+Ol2ojKLMCMo++no/2873BYhd60TV6oQxRzkNiZx215n
tT6MKY5ZMX448VKWAWh9vt5rdvbBj9z6cfvpchK/3bziE21lfLz/1iXeFnwqjPGU
XuA7NYbVAHOfsdHZk19+4qAgm8EYkmjd4/J8HDlb7XouyLuUGy8KJZYhSrJKiQ1C
f9f2Zw0925/YpBmFXOwxuYWP9KjFKlq7Cdr3SSgVGDOvgyRtpeV4fU8Y6WPFCtEV
fQdKr9/4KQP6hwUpxJZucSf49wcnyh7hFDMxrwVVcL96yXZef1OqG3ITihJY/n4J
+wDnpR2VqBeiG5NyECjk3mPROZGFfUlHRsqMd3JOswMpGF5phpEI9nNFcayB262S
Rkgcb3MacFVsuo/ZBdzCUTZ6ECvjxZn4FGZPxumkp65SJO18gOPbqs8qfGCZ3Tgb
GDy0CWEOv/KuGnks1CkBGok2Z4q8s2GcZmaOp9BiPjxKJD71i4uPtiGA/5Ahb6cm
Cu0G7Ub/t2Vc93E7mnTE4hh2IuiAN73yB5teM4YNllHw6f+aqVGlvJktIMpShajo
eEBNFlkwljyz
=WlR9
-----END PGP SIGNATURE-----
Merge tag 'pm-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These are mostly fixes and cleanups all over the code and a new piece
of documentation for Intel uncore frequency scaling.
Functionality-wise, the intel_idle driver will support Sapphire Rapids
Xeons natively now (with some extra facilities for controlling
C-states more precisely on those systems), virtual guests will take
the ACPI S4 hardware signature into account by default, the
intel_pstate driver will take the defualt EPP value from the firmware,
cpupower utility will support the AMD P-state driver added in the
previous cycle, and there is a new tracer utility for that driver.
Specifics:
- Allow device_pm_check_callbacks() to be called from interrupt
context without issues (Dmitry Baryshkov).
- Modify devm_pm_runtime_enable() to automatically handle
pm_runtime_dont_use_autosuspend() at driver exit time (Douglas
Anderson).
- Make the schedutil cpufreq governor use to_gov_attr_set() instead
of open coding it (Kevin Hao).
- Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
cpufreq longhaul driver (Rafael Wysocki).
- Unify show() and store() naming in cpufreq and make it use
__ATTR_XX (Lianjie Zhang).
- Make the intel_pstate driver use the EPP value set by the firmware
by default (Srinivas Pandruvada).
- Re-order the init checks in the powernow-k8 cpufreq driver (Mario
Limonciello).
- Make the ACPI processor idle driver check for architectural support
for LPI to avoid using it on x86 by mistake (Mario Limonciello).
- Add Sapphire Rapids Xeon support to the intel_idle driver (Artem
Bityutskiy).
- Add 'preferred_cstates' module argument to the intel_idle driver to
work around C1 and C1E handling issue on Sapphire Rapids (Artem
Bityutskiy).
- Add core C6 optimization on Sapphire Rapids to the intel_idle
driver (Artem Bityutskiy).
- Optimize the haltpoll cpuidle driver a bit (Li RongQing).
- Remove leftover text from intel_idle() kerneldoc comment and fix up
white space in intel_idle (Rafael Wysocki).
- Fix load_image_and_restore() error path (Ye Bin).
- Fix typos in comments in the system wakeup hadling code (Tom Rix).
- Clean up non-kernel-doc comments in hibernation code (Jiapeng
Chong).
- Fix __setup handler error handling in system-wide suspend and
hibernation core code (Randy Dunlap).
- Add device name to suspend_report_result() (Youngjin Jang).
- Make virtual guests honour ACPI S4 hardware signature by default
(David Woodhouse).
- Block power off of a parent PM domain unless child is in deepest
state (Ulf Hansson).
- Use dev_err_probe() to simplify error handling for generic PM
domains (Ahmad Fatoum).
- Fix sleep-in-atomic bug caused by genpd_debug_remove() (Shawn Guo).
- Document Intel uncore frequency scaling (Srinivas Pandruvada).
- Add DTPM hierarchy description (Daniel Lezcano).
- Change the locking scheme in DTPM (Daniel Lezcano).
- Fix dtpm_cpu cleanup at exit time and missing virtual DTPM pointer
release (Daniel Lezcano).
- Make dtpm_node_callback[] static (kernel test robot).
- Fix spelling mistake "initialze" -> "initialize" in
dtpm_create_hierarchy() (Colin Ian King).
- Add tracer tool for the amd-pstate driver (Jinzhou Su).
- Fix PC6 displaying in turbostat on some systems (Artem Bityutskiy).
- Add AMD P-State support to the cpupower utility (Huang Rui)"
* tag 'pm-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (58 commits)
cpufreq: powernow-k8: Re-order the init checks
cpuidle: intel_idle: Drop redundant backslash at line end
cpuidle: intel_idle: Update intel_idle() kerneldoc comment
PM: hibernate: Honour ACPI hardware signature by default for virtual guests
cpufreq: intel_pstate: Use firmware default EPP
cpufreq: unify show() and store() naming and use __ATTR_XX
PM: core: keep irq flags in device_pm_check_callbacks()
cpuidle: haltpoll: Call cpuidle_poll_state_init() later
Documentation: amd-pstate: add tracer tool introduction
tools/power/x86/amd_pstate_tracer: Add tracer tool for AMD P-state
tools/power/x86/intel_pstate_tracer: make tracer as a module
cpufreq: amd-pstate: Add more tracepoint for AMD P-State module
PM: sleep: Add device name to suspend_report_result()
turbostat: fix PC6 displaying on some systems
intel_idle: add core C6 optimization for SPR
intel_idle: add 'preferred_cstates' module argument
intel_idle: add SPR support
PM: runtime: Have devm_pm_runtime_enable() handle pm_runtime_dont_use_autosuspend()
ACPI: processor idle: Check for architectural support for LPI
cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function
...
- Use uintptr_t and offsetof() in the ACPICA code to avoid compiler
warnings regarding NULL pointer arithmetic (Rafael Wysocki).
- Fix possible NULL pointer dereference in acpi_ns_walk_namespace()
when passed "acpi=off" in the command line (Rafael Wysocki).
- Fix and clean up acpi_os_read/write_port() (Rafael Wysocki).
- Introduce acpi_bus_for_each_dev() and use it for walking all ACPI
device objects in the Type C code (Rafael Wysocki).
- Fix the _OSC platform capabilities negotioation and prevent CPPC
from being used if the platform firmware indicates that it not
supported via _OSC (Rafael Wysocki).
- Use ida_alloc() instead of ida_simple_get() for ACPI enumeration
of devices (Rafael Wysocki).
- Add AGDI and CEDT to the list of known ACPI table signatures (Ilkka
Koskinen, Robert Kiraly).
- Add power management debug messages related to suspend-to-idle in
two places (Rafael Wysocki).
- Fix __acpi_node_get_property_reference() return value and clean up
that function (Andy Shevchenko, Sakari Ailus).
- Fix return value of the __setup handler in the ACPI PM timer clock
source driver (Randy Dunlap).
- Clean up double words in two comments (Tom Rix).
- Add "skip i2c clients" quirks for Lenovo Yoga Tablet 1050F/L and
Nextbook Ares 8 (Hans de Goede).
- Clean up frequency invariance handling on x86 in the ACPI CPPC
library (Huang Rui).
- Work around broken XSDT on the Advantech DAC-BJ01 board (Mark
Cilissen).
- Make wakeup events checks in the ACPI EC driver more
straightforward and clean up acpi_ec_submit_event() (Rafael
Wysocki).
- Make it possible to obtain the CPU capacity with the help of CPPC
information (Ionela Voinescu).
- Improve fine grained fan control in the ACPI fan driver and
document it (Srinivas Pandruvada).
- Add device HID and quirk for Microsoft Surface Go 3 to the ACPI
battery driver (Maximilian Luz).
- Make the ACPI driver for Intel SoCs (LPSS) let the SPI driver know
the exact type of the controller (Andy Shevchenko).
- Force native backlight mode on Clevo NL5xRU and NL5xNU (Werner
Sembach).
- Fix return value of __setup handlers in the APEI code (Randy
Dunlap).
- Add Arm Generic Diagnostic Dump and Reset device driver (Ilkka
Koskinen).
- Limit printable size of BERT table data (Darren Hart).
- Fix up HEST and GHES initialization (Shuai Xue).
- Update the ACPI device enumeration documentation and unify the ASL
style in GPIO-related examples (Andy Shevchenko).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmI4pF0SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxrPMP/A8kkgzJegS4CtUCtUpLcCufaggdpQTd
I9GQJeo73wGdmaelCQuXFJ9NUhuA1KHIU0WYqneWX+wifht+wl+KAZYvswPm0/wt
TiypiyRMf8Il0Q9tTTmWKSokK80O7ks8OZEe1HmiJimdEn+F1XUzLLgbQKFqhbbV
NHkVix3xR/7htgSb0ksaijH3XLyStuwPvc4WFueO14Pp5Bkr2Of33Xdd0UYeTCi4
RUqL3qJ4DT5gvgKipg43y6D2igRq/xMKx1bgnBjtwKChtjK23GGR6UB/jAIitIMv
XpxLw7kceY65zjJmmJ1+OKeM6CNAcIbTeyCyffSAH/MYRObj93XpMjnhxXILzjYB
Pz2U/lJy0kgw0PUkFzTdPkuuJlDn5GLY8F2cytvtlQAIhtFVFFcnHZYfhhLRWpoN
Sta2NHpGRejR/jixkQ4JtsjQ/Og02zQ9N344enaC64h3JYPBSyM8mpLH/YoXnuSx
jDPQK1KE/QVXRixKFjrPSXYq2p7w/CH7yZXX7TOo+ScnLhapiSUpyh7wiFslZ729
v11yzjsgBQk27qf1EGSImsh+YoRck9qOTb9tkVXGxcifTUPYzyXGn4T5i/ZwpN9v
nL6imYuiRJjFNAksbWo72hjYfhNwWAoCIXgUuxroCPLGGT394j5djisHYMjDNAsG
x43D1Fd4vEgT
=uB8P
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"From the new functionality perspective, the most significant items
here are the new driver for the 'ARM Generic Diagnostic Dump and
Reset' device, the extension of fine grain fan control in the ACPI fan
driver, and the change making it possible to use CPPC information to
obtain CPU capacity.
There are also a few new quirks, a bunch of fixes, including the
platform-level _OSC handling change to make it actually take the
platform firmware response into account, some code and documentation
cleanups, and a notable update of the ACPI device enumeration
documentation.
Specifics:
- Use uintptr_t and offsetof() in the ACPICA code to avoid compiler
warnings regarding NULL pointer arithmetic (Rafael Wysocki).
- Fix possible NULL pointer dereference in acpi_ns_walk_namespace()
when passed "acpi=off" in the command line (Rafael Wysocki).
- Fix and clean up acpi_os_read/write_port() (Rafael Wysocki).
- Introduce acpi_bus_for_each_dev() and use it for walking all ACPI
device objects in the Type C code (Rafael Wysocki).
- Fix the _OSC platform capabilities negotioation and prevent CPPC
from being used if the platform firmware indicates that it not
supported via _OSC (Rafael Wysocki).
- Use ida_alloc() instead of ida_simple_get() for ACPI enumeration of
devices (Rafael Wysocki).
- Add AGDI and CEDT to the list of known ACPI table signatures (Ilkka
Koskinen, Robert Kiraly).
- Add power management debug messages related to suspend-to-idle in
two places (Rafael Wysocki).
- Fix __acpi_node_get_property_reference() return value and clean up
that function (Andy Shevchenko, Sakari Ailus).
- Fix return value of the __setup handler in the ACPI PM timer clock
source driver (Randy Dunlap).
- Clean up double words in two comments (Tom Rix).
- Add "skip i2c clients" quirks for Lenovo Yoga Tablet 1050F/L and
Nextbook Ares 8 (Hans de Goede).
- Clean up frequency invariance handling on x86 in the ACPI CPPC
library (Huang Rui).
- Work around broken XSDT on the Advantech DAC-BJ01 board (Mark
Cilissen).
- Make wakeup events checks in the ACPI EC driver more
straightforward and clean up acpi_ec_submit_event() (Rafael
Wysocki).
- Make it possible to obtain the CPU capacity with the help of CPPC
information (Ionela Voinescu).
- Improve fine grained fan control in the ACPI fan driver and
document it (Srinivas Pandruvada).
- Add device HID and quirk for Microsoft Surface Go 3 to the ACPI
battery driver (Maximilian Luz).
- Make the ACPI driver for Intel SoCs (LPSS) let the SPI driver know
the exact type of the controller (Andy Shevchenko).
- Force native backlight mode on Clevo NL5xRU and NL5xNU (Werner
Sembach).
- Fix return value of __setup handlers in the APEI code (Randy
Dunlap).
- Add Arm Generic Diagnostic Dump and Reset device driver (Ilkka
Koskinen).
- Limit printable size of BERT table data (Darren Hart).
- Fix up HEST and GHES initialization (Shuai Xue).
- Update the ACPI device enumeration documentation and unify the ASL
style in GPIO-related examples (Andy Shevchenko)"
* tag 'acpi-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
clocksource: acpi_pm: fix return value of __setup handler
ACPI: bus: Avoid using CPPC if not supported by firmware
Revert "ACPI: Pass the same capabilities to the _OSC regardless of the query flag"
ACPI: video: Force backlight native for Clevo NL5xRU and NL5xNU
arm64, topology: enable use of init_cpu_capacity_cppc()
arch_topology: obtain cpu capacity using information from CPPC
x86, ACPI: rename init_freq_invariance_cppc() to arch_init_invariance_cppc()
ACPI: AGDI: Add driver for Arm Generic Diagnostic Dump and Reset device
ACPI: tables: Add AGDI to the list of known table signatures
ACPI/APEI: Limit printable size of BERT table data
ACPI: docs: gpio-properties: Unify ASL style for GPIO examples
ACPI / x86: Work around broken XSDT on Advantech DAC-BJ01 board
ACPI: APEI: fix return value of __setup handlers
x86/ACPI: CPPC: Move init_freq_invariance_cppc() into x86 CPPC
x86: Expose init_freq_invariance() to topology header
x86/ACPI: CPPC: Move AMD maximum frequency ratio setting function into x86 CPPC
x86/ACPI: CPPC: Rename cppc_msr.c to cppc.c
ACPI / x86: Add skip i2c clients quirk for Lenovo Yoga Tablet 1050F/L
ACPI / x86: Add skip i2c clients quirk for Nextbook Ares 8
ACPICA: Avoid walking the ACPI Namespace if it is not there
...
Add the PPIN number to sysfs so that sockets can be identified when
replacement is needed
- Minor fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmI4UBcACgkQEsHwGGHe
VUoDQw/9E2WVsLS+iVngYI5hY+LQbbeOLPt9sqgdf8U/4tQPwJfAKYRALQ3FvjbJ
XKEqcHoCIBH0XQSC0TPpBF96VABr5Vrc/knV8x2OWJ82p54beB0rXh5mdsnsrVMQ
7Gi00iEgZ1kw7TwRN7rMzUpudNOr7C/SSQL535hyZA4NT9QkNBObZWHMnojVnEmr
AW1TY54xXSpu7xDY5ari5NGeSgvu1PIsCy0EKMK/SFLEpKQW4lt+lUe1aSieMarj
xgnfIyGW3SUbadwlvLbIqVpR0RQBDLTabx8nyXnJAVZlwuAfioRUGL+Z4GFA0Y7q
uDofxuScBAea3sPPFAAIoh13y595TjowBX7pHA1sqjWLmFKt6Qqz5dq1uBVEvIYw
uTAQ/igJ4N2jq03jwnAw1LUAES5azSseCsiQxR7oqzK9KaRlptxHTAqhjqsgpIp4
VLdYtgkzOEFiOsWsHWP1Dd+vzpMvTh5gtTXZuVcldo2D6scdcj+oaloHQ5XMiFu1
GKuyiY4EbkRcp9ZQ847xOn4knEg+aq9zL0tJoWWEMKfRQn6425TEOLqkIdc9QfeU
t63yqJ1q3NTjjzxzy/FdKwdoyOOQxeDl5YGPX3gZnj9X/0wgs+dHRmKp0o74SIg9
4h2kB69wRwn6rC09P2UkQVGpDL0mnif4ZAh61vRE+mS0zSNCkEA=
=MWVZ
-----END PGP SIGNATURE-----
Merge tag 'x86_cpu_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu feature updates from Borislav Petkov:
- Merge the AMD and Intel PPIN code into a shared one by both vendors.
Add the PPIN number to sysfs so that sockets can be identified when
replacement is needed
- Minor fixes and cleanups
* tag 'x86_cpu_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Clear SME feature flag when not in use
x86/cpufeatures: Put the AMX macros in the word 18 block
topology/sysfs: Add PPIN in sysfs under cpu topology
topology/sysfs: Add format parameter to macro defining "show" functions for proc
x86/cpu: Read/save PPIN MSR during initialization
x86/cpu: X86_FEATURE_INTEL_PPIN finally has a CPUID bit
x86/cpu: Merge Intel and AMD ppin_init() functions
x86/CPU/AMD: Use default_groups in kobj_type
Merge changes related to system sleep, PM domains changes and power
management documentation changes for 5.18-rc1:
- Fix load_image_and_restore() error path (Ye Bin).
- Fix typos in comments in the system wakeup hadling code (Tom Rix).
- Clean up non-kernel-doc comments in hibernation code (Jiapeng
Chong).
- Fix __setup handler error handling in system-wide suspend and
hibernation core code (Randy Dunlap).
- Add device name to suspend_report_result() (Youngjin Jang).
- Make virtual guests honour ACPI S4 hardware signature by
default (David Woodhouse).
- Block power off of a parent PM domain unless child is in deepest
state (Ulf Hansson).
- Use dev_err_probe() to simplify error handling for generic PM
domains (Ahmad Fatoum).
- Fix sleep-in-atomic bug caused by genpd_debug_remove() (Shawn Guo).
- Document Intel uncore frequency scaling (Srinivas Pandruvada).
* pm-sleep:
PM: hibernate: Honour ACPI hardware signature by default for virtual guests
PM: sleep: Add device name to suspend_report_result()
PM: suspend: fix return value of __setup handler
PM: hibernate: fix __setup handler error handling
PM: hibernate: Clean up non-kernel-doc comments
PM: sleep: wakeup: Fix typos in comments
PM: hibernate: fix load_image_and_restore() error path
* pm-domains:
PM: domains: Fix sleep-in-atomic bug caused by genpd_debug_remove()
PM: domains: use dev_err_probe() to simplify error handling
PM: domains: Prevent power off for parent unless child is in deepest state
* pm-docs:
Documentation: admin-guide: pm: Document uncore frequency scaling
There's an inconsistency that arises when a register set can be accessed
internally via MMIO, or externally via SPI. The VSC7514 chip allows both
modes of operation. When internally accessed, the system utilizes __iomem,
devm_ioremap_resource, and devm_regmap_init_mmio.
For SPI it isn't possible to utilize memory-mapped IO. To properly operate,
the resource base must be added to the register before every operation.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20220313224524.399947-3-colin.foster@in-advantage.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Add an additional reg_downshift to be applied to register addresses before
any register accesses. An example of a device that uses this is a VSC7514
chip, which require each register address to be downshifted by two if the
access is performed over a SPI bus.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20220313224524.399947-2-colin.foster@in-advantage.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Merge ACPI EC driver changes, CPPC-related changes, ACPI fan driver
changes and ACPI battery driver changes for 5.18-rc1:
- Make wakeup events checks in the ACPI EC driver more
straightforward and clean up acpi_ec_submit_event() (Rafael
Wysocki).
- Make it possible to obtain the CPU capacity with the help of CPPC
information (Ionela Voinescu).
- Improve fine grained fan control in the ACPI fan driver and
document it (Srinivas Pandruvada).
- Add device HID and quirk for Microsoft Surface Go 3 to the ACPI
battery driver (Maximilian Luz).
* acpi-ec:
ACPI: EC: Rearrange code in acpi_ec_submit_event()
ACPI: EC: Reduce indentation level in acpi_ec_submit_event()
ACPI: EC: Do not return result from advance_transaction()
* acpi-cppc:
arm64, topology: enable use of init_cpu_capacity_cppc()
arch_topology: obtain cpu capacity using information from CPPC
x86, ACPI: rename init_freq_invariance_cppc() to arch_init_invariance_cppc()
* acpi-fan:
Documentation/admin-guide/acpi: Add documentation for fine grain control
ACPI: fan: Add additional attributes for fine grain control
ACPI: fan: Properly handle fine grain control
ACPI: fan: Optimize struct acpi_fan_fif
ACPI: fan: Separate file for attributes creation
ACPI: fan: Fix error reporting to user space
* acpi-battery:
ACPI: battery: Add device HID and quirk for Microsoft Surface Go 3
The global variable driver_deferred_probe_enable has a default value of
false and does not need to be initialized to false.
Signed-off-by: lizhe <sensor1010@163.com>
Link: https://lore.kernel.org/r/20220309135418.31101-1-sensor1010@163.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function soc_device_match() is difficult to read for various
reasons:
- There are two loop conditions using different styles: "while (...)"
(which is BTW always true) vs. "if ... break",
- The are two return condition using different logic: "if ... return
foo" vs. "if ... else return bar".
Make the code easier to read by:
1. Removing the always-true "!ret" loop condition, and dropping the
now unneeded pre-initialization of "ret",
2. Converting "if ... break" to a proper "while (...)" loop condition,
3. Inverting the logic of the second return condition.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/9f9107c06f7d065ae6581e5290ef5d72f7298fd1.1646132835.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When "driver_async_probe=nulltty" is used on the kernel boot command line,
it causes an Unknown parameter message and the string is added to init's
environment strings, polluting them.
Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc6
driver_async_probe=nulltty", will be passed to user space.
Run /sbin/init as init process
with arguments:
/sbin/init
with environment:
HOME=/
TERM=linux
BOOT_IMAGE=/boot/bzImage-517rc6
driver_async_probe=nulltty
Change the return value of the __setup function to 1 to indicate
that the __setup option has been handled.
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Fixes: 1ea61b68d0 ("async: Add cmdline option to specify drivers to be async probed")
Cc: Feng Tang <feng.tang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Reviewed-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220301041829.15137-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are 3 copies of the same device sysfs cleanup and drv/bus remove()
hooks used for probe failure, testing re-probing, and device unbinding.
Let's refactor the code to its own function.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220223225257.1681968-3-robh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are 3 copies of the same device cleanup code used for probe failure,
testing re-probing, and device unbinding. Changes to this code often miss
at least one of the copies of the code. See commits d0243bbd5d ("drivers
core: Free dma_range_map when driver probe failed") and d8f7a5484f
("driver core: Free DMA range map when device is released") for example.
Let's refactor the code to its own function.
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220223225257.1681968-2-robh@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Define topology_init_cpu_capacity_cppc() to use highest performance
values from _CPC objects to obtain and set maximum capacity information
for each CPU. acpi_cppc_processor_probe() is a good point at which to
trigger the initialization of CPU (u-arch) capacity values, as at this
point the highest performance values can be obtained from each CPU's
_CPC objects. Architectures can therefore use this functionality
through arch_init_invariance_cppc().
The performance scale used by CPPC is a unified scale for all CPUs in
the system. Therefore, by obtaining the raw highest performance values
from the _CPC objects, and normalizing them on the [0, 1024] capacity
scale, used by the task scheduler, we obtain the CPU capacity of each
CPU.
While an ACPI Notify(0x85) could alert about a change in the highest
performance value, which should in turn retrigger the CPU capacity
computations, this notification is not currently handled by the ACPI
processor driver. When supported, a call to arch_init_invariance_cppc()
would perform the update.
Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Valentin Schneider <valentin.schneider@arm.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently, suspend_report_result() prints only function information.
If any driver uses a common PM function, nobody knows who exactly
called the failing function.
A device pinter is needed to recognize the failing device.
For example:
PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0
become after the change:
serial 00:05: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 0
pci 0000:00:01.3: PM: dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns 0
Signed-off-by: Youngjin Jang <yj84.jang@samsung.com>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The PM Runtime docs say:
Drivers in ->remove() callback should undo the runtime PM changes done
in ->probe(). Usually this means calling pm_runtime_disable(),
pm_runtime_dont_use_autosuspend() etc.
From grepping code, it's clear that many people aren't aware of the
need to call pm_runtime_dont_use_autosuspend().
When brainstorming solutions, one idea that came up was to leverage
the new-ish devm_pm_runtime_enable() function. The idea here is that:
* When the devm action is called we know that the driver is being
removed. It's the perfect time to undo the use_autosuspend.
* The code of pm_runtime_dont_use_autosuspend() already handles the
case of being called when autosuspend wasn't enabled.
Suggested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the second 'the'.
Replace the second 'of' with 'the'.
Replace 'couter' with 'counter'.
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
dev_err_probe() can reduce code size, makes the code easier to read
and has the added benefit of recording the defer reason for later
read out. Use it where appropriate.
This also fixes an issue, where an error message in __genpd_dev_pm_attach
was not terminated by a line break.
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A PM domain managed by genpd may support multiple idlestates (power-off
states). During genpd_power_off() a genpd governor may be asked to select
one of the idlestates based upon the dev PM QoS constraints, for example.
However, there is a problem with the behaviour around this in genpd. More
precisely, a parent-domain is allowed to be powered off, no matter of what
idlestate that has been selected for the child-domain.
For the stm32mp1 platform from STMicro, this behaviour doesn't play well.
Instead, the parent-domain must not be powered off, unless the deepest
idlestate has been selected for the child-domain. As the current behaviour
in genpd is quite questionable anyway, let's simply change it into what is
needed by the stm32mp1 platform.
If it surprisingly turns out that other platforms may need a different
behaviour from genpd, then we will have to revisit this to find a way to
make it configurable.
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
A fix for interrupt controllers which require the explicit
acknowledgement of interrupts using a different register to the one
where interrupts are reported. Urgent for the few devices this affects.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmIY/qsACgkQJNaLcl1U
h9D9cQf5Afp92jYFdyITgixSds4tSpjSvWQMj7nwtTXHvg1Ug6GQnggpDLkDCuuu
FB+lvlfDViqSUws1o+e1DWSkyB8PO6GYWXchDPhKlldHebnUxLOkM4G0SQq8f29j
k/lH4PSz7KOndnyJJa/ooClnRKnmZc5E9jXqx3hdqAVLnW6YpKBAoOLgX15xyyU4
JxpZovkMJVDrwGTciSTibWeqCAjk75UzBXvf9nYixJzEUjXnm2k+5lLvHK/tKvVP
hbeddhn08UUfJgUL1vEuegJDP9x/B3my3JjGvD0kRmDLe1fbnyxe6Vt3tFA2vLqb
dblnoi41CYn5KOsTTwkbLp7ZplIPbw==
=lGrx
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"A fix for interrupt controllers which require the explicit
acknowledgement of interrupts using a different register to the one
where interrupts are reported.
Urgent for the few devices this affects"
* tag 'regmap-fix-v5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Update interrupt clear register for proper reset
Here is a single driver core fix for 5.17-rc6. It resolves a reported
problem when the DMA map of a device is not properly released.
It has been in linux-next with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYhjVag8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykYrgCeNS9v4DlGMILrB5gaXBwGxS+MvLYAnRckDk80
chxYlYuVXvT6ykg7ksRF
=uNk0
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core fix for 5.17-rc6. It resolves a reported
problem when the DMA map of a device is not properly released.
It has been in linux-next with no reported problems"
* tag 'driver-core-5.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: Free DMA range map when device is released
The component requires the compare/release functions, there are so many
copies in current kernel. Just define four common helpers for them.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Yong Wu <yong.wu@mediatek.com>
Link: https://lore.kernel.org/r/20220214060819.7334-2-yong.wu@mediatek.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Document in the firmware loader Kconfig help text that firmware image
file compression is not supported for builtin EXTRA_FIRMWARE files so
that someone does not waste time trying that.
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220214222311.9758-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmISrYgeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGg20IAKDZr7rfSHBopjQV
Cocw744tom0XuxpvSZpp2GGOOXF+tkswcNNaRIrbGOl1mkyxA7eBZCTMpDeDS9aQ
wB0D0Gxx8QBAJp4KgB1W7TB+hIGes/rs8Ve+6iO4ulLLdCVWX/q2boI0aZ7QX9O9
qNi8OsoZQtk6falRvciZFHwV5Av1p2Sy1AW57udQ7DvJ4H98AfKf1u8/z208WWW8
1ixC+qJxQcUcM9vI+7P9Tt7NbFSKv8SvAmqjFY7P+DxQAsVw6KXoqVXykDzeOv0t
fUNOE/t0oFZafwtn8h7KBQnwS9lH03+3KkslVZs+iMFyUj/Bar+NVVyKoDhWXtVg
/PuMhEg=
=eU1o
-----END PGP SIGNATURE-----
Merge tag 'v5.17-rc5' into sched/core, to resolve conflicts
New conflicts in sched/core due to the following upstream fixes:
44585f7bc0 ("psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n")
a06247c680 ("psi: Fix uaf issue when psi trigger is destroyed while being polled")
Conflicts:
include/linux/psi_types.h
kernel/sched/psi.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With the existing logic where clear_ack is true (HW doesn’t support
auto clear for ICR), interrupt clear register reset is not handled
properly. Due to this only the first interrupts get processed properly
and further interrupts are blocked due to not resetting interrupt
clear register.
Example for issue case where Invert_ack is false and clear_ack is true:
Say Default ISR=0x00 & ICR=0x00 and ISR is triggered with 2
interrupts making ISR = 0x11.
Step 1: Say ISR is set 0x11 (store status_buff = ISR). ISR needs to
be cleared with the help of ICR once the Interrupt is processed.
Step 2: Write ICR = 0x11 (status_buff), this will clear the ISR to 0x00.
Step 3: Issue - In the existing code, ICR is written with ICR =
~(status_buff) i.e ICR = 0xEE -> This will block all the interrupts
from raising except for interrupts 0 and 4. So expectation here is to
reset ICR, which will unblock all the interrupts.
if (chip->clear_ack) {
if (chip->ack_invert && !ret)
........
else if (!ret)
ret = regmap_write(map, reg,
~data->status_buf[i]);
So writing 0 and 0xff (when ack_invert is true) should have no effect, other
than clearing the ACKs just set.
Fixes: 3a6f0fb7b8 ("regmap: irq: Add support to clear ack registers")
Signed-off-by: Prasad Kumpatla <quic_pkumpatl@quicinc.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20220217085007.30218-1-quic_pkumpatl@quicinc.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Refer to housekeeping APIs using single feature types instead of flags.
This prevents from passing multiple isolation features at once to
housekeeping interfaces, which soon won't be possible anymore as each
isolation features will have their own cpumask.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Reviewed-by: Phil Auld <pauld@redhat.com>
Link: https://lore.kernel.org/r/20220207155910.527133-5-frederic@kernel.org
New fwnode_get_irq_byname() landed after an unrelated function
by ordering. Move fwnode_iomap(), so fwnode_get_irq*() APIs will
go together.
No functional change intended.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull i2c material for 5.18 that is depended on by subsequent device
properties changes.
* 'i2c/alert-for-acpi' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: smbus: Use device_*() functions instead of of_*()
docs: firmware-guide: ACPI: Add named interrupt doc
device property: Add fwnode_irq_get_byname
The commit 2043727c28 ("driver core: platform: Make use of the helper
function dev_err_probe()") missed to also convert platform_get_irq_byname()
for some strange reason -- do that now.
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/11a4aeb2-721c-56a9-919b-f356a30720e0@omp.ru
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After commit e3728b50cd ("ACPI: PM: s2idle: Avoid possible race
related to the EC GPE") wakeup interrupts occurring immediately after
the one discarded by acpi_s2idle_wake() may be missed. Moreover, if
the SCI triggers again immediately after the rearming in
acpi_s2idle_wake(), that wakeup may be missed too.
The problem is that pm_system_irq_wakeup() only calls pm_system_wakeup()
when pm_wakeup_irq is 0, but that's not the case any more after the
interrupt causing acpi_s2idle_wake() to run until pm_wakeup_irq is
cleared by the pm_wakeup_clear() call in s2idle_loop(). However,
there may be wakeup interrupts occurring in that time frame and if
that happens, they will be missed.
To address that issue first move the clearing of pm_wakeup_irq to
the point at which it is known that the interrupt causing
acpi_s2idle_wake() to tun will be discarded, before rearming the SCI
for wakeup. Moreover, because that only reduces the size of the
time window in which the issue may manifest itself, allow
pm_system_irq_wakeup() to register two second wakeup interrupts in
a row and, when discarding the first one, replace it with the second
one. [Of course, this assumes that only one wakeup interrupt can be
discarded in one go, but currently that is the case and I am not
aware of any plans to change that.]
Fixes: e3728b50cd ("ACPI: PM: s2idle: Avoid possible race related to the EC GPE")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The internal_fs_type is mounted via vfs_kernel_mount() and is never
registered as a filesystem, thus specifying the parameters is redundant
as those params will not be validated by fs_validate_description().
Both {shmem,ramfs}_fs_parameters are anyway validated when those
respective filesystems are first registered, so there is no reason to
pass them to devtmpfs too, drop them.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Anthony Iliopoulos <ailiop@suse.com>
Link: https://lore.kernel.org/r/20220119220248.32225-1-ailiop@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no good reason to keep genhd.h separate from the main blkdev.h
header that includes it. So fold the contents of genhd.h into blkdev.h
and remove genhd.h entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220124093913.742411-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
PPIN is the Protected Processor Identification Number.
This is used to identify the socket as a Field Replaceable Unit (FRU).
Existing code only displays this when reporting errors. But this makes
it inconvenient for large clusters to use it for its intended purpose
of inventory control.
Add ppin to /sys/devices/system/cpu/cpu*/topology to make what
is already available using RDMSR more easily accessible. Make
the file read only for root in case there are still people
concerned about making a unique system "serial number" available.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220131230111.2004669-6-tony.luck@intel.com
All the simple (non-mask and non-list files in
/sys/devices/system/cpu/cpu0/topology/ are currently printed as decimal
integers.
Refactor the macro that generates the "show" functions to take a format
parameter to allow future files to display in other formats.
No functional change.
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220131230111.2004669-5-tony.luck@intel.com
Remove most references to 'master' in the code and replace them with
some form of 'aggregate device'. This better reflects the reality of
what this code does, i.e. an aggregate device that represents a
device like a GPU card once some set of devices that make up the
aggregate device probe and register with the component framework.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Saravana Kannan <saravanak@google.com>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20220127200141.1295328-2-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add fwnode_irq_get_byname() to get an interrupt by name from either
ACPI table or Device Tree, whichever is used for enumeration.
In the ACPI case, this allow us to use 'interrupt-names' in
_DSD which can be mapped to Interrupt() resource by index.
The implementation is similar to 'interrupt-names' in the
Device Tree.
Signed-off-by: Akhil R <akhilrajeev@nvidia.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
This is not a resource manager, it is a "Resource managed version of
regmap_add_irq_chip()". Fix the comment.
Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Link: https://lore.kernel.org/r/20220113144259.355845-1-luca@lucaceresoli.net
Signed-off-by: Mark Brown <broonie@kernel.org>
Patch series "sysctl: 3rd set of kernel/sysctl cleanups", v2.
This is the third set of patches to help address cleaning the kitchen
seink in kernel/sysctl.c and to move sysctls away to where they are
actually implemented / used.
This patch (of 8):
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.
To help with this maintenance let's start by moving sysctls to places
where they actually belong. The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we
just care about the core logic.
So move the firmware configuration sysctl table to the only place where
it is used, and make it clear that if sysctls are disabled this is not
used.
[akpm@linux-foundation.org: export register_firmware_config_sysctl and unregister_firmware_config_sysctl to modules]
[akpm@linux-foundation.org: use EXPORT_SYMBOL_NS_GPL instead]
[sfr@canb.auug.org.au: fix that so it compiles]
Link: https://lkml.kernel.org/r/20211201160626.401d828d@canb.auug.org.au
[mcgrof@kernel.org: major commit log update to justify the move]
Link: https://lkml.kernel.org/r/20211124231435.1445213-1-mcgrof@kernel.org
Link: https://lkml.kernel.org/r/20211124231435.1445213-2-mcgrof@kernel.org
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Stephen Kitt <steve@sk2.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: James E.J. Bottomley <jejb@linux.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Lukas Middendorf <kernel@tuxforce.de>
Cc: Antti Palosaari <crope@iki.fi>
Cc: Amir Goldstein <amir73il@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Julia Lawall <julia.lawall@inria.fr>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Phillip Potter <phil@philpotter.co.uk>
Cc: Qing Wang <wangqing@vivo.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
New driver:
- Sunplus SP7021 RTC
- Nintendo GameCube, Wii and Wii U RTC
Drivers:
- cmos: refactor UIP handling and presence check, fix century
- rs5c372: offset correction support, report low voltage
- rv8803: Epson RX8804 support
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEBqsFVZXh8s/0O5JiY6TcMGxwOjIFAmHp+jkACgkQY6TcMGxw
OjJS+hAArVbXHT0ceVHY5mzqkBXCyVxhXdyLUw/VmedO9NvCTk1+O9VhBwKTRdQX
FLuvPkyjZlkAk3CEfaHBp4u5AW3Xumdkuh9XwG8b3qRs3UGYBgKiNfXQFtOWHs7w
VO+lD7+f2GNNTYbYH2dLPs6qZwVIN0rL7JTwUH7Po2mqBVztppSOYhmo3vmOqqux
FI5jFVDoE8u+h359bIKcaax3/dne9m1uyD1WWxOJFRtY+apbECpUBfo1tmauAEXP
gg+v7On+gboDqVe9/lwqB+xfKzWFJKwYIu6CM8+Mf/dxhtRNRXCgszoiCSFZHUwG
GghD7Kb96xWuGX1pRe6xx5/7wixes1wCkIVGa5JQLb5E/GQ3O9y8D+Tdk6lyAP5m
XUVPWGC/qByj6z9pBHtbHmq9lhd1vPHp3SU0ske1xlCQd3WnPq7bQ3MEw345gf4Z
RGrtpnUPIfKzHWID+2zCzTj6TluW9FnnOjcm2U8jF/kwB+d1/H2O5xeWR7eQYbUJ
BY5gjDRVIaM3aQC9QiiFMUorqv8Q3oE9FtjCSuLhUx6WEg4S0mtYsXtRUaLT7pRw
RRsoeCJd8Abnf1Nl/imZ2IrRCcbNhzLAq2Aw8cHbiroAk8lSFAH2NzvcS8213JfG
muOrIB2Q3XBu+6hq452ZDE1155FGWFLwKnAjvSRy5yRbsN79/YA=
=nwD6
-----END PGP SIGNATURE-----
Merge tag 'rtc-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC updates from Alexandre Belloni:
"Two new drivers this cycle and a significant rework of the CMOS driver
make the bulk of the changes.
I also carry powerpc changes with the agreement of Michael.
New drivers:
- Sunplus SP7021 RTC
- Nintendo GameCube, Wii and Wii U RTC
Driver updates:
- cmos: refactor UIP handling and presence check, fix century
- rs5c372: offset correction support, report low voltage
- rv8803: Epson RX8804 support"
* tag 'rtc-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (33 commits)
rtc: sunplus: fix return value in sp_rtc_probe()
rtc: cmos: Evaluate century appropriate
rtc: gamecube: Fix an IS_ERR() vs NULL check
rtc: mc146818-lib: fix signedness bug in mc146818_get_time()
dt-bindings: rtc: qcom-pm8xxx-rtc: update register numbers
rtc: pxa: fix null pointer dereference
rtc: ftrtc010: Use platform_get_irq() to get the interrupt
rtc: Move variable into switch case statement
rtc: pcf2127: Fix typo in comment
dt-bindings: rtc: Add Sunplus RTC json-schema
rtc: Add driver for RTC in Sunplus SP7021
rtc: rs5c372: fix incorrect oscillation value on r2221tl
rtc: rs5c372: add offset correction support
rtc: cmos: avoid UIP when writing alarm time
rtc: cmos: avoid UIP when reading alarm time
rtc: mc146818-lib: refactor mc146818_does_rtc_work
rtc: mc146818-lib: refactor mc146818_get_time
rtc: mc146818-lib: extract mc146818_avoid_UIP
rtc: mc146818-lib: fix RTC presence check
rtc: Check return value from mc146818_get_time()
...
Merge more updates from Andrew Morton:
"55 patches.
Subsystems affected by this patch series: percpu, procfs, sysctl,
misc, core-kernel, get_maintainer, lib, checkpatch, binfmt, nilfs2,
hfs, fat, adfs, panic, delayacct, kconfig, kcov, and ubsan"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (55 commits)
lib: remove redundant assignment to variable ret
ubsan: remove CONFIG_UBSAN_OBJECT_SIZE
kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR
lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB
btrfs: use generic Kconfig option for 256kB page size limit
arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB
configs: introduce debug.config for CI-like setup
delayacct: track delays from memory compact
Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact
delayacct: cleanup flags in struct task_delay_info and functions use it
delayacct: fix incomplete disable operation when switch enable to disable
delayacct: support swapin delay accounting for swapping without blkio
panic: remove oops_id
panic: use error_report_end tracepoint on warnings
fs/adfs: remove unneeded variable make code cleaner
FAT: use io_schedule_timeout() instead of congestion_wait()
hfsplus: use struct_group_attr() for memcpy() region
nilfs2: remove redundant pointer sbufs
fs/binfmt_elf: use PT_LOAD p_align values for static PIE
const_structs.checkpatch: add frequently used ops structs
...
With NEED_PER_CPU_PAGE_FIRST_CHUNK enabled, we need a function to
populate pte, this patch adds a generic pcpu populate pte function,
pcpu_populate_pte(), which is marked __weak and used on most
architectures, but it is overridden on x86, which has its own
implementation.
Link: https://lkml.kernel.org/r/20211216112359.103822-5-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With the previous patch, we could add a generic pcpu first chunk
allocate and free function to cleanup the duplicated definations on each
architecture.
Link: https://lkml.kernel.org/r/20211216112359.103822-4-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add pcpu_fc_cpu_to_node_fn_t and pass it into pcpu_fc_alloc_fn_t, pcpu
first chunk allocation will call it to alloc memblock on the
corresponding node by it, this is prepare for the next patch.
Link: https://lkml.kernel.org/r/20211216112359.103822-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Add new kconfig target 'make mod2noconfig', which will be useful to
speed up the build and test iteration.
- Raise the minimum supported version of LLVM to 11.0.0
- Refactor certs/Makefile
- Change the format of include/config/auto.conf to stop double-quoting
string type CONFIG options.
- Fix ARCH=sh builds in dash
- Separate compression macros for general purposes (cmd_bzip2 etc.) and
the ones for decompressors (cmd_bzip2_with_size etc.)
- Misc Makefile cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmHnFNIVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGiQEP/1tkt9IHP7vFvkN9xChQI8HQ7HOC
mPIxBAUzHIp1V2IALb0lfojjnpkzcMNpJZVlmqjgyYShLEPPBFwKVXs1War6GViX
aprUMz7w1zR/vZJ2fplFmrkNwSxNp3+LSE6sHVmsliS4Vfzh7CjHb8DnaKjBvQLZ
M+eQugjHsWI3d3E81/qtRG5EaVs6q8osF3b0Km59mrESWVYKqwlUP3aUyQCCUGFK
mI+zC4SrHH6EAIZd//VpaleXxVtDcjjadb7Iru5MFhFdCBIRoSC3d1IWPUNUKNnK
i0ocDXuIoAulA/mROgrpyAzLXg10qYMwwTmX+tplkHA055gKcY/v4aHym6ypH+TX
6zd34UMTLM32LSjs8hssiQT8BiZU0uZoa/m2E9IBaiExA2sTsRZxgQMKXFFaPQJl
jn4cRiG0K1NDeRKtq4xh2WO46OS4sPlR6zW9EXDEsS/bI05Y7LpUz7Flt6iA2Mq3
0g8uYIYr/9drl96X83tFgTkxxB6lpB29tbsmsrKJRGxvrCDnAhXlXhPCkMajkm2Q
PjJfNtMFzwemSZWq09+F+X5BgCjzZtroOdFI9FTMNhGWyaUJZXCtcXQ6UTIKnTHO
cDjcURvh+l56eNEQ5SMTNtAkxB+pX8gPUmyO1wLwRUT4YodxylkTUXGyBBR9tgTn
Yks1TnPD06ld364l
=8BQf
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add new kconfig target 'make mod2noconfig', which will be useful to
speed up the build and test iteration.
- Raise the minimum supported version of LLVM to 11.0.0
- Refactor certs/Makefile
- Change the format of include/config/auto.conf to stop double-quoting
string type CONFIG options.
- Fix ARCH=sh builds in dash
- Separate compression macros for general purposes (cmd_bzip2 etc.) and
the ones for decompressors (cmd_bzip2_with_size etc.)
- Misc Makefile cleanups
* tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
kbuild: add cmd_file_size
arch: decompressor: remove useless vmlinux.bin.all-y
kbuild: rename cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}
kbuild: drop $(size_append) from cmd_zstd
sh: rename suffix-y to suffix_y
doc: kbuild: fix default in `imply` table
microblaze: use built-in function to get CPU_{MAJOR,MINOR,REV}
certs: move scripts/extract-cert to certs/
kbuild: do not quote string values in include/config/auto.conf
kbuild: do not include include/config/auto.conf from shell scripts
certs: simplify $(srctree)/ handling and remove config_filename macro
kbuild: stop using config_filename in scripts/Makefile.modsign
certs: remove misleading comments about GCC PR
certs: refactor file cleaning
certs: remove unneeded -I$(srctree) option for system_certificates.o
certs: unify duplicated cmd_extract_certs and improve the log
certs: use $< and $@ to simplify the key generation rule
kbuild: remove headers_check stub
kbuild: move headers_check.pl to usr/include/
certs: use if_changed to re-generate the key when the key type is changed
...
Prior to Linux v5.4 devtmpfs used mount_single() which treats the given
mount options as "remount" options, so it updates the configuration of
the single super_block on each mount.
Since that was changed, the mount options used for devtmpfs are ignored.
This is a regression which affect systemd - which mounts devtmpfs with
"-o mode=755,size=4m,nr_inodes=1m".
This patch restores the "remount" effect by calling reconfigure_single()
Fixes: d401727ea0 ("devtmpfs: don't mix {ramfs,shmem}_fill_super() with mount_single()")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Here is the large set of char, misc, and other "small" driver subsystem
changes for 5.17-rc1.
Lots of different things are in here for char/misc drivers such as:
- habanalabs driver updates
- mei driver updates
- lkdtm driver updates
- vmw_vmci driver updates
- android binder driver updates
- other small char/misc driver updates
Also smaller driver subsystems have also been updated, including:
- fpga subsystem updates
- iio subsystem updates
- soundwire subsystem updates
- extcon subsystem updates
- gnss subsystem updates
- phy subsystem updates
- coresight subsystem updates
- firmware subsystem updates
- comedi subsystem updates
- mhi subsystem updates
- speakup subsystem updates
- rapidio subsystem updates
- spmi subsystem updates
- virtual driver updates
- counter subsystem updates
Too many individual changes to summarize, the shortlog contains the full
details.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYeGNAQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymoVgCg1CPjMu8/SDj3Sm3a1UMQJn9jnl8AnjQcEp3z
hMr9mISG4r6g4PvjrJBj
=9May
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc and other driver updates from Greg KH:
"Here is the large set of char, misc, and other "small" driver
subsystem changes for 5.17-rc1.
Lots of different things are in here for char/misc drivers such as:
- habanalabs driver updates
- mei driver updates
- lkdtm driver updates
- vmw_vmci driver updates
- android binder driver updates
- other small char/misc driver updates
Also smaller driver subsystems have also been updated, including:
- fpga subsystem updates
- iio subsystem updates
- soundwire subsystem updates
- extcon subsystem updates
- gnss subsystem updates
- phy subsystem updates
- coresight subsystem updates
- firmware subsystem updates
- comedi subsystem updates
- mhi subsystem updates
- speakup subsystem updates
- rapidio subsystem updates
- spmi subsystem updates
- virtual driver updates
- counter subsystem updates
Too many individual changes to summarize, the shortlog contains the
full details.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (406 commits)
counter: 104-quad-8: Fix use-after-free by quad8_irq_handler
dt-bindings: mux: Document mux-states property
dt-bindings: ti-serdes-mux: Add defines for J721S2 SoC
counter: remove old and now unused registration API
counter: ti-eqep: Convert to new counter registration
counter: stm32-lptimer-cnt: Convert to new counter registration
counter: stm32-timer-cnt: Convert to new counter registration
counter: microchip-tcb-capture: Convert to new counter registration
counter: ftm-quaddec: Convert to new counter registration
counter: intel-qep: Convert to new counter registration
counter: interrupt-cnt: Convert to new counter registration
counter: 104-quad-8: Convert to new counter registration
counter: Update documentation for new counter registration functions
counter: Provide alternative counter registration functions
counter: stm32-timer-cnt: Convert to counter_priv() wrapper
counter: stm32-lptimer-cnt: Convert to counter_priv() wrapper
counter: ti-eqep: Convert to counter_priv() wrapper
counter: ftm-quaddec: Convert to counter_priv() wrapper
counter: intel-qep: Convert to counter_priv() wrapper
counter: microchip-tcb-capture: Convert to counter_priv() wrapper
...
Treewide cleanup and consolidation of MSI interrupt handling in
preparation for further changes in this area which are necessary to:
- address existing shortcomings in the VFIO area
- support the upcoming Interrupt Message Store functionality which
decouples the message store from the PCI config/MMIO space
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmHf+SETHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYobzGD/wNEFl5qQo5mNZ9thP6JSJFOItm7zMc
2QgzCYOqNwAv4jL6Dqo+EHtbShYqDyWzKdKccgqNjmdIqgW8q7/fubN1OPzRsClV
CZG997AsXDGXYlQcE3tXZjkeCWnWEE2AGLnygSkFV1K/r9ALAtFfTBJAWB+UD+Zc
1P8Kxo0q0Jg+DQAMAA5bWfSSjo/Pmpr/1AFjY7+GA8BBeJJgWOyW7H1S+GYEWVOE
RaQP81Sbd6x1JkopxkNqSJ/lbNJfnPJxi2higB56Y0OYn5CuSarYbZUM7oQ2V61t
jN7pcEEvTpjLd6SJ93ry8WOcJVMTbccCklVfD0AfEwwGUGw2VM6fSyNrZfnrosUN
tGBEO8eflBJzGTAwSkz1EhiGKna4o1NBDWpr0sH2iUiZC5G6V2hUDbM+0PQJhDa8
bICwguZElcUUPOprwjS0HXhymnxghTmNHyoEP1yxGoKLTrwIqkH/9KGustWkcBmM
hNtOCwQNqxcOHg/r3MN0KxttTASgoXgNnmFliAWA7XwseRpLWc95XPQFa5sptRhc
EzwumEz17EW1iI5/NyZQcY+jcZ9BdgCqgZ9ECjZkyN4U+9G6iACUkxVaHUUs77jl
a0ISSEHEvJisFOsOMYyFfeWkpIKGIKP/bpLOJEJ6kAdrUWFvlRGF3qlav3JldXQl
ypFjPapDeB5guw==
=vKzd
-----END PGP SIGNATURE-----
Merge tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI irq updates from Thomas Gleixner:
"Rework of the MSI interrupt infrastructure.
This is a treewide cleanup and consolidation of MSI interrupt handling
in preparation for further changes in this area which are necessary
to:
- address existing shortcomings in the VFIO area
- support the upcoming Interrupt Message Store functionality which
decouples the message store from the PCI config/MMIO space"
* tag 'irq-msi-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (94 commits)
genirq/msi: Populate sysfs entry only once
PCI/MSI: Unbreak pci_irq_get_affinity()
genirq/msi: Convert storage to xarray
genirq/msi: Simplify sysfs handling
genirq/msi: Add abuse prevention comment to msi header
genirq/msi: Mop up old interfaces
genirq/msi: Convert to new functions
genirq/msi: Make interrupt allocation less convoluted
platform-msi: Simplify platform device MSI code
platform-msi: Let core code handle MSI descriptors
bus: fsl-mc-msi: Simplify MSI descriptor handling
soc: ti: ti_sci_inta_msi: Remove ti_sci_inta_msi_domain_free_irqs()
soc: ti: ti_sci_inta_msi: Rework MSI descriptor allocation
NTB/msi: Convert to msi_on_each_desc()
PCI: hv: Rework MSI handling
powerpc/mpic_u3msi: Use msi_for_each-desc()
powerpc/fsl_msi: Use msi_for_each_desc()
powerpc/pasemi/msi: Convert to msi_on_each_dec()
powerpc/cell/axon_msi: Convert to msi_on_each_desc()
powerpc/4xx/hsta: Rework MSI handling
...
Here is the set of changes for the driver core for 5.17-rc1.
Lots of little things here, including:
- kobj_type cleanups
- auxiliary_bus documentation updates
- auxiliary_device conversions for some drivers (relevant
subsystems all have provided acks for these)
- kernfs lock contention reduction for some workloads
- other tiny cleanups and changes.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYd7deA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym8ngCgw0ANwrRPE5b1dthEmfU2f8Knk5kAn0pHQv6R
VRZJypgNfU/Pt0ykstZD
=CO9J
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the set of changes for the driver core for 5.17-rc1.
Lots of little things here, including:
- kobj_type cleanups
- auxiliary_bus documentation updates
- auxiliary_device conversions for some drivers (relevant subsystems
all have provided acks for these)
- kernfs lock contention reduction for some workloads
- other tiny cleanups and changes.
All of these have been in linux-next for a while with no reported
issues"
* tag 'driver-core-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (43 commits)
kobject documentation: remove default_attrs information
drivers/firmware: Add missing platform_device_put() in sysfb_create_simplefb
debugfs: lockdown: Allow reading debugfs files that are not world readable
driver core: Make bus notifiers in right order in really_probe()
driver core: Move driver_sysfs_remove() after driver_sysfs_add()
firmware: edd: remove empty default_attrs array
firmware: dmi-sysfs: use default_groups in kobj_type
qemu_fw_cfg: use default_groups in kobj_type
firmware: memmap: use default_groups in kobj_type
sh: sq: use default_groups in kobj_type
headers/uninline: Uninline single-use function: kobject_has_children()
devtmpfs: mount with noexec and nosuid
driver core: Simplify async probe test code by using ktime_ms_delta()
nilfs2: use default_groups in kobj_type
kobject: remove kset from struct kset_uevent_ops callbacks
driver core: make kobj_type constant.
driver core: platform: document registration-failure requirement
vdpa/mlx5: Use auxiliary_device driver data helpers
net/mlx5e: Use auxiliary_device driver data helpers
soundwire: intel: Use auxiliary_device driver data helpers
...
A very quiet release for regmap:
- Allow a custom _update_bits() operation for devices with no bus.
- Fix an issue with creation of the debugfs directory when attaching a
device to an existing no device regmap.
- A trivial formatting fix.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmHcLcAACgkQJNaLcl1U
h9AyZgf/XnwgUltXGyDUgo9OsHB6Jspmua7IFVfTXT9sJckJPCeqxJlWBiH+fv+k
GJZwCPVmeBjRzGJV7jwOSXZgEV2JOotfLCBiGM6ZXepwisy14n4QVLN53FwJbbBW
9OhuBGnXcS3KWJEtwz+0+ZaY6RXVWTXHL3L/T2GF2k+CfOHSvysTMOU03RVvUF8D
3XmdghS8J8EY0RIGXZpnMD8nEePSfhq+TcLbsf9xguixo2rvd8PKFX1rcYjewn9K
z4Srv/oERKaEfMiWk2VnYm3tjaRdFgugItpfIswJ4Eo1xe/ppEBcah3uSBja3DNB
bnp1HpGclNYRRG3Stq6WokRu8AEFOg==
=hTba
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A very quiet release for regmap:
- Allow a custom _update_bits() operation for devices with no bus.
- Fix an issue with creation of the debugfs directory when attaching
a device to an existing no device regmap.
- A trivial formatting fix"
[ The custom _update_bits comit came in earlier through the networking
tree that had merged it for its own needs ]
* tag 'regmap-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: debugfs: Fix indentation
regmap: Call regmap_debugfs_exit() prior to _init()
- Remove device_add_properties() which does not work correctly if
software nodes holding additional device properties are shared
or reused (Heikki Krogerus).
- Fix nargs_prop property handling for software nodes (Clément Léger).
- Update documentation of ACPI device properties (Sakari Ailus).
- Update the handling of graph properties in the generic framework
to match the DT case (Sakari Ailus).
- Update software nodes entry in MAINTAINERS (Andy Shevchenko).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmHcg1ASHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx66YP/Ar0GTqI6iE+6WrF2B00ftHFq11PVTo9
DNiYSJiV60FiaoIANd+59QfC1i2erDrUmRFvZ+Kip4rQG9UomY4Tx7QYw3jzCj8t
yA5JJhN5kYphs0+GigdNVLkN56zwK6Cd759+GVCXQW9KkPryxhjGJeNNRthlXf8s
Bs+4Gp3NOanl5z4LdulkjyoFMDr6kIjotcN2j8+7RkgWc4VIS3OYlFZf5dLi/i1O
cqSkJgikRA40ZuNtsRRVGPOSGoPAPaAFZJY4j/gYx6sAhB9UQi/Xe4iQaXvsEWM2
NlC+X2D48SiTkb1M1QpM0nJ5N/txRp+/FrMiAagBWCNer4lFA42ibyCd6dhADnIE
lMLnbKaUqB3exCBP/BQdYDi+ypKZf88E0zX6OoZfvHj0uQV5KwOgUbbhckpLkI/j
WZQwm/qtLGqpxW6N+IfBRwBBwPkXePep3CG37twfyVp4IXk+hm+ipMQ1dZmxwNKZ
q9o9Iwv35KEbX6nR8psE7GCm6znYeoFPDx8GEjEDh9nfIpt2bFSYy3rw82wghkgq
EeBD/irNS3PbFXyt8cV/cDjctgbG9SumOA5B6Iicq9y5PSnUNjKi49DGZAsqq8g/
I2c5IFBzYwnKk4z1wCRshba9jTF2sH0NTGVvPnfWv/vCtrvURXwBxTa9z8jtuMtw
E44VNKDhYWu2
=/oIe
-----END PGP SIGNATURE-----
Merge tag 'devprop-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework updates from Rafael Wysocki:
"These update the handling of software nodes and graph properties, and
the MAINTAINERS entry for the former.
Specifics:
- Remove device_add_properties() which does not work correctly if
software nodes holding additional device properties are shared or
reused (Heikki Krogerus).
- Fix nargs_prop property handling for software nodes (Clément
Léger).
- Update documentation of ACPI device properties (Sakari Ailus).
- Update the handling of graph properties in the generic framework to
match the DT case (Sakari Ailus).
- Update software nodes entry in MAINTAINERS (Andy Shevchenko)"
* tag 'devprop-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
software node: Update MAINTAINERS data base
software node: fix wrong node passed to find nargs_prop
device property: Drop fwnode_graph_get_remote_node()
device property: Use fwnode_graph_for_each_endpoint() macro
device property: Implement fwnode_graph_get_endpoint_count()
Documentation: ACPI: Update references
Documentation: ACPI: Fix data node reference documentation
device property: Fix documentation for FWNODE_GRAPH_DEVICE_DISABLED
device property: Fix fwnode_graph_devcon_match() fwnode leak
device property: Remove device_add_properties() API
driver core: Don't call device_remove_properties() from device_del()
PCI: Convert to device_create_managed_software_node()
- Add new P-state driver for AMD processors (Huang Rui).
- Fix initialization of min and max frequency QoS requests in the
cpufreq core (Rafael Wysocki).
- Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada).
- Make intel_pstate update cpuinfo.max_freq when notified of HWP
capabilities changes and drop a redundant function call from that
driver (Rafael Wysocki).
- Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel,
Stephen Boyd, Vladimir Zapolskiy).
- Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan).
- Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz
Luba).
- Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman).
- Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman).
- Fix two comments in cpuidle code (Jason Wang, Yang Li).
- Allow model-specific normal EPB value to be used in the intel_epb
sysfs attribute handling code (Srinivas Pandruvada).
- Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki).
- Add safety net to supplier device release in the runtime PM core
code (Rafael Wysocki).
- Capture device status before disabling runtime PM for it (Rafael
Wysocki).
- Add new macros for declaring PM operations to allow drivers to
avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and
update some drivers to use these macros (Paul Cercueil).
- Allow ACPI hardware signature to be honoured during restore from
hibernation (David Woodhouse).
- Update outdated operating performance points (OPP) documentation
(Tang Yizhou).
- Reduce log severity for informative message regarding frequency
transition failures in devfreq (Tzung-Bi Shih).
- Add DRAM frequency controller devfreq driver for Allwinner sunXi
SoCs (Samuel Holland).
- Add missing COMMON_CLK dependency to sun8i devfreq driver (Arnd
Bergmann).
- Add support for new layout of Psys PowerLimit Register on SPR to
the Intel RAPL power capping driver (Zhang Rui).
- Fix typo in a comment in idle_inject.c (Jason Wang).
- Remove unused function definition from the DTPM (Dynamit Thermal
Power Management) power capping framework (Daniel Lezcano).
- Reduce DTPM trace verbosity (Daniel Lezcano).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmHcgkgSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxs34P/3kFhRk7qrwEekx6F11im6caLKT9+Qap
PuGVqfTbK7TupVQDVGFBEjTjgKY7Ph7Fcr4bqn6wvNOp96cjXyOSk/c1fcpS3Bpr
b1PYsFsb9diNKE462sGGYClyCT3X5qQqtpxzOl3g4I1PWKTC1mKFm4Jm2m6S6cFq
DKhsgYKFzQSZNb1wJM4JjHS9c3BRygqp4nfEAmifu5b9tLZf7stWnFHhbGq63M9m
OwHOrEEnzhf4pOXGZTvIXeczgE6IcuDdlGkIg7XMHnmKSNvj1HqhEgi2lfSRb98z
5eI4S6JymCJGVK+gr8iVCq1iJ+LKqV3YPXRqvI35/+NqIKYxMt2ZivQQf5s3aQLe
26gUulD3O6Pz5tMlwcDElD4/tcClfg35PCD/VzpRR8TAo8vLBb63kZ5v6+HM34ZJ
6QbLTNZJTnGmEqxMccUxP+HhZz8ssqpLAC+R2sE5yXbNpIZq8CbPiGb65RGiX3SG
CmRKqH/xQVNKBYP0ChjmUyhKcBxOnx1Xu8AhsN7gRAy0aht7j7OdjTnJuGiX6gu3
Q5WxvVvkekyfhuFQ5TST9y/fzvMJWzeaA6GhVIr6RoBmshNQGTb0H4HXARxS3Ah5
qjd7ao7BFLa898FCHaHIpmFWp0wF5iljwCJQVP3I2qUpPvDJxEtsxc4CF/AZzyNR
VudoFqLoIV5C
=1egI
-----END PGP SIGNATURE-----
Merge tag 'pm-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"The most signigicant change here is the addition of a new cpufreq
'P-state' driver for AMD processors as a better replacement for the
venerable acpi-cpufreq driver.
There are also other cpufreq updates (in the core, intel_pstate, ARM
drivers), PM core updates (mostly related to adding new macros for
declaring PM operations which should make the lives of driver
developers somewhat easier), and a bunch of assorted fixes and
cleanups.
Summary:
- Add new P-state driver for AMD processors (Huang Rui).
- Fix initialization of min and max frequency QoS requests in the
cpufreq core (Rafael Wysocki).
- Fix EPP handling on Alder Lake in intel_pstate (Srinivas
Pandruvada).
- Make intel_pstate update cpuinfo.max_freq when notified of HWP
capabilities changes and drop a redundant function call from that
driver (Rafael Wysocki).
- Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel,
Stephen Boyd, Vladimir Zapolskiy).
- Fix double devm_remap() in the Mediatek cpufreq driver (Hector
Yuan).
- Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz
Luba).
- Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman).
- Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman).
- Fix two comments in cpuidle code (Jason Wang, Yang Li).
- Allow model-specific normal EPB value to be used in the intel_epb
sysfs attribute handling code (Srinivas Pandruvada).
- Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki).
- Add safety net to supplier device release in the runtime PM core
code (Rafael Wysocki).
- Capture device status before disabling runtime PM for it (Rafael
Wysocki).
- Add new macros for declaring PM operations to allow drivers to
avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and
update some drivers to use these macros (Paul Cercueil).
- Allow ACPI hardware signature to be honoured during restore from
hibernation (David Woodhouse).
- Update outdated operating performance points (OPP) documentation
(Tang Yizhou).
- Reduce log severity for informative message regarding frequency
transition failures in devfreq (Tzung-Bi Shih).
- Add DRAM frequency controller devfreq driver for Allwinner sunXi
SoCs (Samuel Holland).
- Add missing COMMON_CLK dependency to sun8i devfreq driver (Arnd
Bergmann).
- Add support for new layout of Psys PowerLimit Register on SPR to
the Intel RAPL power capping driver (Zhang Rui).
- Fix typo in a comment in idle_inject.c (Jason Wang).
- Remove unused function definition from the DTPM (Dynamit Thermal
Power Management) power capping framework (Daniel Lezcano).
- Reduce DTPM trace verbosity (Daniel Lezcano)"
* tag 'pm-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error
cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State
cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment
cpuidle: use default_groups in kobj_type
x86: intel_epb: Allow model specific normal EPB value
MAINTAINERS: Add AMD P-State driver maintainer entry
Documentation: amd-pstate: Add AMD P-State driver introduction
cpufreq: amd-pstate: Add AMD P-State performance attributes
cpufreq: amd-pstate: Add AMD P-State frequencies attributes
cpufreq: amd-pstate: Add boost mode support for AMD P-State
cpufreq: amd-pstate: Add trace for AMD P-State module
cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution
cpufreq: amd-pstate: Add fast switch function for AMD P-State
cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors
ACPI: CPPC: Add CPPC enable register function
ACPI: CPPC: Check present CPUs for determining _CPC is valid
ACPI: CPPC: Implement support for SystemIO registers
x86/msr: Add AMD CPPC MSR definitions
x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag
cpufreq: use default_groups in kobj_type
...
Core
----
- Defer freeing TCP skbs to the BH handler, whenever possible,
or at least perform the freeing outside of the socket lock section
to decrease cross-CPU allocator work and improve latency.
- Add netdevice refcount tracking to locate sources of netdevice
and net namespace refcount leaks.
- Make Tx watchdog less intrusive - avoid pausing Tx and restarting
all queues from a single CPU removing latency spikes.
- Various small optimizations throughout the stack from Eric Dumazet.
- Make netdev->dev_addr[] constant, force modifications to go via
appropriate helpers to allow us to keep addresses in ordered data
structures.
- Replace unix_table_lock with per-hash locks, improving performance
of bind() calls.
- Extend skb drop tracepoint with a drop reason.
- Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW.
BPF
---
- New helpers:
- bpf_find_vma(), find and inspect VMAs for profiling use cases
- bpf_loop(), runtime-bounded loop helper trading some execution
time for much faster (if at all converging) verification
- bpf_strncmp(), improve performance, avoid compiler flakiness
- bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt()
for tracing programs, all inlined by the verifier
- Support BPF relocations (CO-RE) in the kernel loader.
- Further the support for BTF_TYPE_TAG annotations.
- Allow access to local storage in sleepable helpers.
- Convert verifier argument types to a composable form with different
attributes which can be shared across types (ro, maybe-null).
- Prepare libbpf for upcoming v1.0 release by cleaning up APIs,
creating new, extensible ones where missing and deprecating those
to be removed.
Protocols
---------
- WiFi (mac80211/cfg80211):
- notify user space about long "come back in N" AP responses,
allow it to react to such temporary rejections
- allow non-standard VHT MCS 10/11 rates
- use coarse time in airtime fairness code to save CPU cycles
- Bluetooth:
- rework of HCI command execution serialization to use a common
queue and work struct, and improve handling errors reported
in the middle of a batch of commands
- rework HCI event handling to use skb_pull_data, avoiding packet
parsing pitfalls
- support AOSP Bluetooth Quality Report
- SMC:
- support net namespaces, following the RDMA model
- improve connection establishment latency by pre-clearing buffers
- introduce TCP ULP for automatic redirection to SMC
- Multi-Path TCP:
- support ioctls: SIOCINQ, OUTQ, and OUTQNSD
- support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT,
IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY
- support cmsgs: TCP_INQ
- improvements in the data scheduler (assigning data to subflows)
- support fastclose option (quick shutdown of the full MPTCP
connection, similar to TCP RST in regular TCP)
- MCTP (Management Component Transport) over serial, as defined by
DMTF spec DSP0253 - "MCTP Serial Transport Binding".
Driver API
----------
- Support timestamping on bond interfaces in active/passive mode.
- Introduce generic phylink link mode validation for drivers which
don't have any quirks and where MAC capability bits fully express
what's supported. Allow PCS layer to participate in the validation.
Convert a number of drivers.
- Add support to set/get size of buffers on the Rx rings and size of
the tx copybreak buffer via ethtool.
- Support offloading TC actions as first-class citizens rather than
only as attributes of filters, improve sharing and device resource
utilization.
- WiFi (mac80211/cfg80211):
- support forwarding offload (ndo_fill_forward_path)
- support for background radar detection hardware
- SA Query Procedures offload on the AP side
New hardware / drivers
----------------------
- tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with
real-time requirements for isochronous communication with protocols
like OPC UA Pub/Sub.
- Qualcomm BAM-DMUX WWAN - driver for data channels of modems
integrated into many older Qualcomm SoCs, e.g. MSM8916 or
MSM8974 (qcom_bam_dmux).
- Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch
driver with support for bridging, VLANs and multicast forwarding
(lan966x).
- iwlmei driver for co-operating between Intel's WiFi driver and
Intel's Active Management Technology (AMT) devices.
- mse102x - Vertexcom MSE102x Homeplug GreenPHY chips
- Bluetooth:
- MediaTek MT7921 SDIO devices
- Foxconn MT7922A
- Realtek RTL8852AE
Drivers
-------
- Significantly improve performance in the datapaths of:
lan78xx, ax88179_178a, lantiq_xrx200, bnxt.
- Intel Ethernet NICs:
- igb: support PTP/time PEROUT and EXTTS SDP functions on
82580/i354/i350 adapters
- ixgbevf: new PF -> VF mailbox API which avoids the risk of
mailbox corruption with ESXi
- iavf: support configuration of VLAN features of finer granularity,
stacked tags and filtering
- ice: PTP support for new E822 devices with sub-ns precision
- ice: support firmware activation without reboot
- Mellanox Ethernet NICs (mlx5):
- expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
- support TC forwarding when tunnel encap and decap happen between
two ports of the same NIC
- dynamically size and allow disabling various features to save
resources for running in embedded / SmartNIC scenarios
- Broadcom Ethernet NICs (bnxt):
- use page frag allocator to improve Rx performance
- expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
- Other Ethernet NICs:
- amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support
- Microsoft cloud/virtual NIC (mana):
- add XDP support (PASS, DROP, TX)
- Mellanox Ethernet switches (mlxsw):
- initial support for Spectrum-4 ASICs
- VxLAN with IPv6 underlay
- Marvell Ethernet switches (prestera):
- support flower flow templates
- add basic IP forwarding support
- NXP embedded Ethernet switches (ocelot & felix):
- support Per-Stream Filtering and Policing (PSFP)
- enable cut-through forwarding between ports by default
- support FDMA to improve packet Rx/Tx to CPU
- Other embedded switches:
- hellcreek: improve trapping management (STP and PTP) packets
- qca8k: support link aggregation and port mirroring
- Qualcomm 802.11ax WiFi (ath11k):
- qca6390, wcn6855: enable 802.11 power save mode in station mode
- BSS color change support
- WCN6855 hw2.1 support
- 11d scan offload support
- scan MAC address randomization support
- full monitor mode, only supported on QCN9074
- qca6390/wcn6855: report signal and tx bitrate
- qca6390: rfkill support
- qca6390/wcn6855: regdb.bin support
- Intel WiFi (iwlwifi):
- support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS)
in cooperation with the BIOS
- support for Optimized Connectivity Experience (OCE) scan
- support firmware API version 68
- lots of preparatory work for the upcoming Bz device family
- MediaTek WiFi (mt76):
- Specific Absorption Rate (SAR) support
- mt7921: 160 MHz channel support
- RealTek WiFi (rtw88):
- Specific Absorption Rate (SAR) support
- scan offload
- Other WiFi NICs
- ath10k: support fetching (pre-)calibration data from nvmem
- brcmfmac: configure keep-alive packet on suspend
- wcn36xx: beacon filter support
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHbkZAACgkQMUZtbf5S
IruYkQ//XX7BggcwBfukPK83j0dONolClijqKcKR08g4vB5L8GXvv6OErKIWrh4k
h8JanCH352ZkbCSw3MvFdm825UYQv8vPMd6Qks/LJ4aSKqCuy4MIlAo+yOw4Km3O
i7++lRfma6DqHHI59wvLjWoxZSPu8lL+rI8UsZ5qMOlnNlGAOXsNrzRjaqQ3FddY
AMxZeBUtrPqUCCQZFq3U8apkYzUp7CA/3XR9zRcja3uPbrtOV2G+4whRF90qGNWz
Tm/QvJ9F/Ab292cbhxR4KuaQ3hUhaCQyDjbZk3+FZzZpAVhYTVqcNjny6+yXmbiP
NXRtwemnl1NlWKMnJM8lEeY48u626tRIkxA/Wtd61uoO5uKUSxfGP+UpUi+DfXbF
yIw50VQ7L2bpxXP/HjtmhVgZDaWKYyh22Zw4Hp/muMJz0hgUB0KODY3tf2jUWbjJ
0oEgocWyzhhwMQKqupTDCIaRgIs2ewYr4ZrFDhI3HnHC/vv1VjoPRUPIyxwppD2N
cXvZb3B1sWK8iX5gCbISGzyU4bB7I0rvJSTU42ueti7n6NqRFZ79qHQpYnnY+JdO
z1qOwY/d/yWfBoXVKRtRg2qz6CdEt5BQklwAgVEBgrFpf58gp694EwGMb1htY14J
r/k9bVpmyIFpUnBH2CPMRfBVA3tUTqzyzzFV4AMw40NYLKmhLdo=
=KLm3
-----END PGP SIGNATURE-----
Merge tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core
----
- Defer freeing TCP skbs to the BH handler, whenever possible, or at
least perform the freeing outside of the socket lock section to
decrease cross-CPU allocator work and improve latency.
- Add netdevice refcount tracking to locate sources of netdevice and
net namespace refcount leaks.
- Make Tx watchdog less intrusive - avoid pausing Tx and restarting
all queues from a single CPU removing latency spikes.
- Various small optimizations throughout the stack from Eric Dumazet.
- Make netdev->dev_addr[] constant, force modifications to go via
appropriate helpers to allow us to keep addresses in ordered data
structures.
- Replace unix_table_lock with per-hash locks, improving performance
of bind() calls.
- Extend skb drop tracepoint with a drop reason.
- Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW.
BPF
---
- New helpers:
- bpf_find_vma(), find and inspect VMAs for profiling use cases
- bpf_loop(), runtime-bounded loop helper trading some execution
time for much faster (if at all converging) verification
- bpf_strncmp(), improve performance, avoid compiler flakiness
- bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt()
for tracing programs, all inlined by the verifier
- Support BPF relocations (CO-RE) in the kernel loader.
- Further the support for BTF_TYPE_TAG annotations.
- Allow access to local storage in sleepable helpers.
- Convert verifier argument types to a composable form with different
attributes which can be shared across types (ro, maybe-null).
- Prepare libbpf for upcoming v1.0 release by cleaning up APIs,
creating new, extensible ones where missing and deprecating those
to be removed.
Protocols
---------
- WiFi (mac80211/cfg80211):
- notify user space about long "come back in N" AP responses,
allow it to react to such temporary rejections
- allow non-standard VHT MCS 10/11 rates
- use coarse time in airtime fairness code to save CPU cycles
- Bluetooth:
- rework of HCI command execution serialization to use a common
queue and work struct, and improve handling errors reported in
the middle of a batch of commands
- rework HCI event handling to use skb_pull_data, avoiding packet
parsing pitfalls
- support AOSP Bluetooth Quality Report
- SMC:
- support net namespaces, following the RDMA model
- improve connection establishment latency by pre-clearing buffers
- introduce TCP ULP for automatic redirection to SMC
- Multi-Path TCP:
- support ioctls: SIOCINQ, OUTQ, and OUTQNSD
- support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT,
IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY
- support cmsgs: TCP_INQ
- improvements in the data scheduler (assigning data to subflows)
- support fastclose option (quick shutdown of the full MPTCP
connection, similar to TCP RST in regular TCP)
- MCTP (Management Component Transport) over serial, as defined by
DMTF spec DSP0253 - "MCTP Serial Transport Binding".
Driver API
----------
- Support timestamping on bond interfaces in active/passive mode.
- Introduce generic phylink link mode validation for drivers which
don't have any quirks and where MAC capability bits fully express
what's supported. Allow PCS layer to participate in the validation.
Convert a number of drivers.
- Add support to set/get size of buffers on the Rx rings and size of
the tx copybreak buffer via ethtool.
- Support offloading TC actions as first-class citizens rather than
only as attributes of filters, improve sharing and device resource
utilization.
- WiFi (mac80211/cfg80211):
- support forwarding offload (ndo_fill_forward_path)
- support for background radar detection hardware
- SA Query Procedures offload on the AP side
New hardware / drivers
----------------------
- tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with
real-time requirements for isochronous communication with protocols
like OPC UA Pub/Sub.
- Qualcomm BAM-DMUX WWAN - driver for data channels of modems
integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974
(qcom_bam_dmux).
- Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver
with support for bridging, VLANs and multicast forwarding
(lan966x).
- iwlmei driver for co-operating between Intel's WiFi driver and
Intel's Active Management Technology (AMT) devices.
- mse102x - Vertexcom MSE102x Homeplug GreenPHY chips
- Bluetooth:
- MediaTek MT7921 SDIO devices
- Foxconn MT7922A
- Realtek RTL8852AE
Drivers
-------
- Significantly improve performance in the datapaths of: lan78xx,
ax88179_178a, lantiq_xrx200, bnxt.
- Intel Ethernet NICs:
- igb: support PTP/time PEROUT and EXTTS SDP functions on
82580/i354/i350 adapters
- ixgbevf: new PF -> VF mailbox API which avoids the risk of
mailbox corruption with ESXi
- iavf: support configuration of VLAN features of finer
granularity, stacked tags and filtering
- ice: PTP support for new E822 devices with sub-ns precision
- ice: support firmware activation without reboot
- Mellanox Ethernet NICs (mlx5):
- expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
- support TC forwarding when tunnel encap and decap happen between
two ports of the same NIC
- dynamically size and allow disabling various features to save
resources for running in embedded / SmartNIC scenarios
- Broadcom Ethernet NICs (bnxt):
- use page frag allocator to improve Rx performance
- expose control over IRQ coalescing mode (CQE vs EQE) via ethtool
- Other Ethernet NICs:
- amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support
- Microsoft cloud/virtual NIC (mana):
- add XDP support (PASS, DROP, TX)
- Mellanox Ethernet switches (mlxsw):
- initial support for Spectrum-4 ASICs
- VxLAN with IPv6 underlay
- Marvell Ethernet switches (prestera):
- support flower flow templates
- add basic IP forwarding support
- NXP embedded Ethernet switches (ocelot & felix):
- support Per-Stream Filtering and Policing (PSFP)
- enable cut-through forwarding between ports by default
- support FDMA to improve packet Rx/Tx to CPU
- Other embedded switches:
- hellcreek: improve trapping management (STP and PTP) packets
- qca8k: support link aggregation and port mirroring
- Qualcomm 802.11ax WiFi (ath11k):
- qca6390, wcn6855: enable 802.11 power save mode in station mode
- BSS color change support
- WCN6855 hw2.1 support
- 11d scan offload support
- scan MAC address randomization support
- full monitor mode, only supported on QCN9074
- qca6390/wcn6855: report signal and tx bitrate
- qca6390: rfkill support
- qca6390/wcn6855: regdb.bin support
- Intel WiFi (iwlwifi):
- support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS)
in cooperation with the BIOS
- support for Optimized Connectivity Experience (OCE) scan
- support firmware API version 68
- lots of preparatory work for the upcoming Bz device family
- MediaTek WiFi (mt76):
- Specific Absorption Rate (SAR) support
- mt7921: 160 MHz channel support
- RealTek WiFi (rtw88):
- Specific Absorption Rate (SAR) support
- scan offload
- Other WiFi NICs
- ath10k: support fetching (pre-)calibration data from nvmem
- brcmfmac: configure keep-alive packet on suspend
- wcn36xx: beacon filter support"
* tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2048 commits)
tcp: tcp_send_challenge_ack delete useless param `skb`
net/qla3xxx: Remove useless DMA-32 fallback configuration
rocker: Remove useless DMA-32 fallback configuration
hinic: Remove useless DMA-32 fallback configuration
lan743x: Remove useless DMA-32 fallback configuration
net: enetc: Remove useless DMA-32 fallback configuration
cxgb4vf: Remove useless DMA-32 fallback configuration
cxgb4: Remove useless DMA-32 fallback configuration
cxgb3: Remove useless DMA-32 fallback configuration
bnx2x: Remove useless DMA-32 fallback configuration
et131x: Remove useless DMA-32 fallback configuration
be2net: Remove useless DMA-32 fallback configuration
vmxnet3: Remove useless DMA-32 fallback configuration
bna: Simplify DMA setting
net: alteon: Simplify DMA setting
myri10ge: Simplify DMA setting
qlcnic: Simplify DMA setting
net: allwinner: Fix print format
page_pool: remove spinlock in page_pool_refill_alloc_cache()
amt: fix wrong return type of amt_send_membership_update()
...
from poison memory and error injection into SGX pages
- A bunch of changes to the SGX selftests to simplify and allow of SGX
features testing without the need of a whole SGX software stack
- Add a sysfs attribute which is supposed to show the amount of SGX
memory in a NUMA node, similar to what /proc/meminfo is to normal
memory
- The usual bunch of fixes and cleanups too
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcDQMACgkQEsHwGGHe
VUq42xAAjWM0AFpIxgUBpbE0swV3ZMulnndl3/vA5XN+9Yn7Q52+AFyPRE0s7Zam
Ap+cInh2Il7d/sv54rZ4x/j7+TH4i7s8fWPVU/XiPALQuOuw0/B1wJJ+jmMiPFiU
3jr7DkUPyWjWTHduMY/tk+xMOpkx1XsxJheYnKvsKVW+fjJ0vPuftAZtfu2z2VOh
3JLcp5cAXPxW0UK9gdoF5bCBQhBu0NRguTbhHhbByAixQO2GyVSKLSRovUdj0a+y
QRrQ6hgcvpTOsVHJoWJ7yIX4SBzQTe9Bg6dT9DghOxE4Sc2GH89hu7wRztGawBJO
nLyzWgiW9ttjQutDpBvZANNVcFAPAdtDWczrzZpREbrGKkzT+kOBnIIL1LWITWOy
2YWTO3ytW0KNIK85GzMjSVOKRMgaHJeBaGuYZ7Z0kb3GuUPJ9zRlaRxNapKQFuzA
0PGoA4IDT+2Afy7VYBBNUA2d/WverFQuXKusSxK6b5zJ173o5/DXL2q0d3gn/j8Z
hhxJUJyVOsfRXSG4NKrj4se4FiA0n/RL4oyUZR9iJ8kWzzZTd0eZTAn468bpGIp5
yiOlPOLgsmu0xzVmAtG1+4d2+S2x+Ec5YE0sP1V/JLNciYk3Ebp7UyfnS3tn33Xc
cpdWjELvD1LJVpMEURnbjRrwU6OiiAekYJCP/9lmK9zfOGpwRHc=
=vFTM
-----END PGP SIGNATURE-----
Merge tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SGX updates from Borislav Petkov:
- Add support for handling hw errors in SGX pages: poisoning,
recovering from poison memory and error injection into SGX pages
- A bunch of changes to the SGX selftests to simplify and allow of SGX
features testing without the need of a whole SGX software stack
- Add a sysfs attribute which is supposed to show the amount of SGX
memory in a NUMA node, similar to what /proc/meminfo is to normal
memory
- The usual bunch of fixes and cleanups too
* tag 'x86_sgx_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/sgx: Fix NULL pointer dereference on non-SGX systems
selftests/sgx: Fix corrupted cpuid macro invocation
x86/sgx: Add an attribute for the amount of SGX memory in a NUMA node
x86/sgx: Fix minor documentation issues
selftests/sgx: Add test for multiple TCS entry
selftests/sgx: Enable multiple thread support
selftests/sgx: Add page permission and exception test
selftests/sgx: Rename test properties in preparation for more enclave tests
selftests/sgx: Provide per-op parameter structs for the test enclave
selftests/sgx: Add a new kselftest: Unclobbered_vdso_oversubscribed
selftests/sgx: Move setup_test_encl() to each TEST_F()
selftests/sgx: Encpsulate the test enclave creation
selftests/sgx: Dump segments and /proc/self/maps only on failure
selftests/sgx: Create a heap for the test enclave
selftests/sgx: Make data measurement for an enclave segment optional
selftests/sgx: Assign source for each segment
selftests/sgx: Fix a benign linker warning
x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
x86/sgx: Add hook to error injection address validation
x86/sgx: Hook arch_memory_failure() into mainline code
...
Merge cpuidle updates, PM core updates and one hiberation-related
update for 5.17-rc1:
- Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman).
- Fix two comments in cpuidle code (Jason Wang, Yang Li).
- Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki).
- Add safety net to supplier device release in the runtime PM core
code (Rafael Wysocki).
- Capture device status before disabling runtime PM for it (Rafael
Wysocki).
- Add new macros for declaring PM operations to allow drivers to
avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and
update some drivers to use these macros (Paul Cercueil).
- Allow ACPI hardware signature to be honoured during restore from
hibernation (David Woodhouse).
* pm-cpuidle:
cpuidle: use default_groups in kobj_type
cpuidle: Fix cpuidle_remove_state_sysfs() kerneldoc comment
cpuidle: menu: Fix typo in a comment
* pm-core:
PM: runtime: Simplify locking in pm_runtime_put_suppliers()
mmc: mxc: Use the new PM macros
mmc: jz4740: Use the new PM macros
PM: runtime: Add safety net to supplier device release
PM: runtime: Capture device status before disabling runtime PM
PM: core: Add new *_PM_OPS macros, deprecate old ones
PM: core: Redefine pm_ptr() macro
r8169: Avoid misuse of pm_ptr() macro
* pm-sleep:
PM: hibernate: Allow ACPI hardware signature to be honoured
Merge cpufreq updates for 5.17-rc1:
- Add new P-state driver for AMD processors (Huang Rui).
- Fix initialization of min and max frequency QoS requests in the
cpufreq core (Rafael Wysocki).
- Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada).
- Make intel_pstate update cpuinfo.max_freq when notified of HWP
capabilities changes and drop a redundant function call from that
driver (Rafael Wysocki).
- Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel,
Stephen Boyd, Vladimir Zapolskiy).
- Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan).
- Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz
Luba).
- Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman).
* pm-cpufreq: (32 commits)
x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error
cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State
cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment
MAINTAINERS: Add AMD P-State driver maintainer entry
Documentation: amd-pstate: Add AMD P-State driver introduction
cpufreq: amd-pstate: Add AMD P-State performance attributes
cpufreq: amd-pstate: Add AMD P-State frequencies attributes
cpufreq: amd-pstate: Add boost mode support for AMD P-State
cpufreq: amd-pstate: Add trace for AMD P-State module
cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution
cpufreq: amd-pstate: Add fast switch function for AMD P-State
cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors
ACPI: CPPC: Add CPPC enable register function
ACPI: CPPC: Check present CPUs for determining _CPC is valid
ACPI: CPPC: Implement support for SystemIO registers
x86/msr: Add AMD CPPC MSR definitions
x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag
cpufreq: use default_groups in kobj_type
cpufreq: mediatek-hw: Fix double devm_remap in hotplug case
cpufreq: intel_pstate: Update cpuinfo.max_freq on HWP_CAP changes
...
The previous commit fixed up all shell scripts to not include
include/config/auto.conf.
Now that include/config/auto.conf is only included by Makefiles,
we can change it into a more Make-friendly form.
Previously, Kconfig output string values enclosed with double-quotes
(both in the .config and include/config/auto.conf):
CONFIG_X="foo bar"
Unlike shell, Make handles double-quotes (and single-quotes as well)
verbatim. We must rip them off when used.
There are some patterns:
[1] $(patsubst "%",%,$(CONFIG_X))
[2] $(CONFIG_X:"%"=%)
[3] $(subst ",,$(CONFIG_X))
[4] $(shell echo $(CONFIG_X))
These are not only ugly, but also fragile.
[1] and [2] do not work if the value contains spaces, like
CONFIG_X=" foo bar "
[3] does not work correctly if the value contains double-quotes like
CONFIG_X="foo\"bar"
[4] seems to work better, but has a cost of forking a process.
Anyway, quoted strings were always PITA for our Makefiles.
This commit changes Kconfig to stop quoting in include/config/auto.conf.
These are the string type symbols referenced in Makefiles or scripts:
ACPI_CUSTOM_DSDT_FILE
ARC_BUILTIN_DTB_NAME
ARC_TUNE_MCPU
BUILTIN_DTB_SOURCE
CC_IMPLICIT_FALLTHROUGH
CC_VERSION_TEXT
CFG80211_EXTRA_REGDB_KEYDIR
EXTRA_FIRMWARE
EXTRA_FIRMWARE_DIR
EXTRA_TARGETS
H8300_BUILTIN_DTB
INITRAMFS_SOURCE
LOCALVERSION
MODULE_SIG_HASH
MODULE_SIG_KEY
NDS32_BUILTIN_DTB
NIOS2_DTB_SOURCE
OPENRISC_BUILTIN_DTB
SOC_CANAAN_K210_DTB_SOURCE
SYSTEM_BLACKLIST_HASH_LIST
SYSTEM_REVOCATION_KEYS
SYSTEM_TRUSTED_KEYS
TARGET_CPU
UNUSED_KSYMS_WHITELIST
XILINX_MICROBLAZE0_FAMILY
XILINX_MICROBLAZE0_HW_VER
XTENSA_VARIANT_NAME
I checked them one by one, and fixed up the code where necessary.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Since commit cffa4b2122 ("regmap: debugfs: Fix a memory leak when
calling regmap_attach_dev"), the following debugfs error is seen
on i.MX boards:
debugfs: Directory 'dummy-iomuxc-gpr@20e0000' with parent 'regmap' already present!
In the attempt to fix the memory leak, the above commit added a NULL check
for map->debugfs_name. For the first debufs entry, map->debugfs_name is NULL
and then the new name is allocated via kasprintf().
For the second debugfs entry, map->debugfs_name() is no longer NULL, so
it will keep using the old entry name and the duplicate name error is seen.
Quoting Mark Brown:
"That means that if the device gets freed we'll end up with the old debugfs
file hanging around pointing at nothing.
...
To be more explicit this means we need a call to regmap_debugfs_exit()
which will clean up all the existing debugfs stuff before we loose
references to it."
Call regmap_debugfs_exit() prior to regmap_debugfs_init() to fix
the problem.
Tested on i.MX6Q and i.MX6SX boards.
Fixes: cffa4b2122 ("regmap: debugfs: Fix a memory leak when calling regmap_attach_dev")
Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Fabio Estevam <festevam@denx.de>
Link: https://lore.kernel.org/r/20220107163307.335404-1-festevam@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
If a driver cannot be bound to a device, the correct bus notifier order
should be:
- BUS_NOTIFY_BIND_DRIVER: driver is about to be bound
- BUS_NOTIFY_DRIVER_NOT_BOUND: driver failed to be bound
or no notifier if the failure happens before the actual binding.
The really_probe() notifies a BUS_NOTIFY_DRIVER_NOT_BOUND event without
a BUS_NOTIFY_BIND_DRIVER if .dma_configure() returns failure. This
change makes the notifiers in order.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211231033901.2168664-3-baolu.lu@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The driver_sysfs_remove() should be called after driver_sysfs_add() in
really_probe(). The out-of-order driver_sysfs_remove() tries to remove
some nonexistent nodes under the device and driver sysfs nodes. This is
allowed, hence this change doesn't fix any problem, just a cleanup.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20211231033901.2168664-2-baolu.lu@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This was the only usage of <linux/kref_api.h> in <linux/kobject_api.h>,
so we'll able to decouple the two after this change.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
commit 077cdda764 ("net/mlx5e: TC, Fix memory leak with rules with internal port")
commit 31108d142f ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'")
commit 4390c6edc0 ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'")
https://lore.kernel.org/all/20211229065352.30178-1-saeed@kernel.org/
net/smc/smc_wr.c
commit 49dc9013e3 ("net/smc: Use the bitmap API when applicable")
commit 349d43127d ("net/smc: fix kernel panic caused by race of smc_sock")
bitmap_zero()/memset() is removed by the fix
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
devtmpfs is writable. Add the noexec and nosuid as default mount flags
to prevent code execution from /dev. The systems who don't use systemd
and who rely on CONFIG_DEVTMPFS_MOUNT=y are the ones to be protected by
this patch. Other systems are fine with the udev solution.
No sane program should be relying on executing from /dev. So this patch
reduces the attack surface. It doesn't prevent any specific attack, but
it reduces the possibility that someone can use /dev as a place to put
executable code. Chrome OS has been carrying this patch for several
years. It seems trivial and simple solution to improve the protection of
/dev when CONFIG_DEVTMPFS_MOUNT=y.
Original patch:
https://lore.kernel.org/lkml/20121120215059.GA1859@www.outflux.net/
Cc: ellyjones@chromium.org
Cc: Kay Sievers <kay@vrfy.org>
Cc: Roland Eggner <edvx1@systemanalysen.net>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/YcMfDOyrg647RCmd@debian-BULLSEYE-live-builder-AMD64
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to pass the pointer to the kset in the struct
kset_uevent_ops callbacks as no one uses it, so just remove that pointer
entirely.
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Wedson Almeida Filho <wedsonaf@google.com>
Link: https://lore.kernel.org/r/20211227163924.3970661-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This way instances of kobj_type (which contain function pointers) can be
stored in .rodata, which means that they cannot be [easily/accidentally]
modified at runtime.
Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
Link: https://lore.kernel.org/r/20211224231345.777370-1-wedsonaf@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Notice that pm_runtime_put_suppliers() cannot be called with
disabled interrupts, because it may sleep (due to the device
links read locking in the non-SRCU case), and so it can use
spin_lock_irq() and spin_unlock_irq() for the locking.
Update the function accordingly and while at it move the "put"
local variable in it into the inner block where it is used.
This change is not expected to have any visible functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
nargs_prop refers to a property located in the reference that is found
within the nargs property. Use the correct reference node in call to
property_entry_read_int_array() to retrieve the correct nargs value.
Fixes: b06184acf7 ("software node: Add software_node_get_reference_args()")
Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Daniel Scally <djrscally@gmail.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add an explicit comment to document that the reference initialised by
platform_device_register() needs to be released by a call to
platform_device_put() also when registration fails (cf.
device_register()).
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20211222104213.5673-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This patch introduces a new helper routine - fwnode_iomap(), which
allows to map the memory mapped IO for a given device node.
This implementation does not cover the ACPI case and may be expanded
in the future. The main purpose here is to be able to develop resource
provider agnostic drivers.
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Anand Ashok Dumbre <anand.ashok.dumbre@xilinx.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20211203212358.31444-2-anand.ashok.dumbre@xilinx.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
fwnode_graph_get_remote_node() is only used by the tegra-video driver.
Convert it to use newer fwnode_graph_get_endpoint_by_id() and drop
now-unused fwnode_graph_get_remote_node().
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Now that we have fwnode_graph_for_each_endpoint() macro, use it instead of
calling fwnode_graph_get_next_endpoint() directly. It manages the iterator
variable for the user without manual intervention.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add fwnode_graph_get_endpoint_count() function to provide generic
implementation of of_graph_get_endpoint_count(). The former by default
only counts endpoints to available devices which is consistent with the
rest of the fwnode graph API. By providing FWNODE_GRAPH_DEVICE_DISABLED
flag, also unconnected endpoints and endpoints to disabled devices are
counted.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
FWNODE_GRAPH_DEVICE_DISABLED flag was meant for also returning endpoints
connected to disabled devices, but it also may return endpoints that are
not connected. Fix this in documentation. Also
fwnode_graph_get_endpoint_by_id() was affeced by this.
Also improve the language a little bit.
Fixes: 0fcc2bdc8a ("device property: Add fwnode_graph_get_endpoint_by_id()")
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
For each endpoint it encounters, fwnode_graph_devcon_match() checks
whether the endpoint's remote port parent device is available. If it is
not, it ignores the endpoint but does not put the reference to the remote
endpoint port parent fwnode. For available devices the fwnode handle
reference is put as expected.
Put the reference for unavailable devices now.
Fixes: 637e9e52b1 ("device connection: Find device connections also from device graphs")
Cc: 5.1+ <stable@vger.kernel.org> # 5.1+
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 2aa36604e8 ("PM: sleep: Avoid calling put_device() under
dpm_list_mtx") forgot to update the while () loop termination
condition to also break the loop if error is nonzero, which
causes the loop to become infinite if device_prepare() returns
an error for one device.
Add the missing !error check.
Fixes: 2aa36604e8 ("PM: sleep: Avoid calling put_device() under dpm_list_mtx")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
Because refcount_dec_not_one() returns true if the target refcount
becomes saturated, it is generally unsafe to use its return value as
a loop termination condition, but that is what happens when a device
link's supplier device is released during runtime PM suspend
operations and on device link removal.
To address this, introduce pm_runtime_release_supplier() to be used
in the above cases which will check the supplier device's runtime
PM usage counter in addition to the refcount_dec_not_one() return
value, so the loop can be terminated in case the rpm_active refcount
value becomes invalid, and update the code in question to use it as
appropriate.
This change is not expected to have any visible functional impact.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
In some cases (for example, during system-wide suspend and resume of
devices) it is useful to know whether or not runtime PM has ever been
enabled for a given device and, if so, what the runtime PM status of
it had been right before runtime PM was disabled for it last time.
For this reason, introduce a new struct dev_pm_info field called
last_status that will be used for capturing the runtime PM status of
the device when its power.disable_depth counter changes from 0 to 1.
The new field will be set to RPM_INVALID to start with and whenever
power.disable_depth changes from 1 to 0, so it will be valid only
when runtime PM of the device is currently disabled, but it has been
enabled at least once.
Immediately use power.last_status in rpm_resume() to make it handle
the case when PM runtime is disabled for the device, but its runtime
PM status is RPM_ACTIVE more consistently. Namely, make it return 1
if power.last_status is also equal to RPM_ACTIVE in that case (the
idea being that if the status was RPM_ACTIVE last time when
power.disable_depth was changing from 0 to 1 and it is still
RPM_ACTIVE, it can be assumed to reflect what happened to the device
last time when it was using runtime PM) and -EACCES otherwise.
Update the documentation to provide a description of last_status and
change the description of pm_runtime_resume() in it to reflect the
new behavior of rpm_active().
While at it, rearrange the code in pm_runtime_enable() to be more
straightforward and replace the WARN() macro in it with a pr_warn()
invocation which is less disruptive.
Link: https://lore.kernel.org/linux-pm/20211026222626.39222-1-ulf.hansson@linaro.org/t/#u
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The allocation code is overly complex. It tries to have the MSI index space
packed, which is not working when an interrupt is freed. There is no
requirement for this. The only requirement is that the MSI index is unique.
Move the MSI descriptor allocation into msi_domain_populate_irqs() and use
the Linux interrupt number as MSI index which fulfils the unique
requirement.
This requires to lock the MSI descriptors which makes the lock order
reverse to the regular MSI alloc/free functions vs. the domain
mutex. Assign a seperate lockdep class for these MSI device domains.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211206210748.956731741@linutronix.de
Use the core functionality for platform MSI interrupt domains. The platform
device MSI interrupt domains will be converted in a later step.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211206210748.903173257@linutronix.de
It's only required when MSI is in use.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211206210747.650487479@linutronix.de
Use the common msi_index member and get rid of the pointless wrapper struct.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221814.413638645@linutronix.de
Storing the platform private data in a MSI descriptor is sloppy at
best. The data belongs to the device and not to the descriptor.
Add a pointer to struct msi_device_data and store the pointer there.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221814.287680528@linutronix.de
It's hard to distinguish what platform_msi_domain_alloc() and
platform_msi_domain_alloc_irqs() are about. Make the distinction more
explicit and add comments which explain the use cases properly.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221814.228706214@linutronix.de
Set the domain info flag and remove the local sysfs code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221814.109408832@linutronix.de
Allocate the MSI device data on first invocation of the allocation function
for platform MSI private data.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20211210221813.805529729@linutronix.de
The only unconditional part of MSI data in struct device is the irqdomain
pointer. Everything else can be allocated on demand. Create a data
structure and move the irqdomain pointer into it. The other MSI specific
parts are going to be removed from struct device in later steps.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Nishanth Menon <nm@ti.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211210221813.617178827@linutronix.de
There are 4 users of mc146818_get_time() and none of them was checking
the return value from this function. Change this.
Print the appropriate warnings in callers of mc146818_get_time() instead
of in the function mc146818_get_time() itself, in order not to add
strings to rtc-mc146818-lib.c, which is kind of a library.
The callers of alpha_rtc_read_time() and cmos_read_time() may use the
contents of (struct rtc_time *) even when the functions return a failure
code. Therefore, set the contents of (struct rtc_time *) to 0x00,
which looks more sensible then 0xff and aligns with the (possibly
stale?) comment in cmos_read_time:
/*
* If pm_trace abused the RTC for storage, set the timespec to 0,
* which tells the caller that this RTC value is unusable.
*/
For consistency, do this in mc146818_get_time().
Note: hpet_rtc_interrupt() may call mc146818_get_time() many times a
second. It is very unlikely, though, that the RTC suddenly stops
working and mc146818_get_time() would consistently fail.
Only compile-tested on alpha.
Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: linux-alpha@vger.kernel.org
Cc: x86@kernel.org
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20211210200131.153887-4-mat.jonczyk@o2.pl
== Problem ==
The amount of SGX memory on a system is determined by the BIOS and it
varies wildly between systems. It can be as small as dozens of MB's
and as large as many GB's on servers. Just like how applications need
to know how much regular RAM is available, enclave builders need to
know how much SGX memory an enclave can consume.
== Solution ==
Introduce a new sysfs file:
/sys/devices/system/node/nodeX/x86/sgx_total_bytes
to enumerate the amount of SGX memory available in each NUMA node.
This serves the same function for SGX as /proc/meminfo or
/sys/devices/system/node/nodeX/meminfo does for normal RAM.
'sgx_total_bytes' is needed today to help drive the SGX selftests.
SGX-specific swap code is exercised by creating overcommitted enclaves
which are larger than the physical SGX memory on the system. They
currently use a CPUID-based approach which can diverge from the actual
amount of SGX memory available. 'sgx_total_bytes' ensures that the
selftests can work efficiently and do not attempt stupid things like
creating a 100,000 MB enclave on a system with 128 MB of SGX memory.
== Implementation Details ==
Introduce CONFIG_HAVE_ARCH_NODE_DEV_GROUP opt-in flag to expose an
arch specific attribute group, and add an attribute for the amount of
SGX memory in bytes to each NUMA node:
== ABI Design Discussion ==
As opposed to the per-node ABI, a single, global ABI was considered.
However, this would prevent enclaves from being able to size
themselves so that they fit on a single NUMA node. Essentially, a
single value would rule out NUMA optimizations for enclaves.
Create a new "x86/" directory inside each "nodeX/" sysfs directory.
'sgx_total_bytes' is expected to be the first of at least a few
sgx-specific files to be placed in the new directory. Just scanning
/proc/meminfo, these are the no-brainers that we have for RAM, but we
need for SGX:
MemTotal: xxxx kB // sgx_total_bytes (implemented here)
MemFree: yyyy kB // sgx_free_bytes
SwapTotal: zzzz kB // sgx_swapped_bytes
So, at *least* three. I think we will eventually end up needing
something more along the lines of a dozen. A new directory (as
opposed to being in the nodeX/ "root") directory avoids cluttering the
root with several "sgx_*" files.
Place the new file in a new "nodeX/x86/" directory because SGX is
highly x86-specific. It is very unlikely that any other architecture
(or even non-Intel x86 vendor) will ever implement SGX. Using "sgx/"
as opposed to "x86/" was also considered. But, there is a real chance
this can get used for other arch-specific purposes.
[ dhansen: rewrite changelog ]
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211116162116.93081-2-jarkko@kernel.org
It's only required for PCI/MSI. So no point in having it in every struct
device.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20211206210224.925241961@linutronix.de
fwnode_property_get_reference_args() searches for named properties
against a fwnode_handle, but these could instead be against the fwnode's
secondary. If the property isn't found against the primary, check the
secondary to see if it's there instead.
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Daniel Scally <djrscally@gmail.com>
Link: https://lore.kernel.org/r/20211128232455.39332-1-djrscally@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code and documentation are more difficult to maintain when kept
separately. This is further compounded when the standard structure
documentation infrastructure is not used.
Move the documentation into the code, use the standard documentation
infrastructure, add current documented functions, and reference the text
in the rst file.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211202044305.4006853-8-ira.weiny@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
auxiliary_find_device() takes a proper get_device() reference on the
device before returning the matched device.
Users of this call should be informed that they need to properly release
this reference with put_device().
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211202044305.4006853-7-ira.weiny@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
__auxiliary_driver_register is not intended to be called directly unless
a custom name is required. Add documentation for this fact.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211202044305.4006853-5-ira.weiny@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The documentation for creating an auxiliary device is a 3 step not a 2
step process. Specifically the requirements of setting the name, id,
dev.release, and dev.parent fields was not clear as a precursor to the '2
step' process documented.
Clarify by declaring this a 3 step process starting with setting the
fields of struct auxiliary_device correctly.
Also add some sample code and tie the change into the rest of the
documentation.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20211202044305.4006853-2-ira.weiny@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Provide default defines for the topology_book_[id|cpumask] and
topology_drawer_[id|cpumask] macros just like for each other topology
level.
This way all topology levels are handled in a similar way. Still the
the book and drawer levels are only used on s390, and also the sysfs
attributes are only created on s390. However other architectures may
opt in if wanted.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20211129130309.3256168-4-hca@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The cluster_id and cluster_cpus topology sysfs attributes have been
added with commit c5e22feffd ("topology: Represent clusters of CPUs
within a die").
They are currently only used for x86, arm64, and riscv (via generic
arch topology), however they are still present with bogus default
values for all other architectures. Instead of enforcing such new
sysfs attributes to all architectures, make them only optional visible
if an architecture opts in by defining both the topology_cluster_id
and topology_cluster_cpumask attributes.
This is similar to what was done when the book and drawer topology
levels were introduced: avoid useless and therefore confusing sysfs
attributes for architectures which cannot make use of them.
This should not break any existing applications, since this is a
new interface introduced with the v5.16 merge window.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20211129130309.3256168-3-hca@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The die_id and die_cpus topology sysfs attributes have been added with
commit 0e344d8c70 ("cpu/topology: Export die_id") and commit
2e4c54dac7 ("topology: Create core_cpus and die_cpus sysfs attributes").
While they are currently only used and useful for x86 they are still
present with bogus default values for all architectures. Instead of
enforcing such new sysfs attributes to all architectures, make them
only optional visible if an architecture opts in by defining both the
topology_die_id and topology_die_cpumask attributes.
This is similar to what was done when the book and drawer topology
levels were introduced: avoid useless and therefore confusing sysfs
attributes for architectures which cannot make use of them.
This should not break any existing applications, since this is a
rather new interface and applications should be able to handle also
older kernel versions without such attributes - besides that they
contain only useful information for x86.
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20211129130309.3256168-2-hca@linux.ibm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When possible using dev_err_probe() helps to properly deal with the
PROBE_DEFER error, the benefit is that DEFER issue will be logged
in the devices_deferred debugfs file.
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20211105071509.969-1-caihuoqing@baidu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are no more users for it.
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All the drivers that relied on device_del() to call
device_remove_properties() have now been converted to either
use device_create_managed_software_node() instead of
device_add_properties(), or to register the software node
completely separately from the device.
This will make it finally possible to share and reuse the
software nodes that hold the additional device properties.
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There is no need of this function (and related) since code has been
converted to use the new arch_update_thermal_pressure() API. The old
code can be removed.
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
The thermal pressure is a mechanism which is used for providing
information about reduced CPU performance to the scheduler. Usually code
has to convert the value from frequency units into capacity units,
which are understandable by the scheduler. Create a common conversion code
which can be just used via a handy API.
Internally, the topology_update_thermal_pressure() operates on frequency
in MHz and max CPU frequency is taken from 'freq_factor' (per-cpu).
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Some device requires a special handling for reg_update_bits and can't use
the normal regmap read write logic. An example is when locking is
handled by the device and rmw operations requires to do atomic operations.
Allow to declare a dedicated function in regmap_config for
reg_update_bits in no bus configuration.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20211104150040.1260-1-ansuelsmth@gmail.com
Signed-off-by: Mark Brown <broonie@kernel.org>
- Fix 2 intel_pstate driver regressions related to the HWP interrupt
handling added recently (Srinivas Pandruvada).
- Fix intel_pstate driver regression introduced during the 5.11
cycle and causing HWP desired performance to be mishandled in
some cases when switching driver modes and during system
suspend and shutdown (Rafael Wysocki).
- Fix system-wide device suspend and resume locking to avoid
deadlocks when device objects are deleted during a system-wide
PM transition (Rafael Wysocki).
- Modify system-wide suspend of devices to prevent cpuidle drivers
based on runtime PM from misbehaving during the "no IRQ" phase of
it (Ulf Hansson).
- Fix return value of _opp_add_static_v2() helper (YueHaibing).
- Fix required-opp handle count (Pavankumar Kondeti).
- Add resource managed OPP helpers, update dev_pm_opp_attach_genpd(),
update their devfreq users, and make minor DT binding change (Dmitry
Osipenko).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGL0/4SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxaeMP/09xsXf0LP2LhP9YY77kvdrMONjyh1nj
XuelrhuWl1Cz2X+gh3d7zXrjbuPW0lZF8Syc8GY1Ct8o3omQ8jnyiH1LIBfkDVx7
WqVk8rGMMBXNpg3Os+MMADm3s0woW8c7IF+5cYwxL03OaOGH8uHiZxS9ZTLocH6H
sOh9EAy72b/vLCq4aAzrG8ux10k9t8+2A6EuxzKrkO8rz/egtLjb8KznCER4n4nx
lJ5Us+Kpo9TwuQ4pwujcvk7M9U2boHeKEk8ItiQUQwpW1VJsEUg1konz/UziE3Nf
2vHZIellKs3RB/cVzFZsIJJt0etDss6YtivAMdpgBS+ciiZYgyjqKY+whZ/eBrue
7nn79kSIhi0TVsVnM02HUYpz8oWHo0P0xqZr0t8IzOU+xvJF1e3fb/AkiQbGhYPs
OXUTDqdOHFPplymRy6yrq3NqXyfqYVEKMHXWsQ+uY6FhOvVQTlkBdWscF9inoSDf
LCmoc+xYLIl9R79syq9BqmVDE9cKLTZowmkexKI7MNu32ZQrPrKNdjMbYvvU0Sxf
248RsZokiobnmXZdEupIEKXmI4ANyZFbeU0CL22MY3B14YcRqN7xPWXz2a/6fXWL
h+DS8GCO6j3fgdrywUwA9/LyxR4SOJYEzrC3m34atufMm9LFYII6ShTh96DDPQV6
i5hcVlC5wXAb
=lduU
-----END PGP SIGNATURE-----
Merge tag 'pm-5.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These fix three intel_pstate driver regressions, fix locking in the
core code suspending and resuming devices during system PM
transitions, fix the handling of cpuidle drivers based on runtime PM
during system-wide suspend, fix two issues in the operating
performance points (OPP) framework and resource-managed helpers to it.
Specifics:
- Fix two intel_pstate driver regressions related to the HWP
interrupt handling added recently (Srinivas Pandruvada).
- Fix intel_pstate driver regression introduced during the 5.11 cycle
and causing HWP desired performance to be mishandled in some cases
when switching driver modes and during system suspend and shutdown
(Rafael Wysocki).
- Fix system-wide device suspend and resume locking to avoid
deadlocks when device objects are deleted during a system-wide PM
transition (Rafael Wysocki).
- Modify system-wide suspend of devices to prevent cpuidle drivers
based on runtime PM from misbehaving during the "no IRQ" phase of
it (Ulf Hansson).
- Fix return value of _opp_add_static_v2() helper (YueHaibing).
- Fix required-opp handle count (Pavankumar Kondeti).
- Add resource managed OPP helpers, update dev_pm_opp_attach_genpd(),
update their devfreq users, and make minor DT binding change
(Dmitry Osipenko)"
* tag 'pm-5.16-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: Avoid calling put_device() under dpm_list_mtx
cpufreq: intel_pstate: Clear HWP Status during HWP Interrupt enable
cpufreq: intel_pstate: Fix unchecked MSR 0x773 access
cpufreq: intel_pstate: Clear HWP desired on suspend/shutdown and offline
PM: sleep: Fix runtime PM based cpuidle support
dt-bindings: opp: Allow multi-worded OPP entry name
opp: Fix return in _opp_add_static_v2()
PM / devfreq: tegra30: Check whether clk_round_rate() returns zero rate
PM / devfreq: tegra30: Use resource-managed helpers
PM / devfreq: Add devm_devfreq_add_governor()
opp: Add more resource-managed variants of dev_pm_opp_of_add_table()
opp: Change type of dev_pm_opp_attach_genpd(names) argument
opp: Fix required-opps phandle array count check
Merge misc updates from Andrew Morton:
"257 patches.
Subsystems affected by this patch series: scripts, ocfs2, vfs, and
mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache,
gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc,
pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools,
memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm,
vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram,
cleanups, kfence, and damon)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (257 commits)
mm/damon: remove return value from before_terminate callback
mm/damon: fix a few spelling mistakes in comments and a pr_debug message
mm/damon: simplify stop mechanism
Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions
Docs/admin-guide/mm/damon/start: simplify the content
Docs/admin-guide/mm/damon/start: fix a wrong link
Docs/admin-guide/mm/damon/start: fix wrong example commands
mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on
mm/damon: remove unnecessary variable initialization
Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM
mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM)
selftests/damon: support watermarks
mm/damon/dbgfs: support watermarks
mm/damon/schemes: activate schemes based on a watermarks mechanism
tools/selftests/damon: update for regions prioritization of schemes
mm/damon/dbgfs: support prioritization weights
mm/damon/vaddr,paddr: support pageout prioritization
mm/damon/schemes: prioritize regions within the quotas
mm/damon/selftests: support schemes quotas
mm/damon/dbgfs: support quotas of schemes
...
CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM, so there is no need for
CONFIG_MEMORY_HOTPLUG_SPARSE anymore; adjust all instances to use
CONFIG_MEMORY_HOTPLUG and remove CONFIG_MEMORY_HOTPLUG_SPARSE.
Link: https://lkml.kernel.org/r/20210929143600.49379-3-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org> [kselftest]
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename memblock_free_ptr() to memblock_free() and use memblock_free()
when freeing a virtual pointer so that memblock_free() will be a
counterpart of memblock_alloc()
The callers are updated with the below semantic patch and manual
addition of (void *) casting to pointers that are represented by
unsigned long variables.
@@
identifier vaddr;
expression size;
@@
(
- memblock_phys_free(__pa(vaddr), size);
+ memblock_free(vaddr, size);
|
- memblock_free_ptr(vaddr, size);
+ memblock_free(vaddr, size);
)
[sfr@canb.auug.org.au: fixup]
Link: https://lkml.kernel.org/r/20211018192940.3d1d532f@canb.auug.org.au
Link: https://lkml.kernel.org/r/20210930185031.18648-7-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since memblock_free() operates on a physical range, make its name
reflect it and rename it to memblock_phys_free(), so it will be a
logical counterpart to memblock_phys_alloc().
The callers are updated with the below semantic patch:
@@
expression addr;
expression size;
@@
- memblock_free(addr, size);
+ memblock_phys_free(addr, size);
Link: https://lkml.kernel.org/r/20210930185031.18648-6-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memblock_free_early_nid() is unused and memblock_free_early() is an
alias for memblock_free().
Replace calls to memblock_free_early() with calls to memblock_free() and
remove memblock_free_early() and memblock_free_early_nid().
Link: https://lkml.kernel.org/r/20210930185031.18648-4-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "memblock: cleanup memblock_free interface", v2.
This is the fix for memblock freeing APIs mismatch [1].
The first patch is a cleanup of numa_distance allocation in arch_numa
I've spotted during the conversion. The second patch is a fix for Xen
memory freeing on some of the error paths.
[1] https://lore.kernel.org/all/CAHk-=wj9k4LZTz+svCxLYs5Y1=+yKrbAUArH1+ghyG3OLd8VVg@mail.gmail.com
This patch (of 6):
Memory allocation of numa_distance uses memblock_phys_alloc_range()
without actual range limits, converts the returned physical address to
virtual and then only uses the virtual address for further
initialization.
Simplify this by replacing memblock_phys_alloc_range() with
memblock_alloc().
Link: https://lkml.kernel.org/r/20210930185031.18648-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20210930185031.18648-2-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Juergen Gross <jgross@suse.com>
Cc: Shahab Vahedi <Shahab.Vahedi@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Percpu embedded first chunk allocator is the firstly option, but it
could fails on ARM64, eg,
percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000
percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000
percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000
then we could get
WARNING: CPU: 15 PID: 461 at vmalloc.c:3087 pcpu_get_vm_areas+0x488/0x838
and the system could not boot successfully.
Let's implement page mapping percpu first chunk allocator as a fallback
to the embedding allocator to increase the robustness of the system.
Link: https://lkml.kernel.org/r/20210910053354.26721-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marco Elver <elver@google.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In the cpuidle-psci case, runtime PM in combination with the generic PM
domain (genpd), may be used when entering/exiting a shared idlestate. More
precisely, genpd relies on runtime PM to be enabled for the attached device
(in this case it belongs to a CPU), to properly manage the reference
counting of its PM domain.
This works fine most of the time, but during system suspend in
dpm_suspend_late(), the PM core disables runtime PM for all devices. Beyond
this point, calls to pm_runtime_get_sync() to runtime resume a device may
fail and therefore it could also mess up the reference counting in genpd.
To fix this problem, let's call wake_up_all_idle_cpus() in
dpm_suspend_late(), prior to disabling runtime PM. In this way a device
that belongs to a CPU, becomes runtime resumed through cpuidle-psci and
stays like that, because the runtime PM usage count has been bumped in
device_prepare().
Diagnosed-by: Maulik Shah <mkshah@codeaurora.org>
Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Here is the big set of driver core changes for 5.16-rc1.
All of these have been in linux-next for a while now with no reported
problems.
Included in here are:
- big update and cleanup of the sysfs abi documentation files
and scripts from Mauro. We are almost at the place where we
can properly check that the running kernel's sysfs abi is
documented fully.
- firmware loader updates
- dyndbg updates
- kernfs cleanups and fixes from Christoph
- device property updates
- component fix
- other minor driver core cleanups and fixes
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYYPbjQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ync9gCfXKMUI1GAnCfJWAwTdTcd18q5akoAoMw32/AH
0yh5TjAWFyFd7xz5d7qs
=itsC
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core changes for 5.16-rc1.
All of these have been in linux-next for a while now with no reported
problems.
Included in here are:
- big update and cleanup of the sysfs abi documentation files and
scripts from Mauro. We are almost at the place where we can
properly check that the running kernel's sysfs abi is documented
fully.
- firmware loader updates
- dyndbg updates
- kernfs cleanups and fixes from Christoph
- device property updates
- component fix
- other minor driver core cleanups and fixes"
* tag 'driver-core-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (122 commits)
device property: Drop redundant NULL checks
x86/build: Tuck away built-in firmware under FW_LOADER
vmlinux.lds.h: wrap built-in firmware support under FW_LOADER
firmware_loader: move struct builtin_fw to the only place used
x86/microcode: Use the firmware_loader built-in API
firmware_loader: remove old DECLARE_BUILTIN_FIRMWARE()
firmware_loader: formalize built-in firmware API
component: do not leave master devres group open after bind
dyndbg: refine verbosity 1-4 summary-detail
gpiolib: acpi: Replace custom code with device_match_acpi_handle()
i2c: acpi: Replace custom function with device_match_acpi_handle()
driver core: Provide device_match_acpi_handle() helper
dyndbg: fix spurious vNpr_info change
dyndbg: no vpr-info on empty queries
dyndbg: vpr-info on remove-module complete, not starting
device property: Add missed header in fwnode.h
Documentation: dyndbg: Improve cli param examples
dyndbg: Remove support for ddebug_query param
dyndbg: make dyndbg a known cli param
dyndbg: show module in vpr-info in dd-exec-queries
...
- Add support for inefficient operating performance points to the
Energy Model and modify cpufreq to use them properly (Vincent
Donnefort).
- Rearrange the DTPM framework code to simplify it and make it easier
to follow (Daniel Lezcano).
- Fix power intialization in DTPM (Daniel Lezcano).
- Add CPU load consideration when estimating the instaneous power
consumption in DTPM (Daniel Lezcano).
- Fix cpu->pstate.turbo_freq initialization in intel_pstate (Zhang
Rui).
- Make intel_pstate process HWP Guaranteed change notifications from
the processor (Srinivas Pandruvada).
- Fix typo in cpufreq.h (Rafael Wysocki).
- Fix tegra driver to handle BPMP errors properly (Mikko Perttunen).
- Fix the parameter usage of the newly added perf-domain API (Hector
Yuan).
- Minor cleanups to cppc, vexpress and s3c244x drivers (Han Wang,
Guenter Roeck, and Arnd Bergmann).
- Fix kobject memory leaks in cpuidle error paths (Anel Orazgaliyeva).
- Make intel_idle enable interrupts before entering C1 on some Xeon
processor models (Artem Bityutskiy).
- Clean up hib_wait_io() (Falla Coulibaly).
- Fix sparse warnings in hibernation-related code (Anders Roxell).
- Use vzalloc() and kzalloc() instead of their open-coded
equivalents in hibernation-related code (Cai Huoqing).
- Prevent user space from crashing the kernel by attempting to
restore the system state from a swap partition in use (Ye Bin).
- Do not let "syscore" devices runtime-suspend during system PM
transitions (Rafael Wysocki).
- Do not pause cpuidle in the suspend-to-idle path (Rafael Wysocki).
- Pause cpuidle later and resume it earlier during system PM
transitions (Rafael Wysocki).
- Make system suspend code use valid_state() consistently (Rafael
Wysocki).
- Add support for enabling wakeup IRQs after invoking the
->runtime_suspend() callback and make two drivers use it (Chunfeng
Yun).
- Make the association of ACPI device objects with PCI devices more
straightforward and simplify the code doing that for all devices
in general (Rafael Wysocki).
- Eliminate struct pci_platform_pm_ops and handle the both of its
users (PCI and Intel MID) directly in the PCI bus code (Rafael
Wysocki).
- Simplify and clarify ACPI PCI device PM helpers (Rafael Wysocki).
- Fix ordering of operations in pci_back_from_sleep() (Rafael
Wysocki).
- Make exynos-ppmu use hyphens in DT properties (Krzysztof
Kozlowski).
- Simplify parsing event-type from DT in exynos-ppmu (Krzysztof
Kozlowski).
- Strengthen check for freq_table in devfreq (Samuel Holland).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmGBkiQSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxDZwP/1kyhVdk+FSMMwEfhn8NriOkE8drJ3nB
Y/PWI93oPhgA7ITMwBv4ufLcxx6uYu5bctVDx3XG2lqvFC30t1fzTosJiwireRaS
WC5p7ZTqAECxd1LiZWPjL5EzHmtbyzrz3/AxSxOEN1K66qENBd6q5QldKZylxhY3
bawCQjabz6CLXvOjzdv8M8A7Fmd8LTcjHI5Y3/IOcDydPJqyN4/rDCoqft3/xcNB
kwN6de73aQwB3AQYufS/VAiNN4XOOkhPwF4QfULmAlnMdCsov6YzahtMB2+oG7O4
G3DF/OVFrONr3GPMMuMJSC6GXyFiBuW8FRva4W9HpY0MA8xVGLPUpwlpaFVaX1+c
vAYcRBTyJvOWgRap8+q+UKTlkj37pAgHp7kRiaO1wkVnKxJB1w40OSJZO1nnsExe
3qeCJHOJ9r+S/FsSPKCmws8vr0XQH5wPXY639Kmj9OI/t3gXGrfy3cXm9pa+gSh0
eMyHxtCp5ItT7V2FMpYh+wn+wfe5h//sK3tESZs+h6FKwJG1hYIbG4+F3ztIgzHp
t0rT3JXZIkY41KREGFhCMS9+wnLugOik21w9O0qVZfn/dJtDe73Kely7rY/EA3Mw
H4aBJDD19BvbIKqaTguxJXEc9zJI737fy/Ze4rrzTDkbXU8qVmjvFoEl2i/Ef4o2
b6aiDdz3V/CW
=L2Jo
-----END PGP SIGNATURE-----
Merge tag 'pm-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These make the power management of PCI devices with ACPI companions
more straightforwad, add support for inefficient operating performance
points to the Energy model and make cpufreq handle them as
appropriate, rearrange the handling of cpuidle during system PM
transitions, update a few cpufreq drivers and intel_idle, fix assorded
issues and clean up code in multiple places.
Specifics:
- Add support for inefficient operating performance points to the
Energy Model and modify cpufreq to use them properly (Vincent
Donnefort).
- Rearrange the DTPM framework code to simplify it and make it easier
to follow (Daniel Lezcano).
- Fix power intialization in DTPM (Daniel Lezcano).
- Add CPU load consideration when estimating the instaneous power
consumption in DTPM (Daniel Lezcano).
- Fix cpu->pstate.turbo_freq initialization in intel_pstate (Zhang
Rui).
- Make intel_pstate process HWP Guaranteed change notifications from
the processor (Srinivas Pandruvada).
- Fix typo in cpufreq.h (Rafael Wysocki).
- Fix tegra driver to handle BPMP errors properly (Mikko Perttunen).
- Fix the parameter usage of the newly added perf-domain API (Hector
Yuan).
- Minor cleanups to cppc, vexpress and s3c244x drivers (Han Wang,
Guenter Roeck, and Arnd Bergmann).
- Fix kobject memory leaks in cpuidle error paths (Anel
Orazgaliyeva).
- Make intel_idle enable interrupts before entering C1 on some Xeon
processor models (Artem Bityutskiy).
- Clean up hib_wait_io() (Falla Coulibaly).
- Fix sparse warnings in hibernation-related code (Anders Roxell).
- Use vzalloc() and kzalloc() instead of their open-coded equivalents
in hibernation-related code (Cai Huoqing).
- Prevent user space from crashing the kernel by attempting to
restore the system state from a swap partition in use (Ye Bin).
- Do not let "syscore" devices runtime-suspend during system PM
transitions (Rafael Wysocki).
- Do not pause cpuidle in the suspend-to-idle path (Rafael Wysocki).
- Pause cpuidle later and resume it earlier during system PM
transitions (Rafael Wysocki).
- Make system suspend code use valid_state() consistently (Rafael
Wysocki).
- Add support for enabling wakeup IRQs after invoking the
->runtime_suspend() callback and make two drivers use it (Chunfeng
Yun).
- Make the association of ACPI device objects with PCI devices more
straightforward and simplify the code doing that for all devices in
general (Rafael Wysocki).
- Eliminate struct pci_platform_pm_ops and handle the both of its
users (PCI and Intel MID) directly in the PCI bus code (Rafael
Wysocki).
- Simplify and clarify ACPI PCI device PM helpers (Rafael Wysocki).
- Fix ordering of operations in pci_back_from_sleep() (Rafael
Wysocki).
- Make exynos-ppmu use hyphens in DT properties (Krzysztof
Kozlowski).
- Simplify parsing event-type from DT in exynos-ppmu (Krzysztof
Kozlowski).
- Strengthen check for freq_table in devfreq (Samuel Holland)"
* tag 'pm-5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (49 commits)
cpufreq: Fix parameter in parse_perf_domain()
usb: mtu3: enable wake-up interrupt after runtime_suspend called
usb: xhci-mtk: enable wake-up interrupt after runtime_suspend called
PM / wakeirq: support enabling wake-up irq after runtime_suspend called
PM / devfreq: Strengthen check for freq_table
devfreq: exynos-ppmu: simplify parsing event-type from DT
devfreq: exynos-ppmu: use node names with hyphens
cpufreq: intel_pstate: Fix cpu->pstate.turbo_freq initialization
PM: suspend: Use valid_state() consistently
PM: sleep: Pause cpuidle later and resume it earlier during system transitions
PM: suspend: Do not pause cpuidle in the suspend-to-idle path
PM: sleep: Do not let "syscore" devices runtime-suspend during system transitions
PM: hibernate: Get block device exclusively in swsusp_check()
powercap/drivers/dtpm: Fix power limit initialization
powercap/drivers/dtpm: Scale the power with the load
powercap/drivers/dtpm: Use container_of instead of a private data field
powercap/drivers/dtpm: Simplify the dtpm table
powercap/drivers/dtpm: Encapsulate even more the code
PM: hibernate: swap: Use vzalloc() and kzalloc()
PM: hibernate: fix sparse warnings
...
Merge updates related to system sleep for 5.16-rc1:
- Clean up hib_wait_io() (Falla Coulibaly).
- Fix sparse warnings in hibernation-related code (Anders Roxell).
- Use vzalloc() and kzalloc() instead of their open-coded
equivalents in hibernation-related code (Cai Huoqing).
- Prevent user space from crashing the kernel by attempting to
restore the system state from a swap partition in use (Ye Bin).
- Do not let "syscore" devices runtime-suspend during system PM
transitions (Rafael Wysocki).
- Do not pause cpuidle in the suspend-to-idle path (Rafael Wysocki).
- Pause cpuidle later and resume it earlier during system PM
transitions (Rafael Wysocki).
- Make system suspend code use valid_state() consistently (Rafael
Wysocki).
- Add support for enabling wakeup IRQs after invoking the
->runtime_suspend() callback and make two drivers use it (Chunfeng
Yun).
* pm-sleep:
usb: mtu3: enable wake-up interrupt after runtime_suspend called
usb: xhci-mtk: enable wake-up interrupt after runtime_suspend called
PM / wakeirq: support enabling wake-up irq after runtime_suspend called
PM: suspend: Use valid_state() consistently
PM: sleep: Pause cpuidle later and resume it earlier during system transitions
PM: suspend: Do not pause cpuidle in the suspend-to-idle path
PM: sleep: Do not let "syscore" devices runtime-suspend during system transitions
PM: hibernate: Get block device exclusively in swsusp_check()
PM: hibernate: swap: Use vzalloc() and kzalloc()
PM: hibernate: fix sparse warnings
Revert "PM: sleep: Do not assume that "mem" is always present"
PM: hibernate: Remove blk_status_to_errno in hib_wait_io
PM: sleep: Do not assume that "mem" is always present
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space
and avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace
to work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction,
by exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec
offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP
buffer pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x
and Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we
can add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements
to CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGAzX4ACgkQMUZtbf5S
IrvW3g//Q0ZLrOuHK9pZ8sCXMMhDj8qL6ajm0otMddHWA/+1UglwVBKFhsajfxOf
wJ/5LZis+XKLpLqKTU5chKVfn39HuDGe/D3l+egi01Gv5BW0+XzEhagfyR5tJX5z
wsGG5CXO/we/laVSzRiFtwwVEKHKN20YC+tIQwYOYP5Wy3q4G7qDsFhT7GqgsGCS
n74QUEAIB5Tz0ODWFqLtbsySzIurXrskibwt5T9bvAAlPw/lCU68mmG+NVJ7VddO
lBbNkLMOo8yW9Ci20H09SrYd4jZTmMARo9tsFO1tAvAMk7qpn0Wd8pnOYTjFFoMD
+qjiFSVMh7E0JGb8Y7NCvwaB99suAK5rfGP68Xwe62DfP7vYWEx4pZGxBP19F4ld
6Kn1ME33BX9rUF9tBecf0bdKfJUwB2Q2Xou/b9laG04bwiqsc9iG5FQq1C46lnLZ
QdzNiS1My4dJMczkWt66HF3Kx30ibwHfvKMIHjf4PqkzEatkv6Y6SBZ57KXL+Lde
0BQSFhbf0tm2Gf55etzrczLElI3uqHSFWUNZZ2Bt6WmzO1e6tpV9nAtRWF4C/dFg
QDpLJtOOOY65uq+qz09zoPfv2lem868SrCAuFrVn99bEpYjx/CGNFDeEI02l6jyr
84eUxd364UcbIk3fc+eTGdXHLQNVk30G0AHVBBxaWNIidwfqXeE=
=srde
-----END PGP SIGNATURE-----
Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace to
work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction, by
exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP buffer
pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we can
add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements to
CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies"
* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
Revert "net: avoid double accounting for pure zerocopy skbs"
selftests: net: add arp_ndisc_evict_nocarrier
net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
net: arp: introduce arp_evict_nocarrier sysctl parameter
libbpf: Deprecate AF_XDP support
kbuild: Unify options for BTF generation for vmlinux and modules
selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
net: avoid double accounting for pure zerocopy skbs
tcp: rename sk_wmem_free_skb
netdevsim: fix uninit value in nsim_drv_configure_vfs()
selftests/bpf: Fix also no-alu32 strobemeta selftest
bpf: Add missing map_delete_elem method to bloom filter map
selftests/bpf: Add bloom map success test for userspace calls
bpf: Add alignment padding for "map_extra" + consolidate holes
bpf: Bloom filter map naming fixups
selftests/bpf: Add test cases for struct_ops prog
bpf: Add dummy BPF STRUCT_OPS for test purpose
...
This update has a single change which will use the maximum transfer and
message sizes advertised by SPI controllers to configure limits within
the regmap core, ensuring better interoperation.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmF/6F8ACgkQJNaLcl1U
h9Calgf+PaI6wr17nU0/m4N50Vjat0bL7iBuEQYPKFWu963M0QasPZEKteJV7sa+
8XVxcBh43E2unQkY+iMfldmBIEfZJwmsaWecdxSair6cWcm8MT6agOOkl6KjiCHM
AQiPI2+UbXDHnPociBmyA+MqI3hWSPZIMirVC+3WAFGLs8wHuRe0G97egkcPh88a
doQ5f2Ei7eCgP4uy+p93S4QWXbsbQVm/yszw0BH0gaqnBylwUfQtU1vdlLyBXW4y
9bB1MnzCdlLbKf4bQd7NcjB2bA8lgAuZ3t5rJoRjZtnf+//fRTB1cu/UkLRFlZme
vsrt60wo52r2v5tiLKA4wrgcvEFR+A==
=8MR4
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap update from Mark Brown:
"A single change to use the maximum transfer and message sizes
advertised by SPI controllers to configure limits within the
regmap core, ensuring better interoperation"
* tag 'regmap-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: spi: Set regmap max raw r/w from max_transfer_size
- Revert the printk format based wchan() symbol resolution as it can leak
the raw value in case that the symbol is not resolvable.
- Make wchan() more robust and work with all kind of unwinders by
enforcing that the task stays blocked while unwinding is in progress.
- Prevent sched_fork() from accessing an invalid sched_task_group
- Improve asymmetric packing logic
- Extend scheduler statistics to RT and DL scheduling classes and add
statistics for bandwith burst to the SCHED_FAIR class.
- Properly account SCHED_IDLE entities
- Prevent a potential deadlock when initial priority is assigned to a
newly created kthread. A recent change to plug a race between cpuset and
__sched_setscheduler() introduced a new lock dependency which is now
triggered. Break the lock dependency chain by moving the priority
assignment to the thread function.
- Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
- Improve idle balancing in general and especially for NOHZ enabled
systems.
- Provide proper interfaces for live patching so it does not have to
fiddle with scheduler internals.
- Add cluster aware scheduling support.
- A small set of tweaks for RT (irqwork, wait_task_inactive(), various
scheduler options and delaying mmdrop)
- The usual small tweaks and improvements all over the place
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmF/OUkTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoR/5D/9ikdGNpKg9osNqJ3GjAmxsK6kVkB29
iFe2k8pIpWDToWQf/wQRGih4Yj3Cl49QSnZcPIibh2/12EB1qrrW6iSPJkInz8Ec
/1LS5/Vewn2OyoxyXZjdvGC5gTXEodSbIazASvX7nvdMeI4gsAsL5etzrMJirT/t
aymqvr7zovvywrwMTQJrGjUMo9l4ewE8tafMNNhRu1BHU1U4ojM9yvThyRAAcmp7
3Xy49A+Yq3IgrvYI4u8FMK5Zh08KaxSFjiLhePGm/bF+wSfYmWop2TP1jY05W2Uo
ti8hfbJMUoFRYuMxAiEldkItnc0wV4M9PtWZZ/x+B71bs65Y4Zjt9cW+rxJv2+m1
vzV31EsQwGnOti072dzWN4c/cZqngVXAjaNtErvDwJUr+Tw1ayv9KUvuodMQqZY6
mu68bFUO2kV9EMe1CBOv51Uy1RGHyLj3rlNqrkw+Xp5ISE9Ad2vhUEiRp5bQx5Ci
V/XFhGZkGUluh0vccrdFlNYZwhj8cZEzkOPCnPSeZ+bq8SyZE6xuHH/lTP1CJCOy
s800rW1huM+kgV+zRN8adDkGXibAk9N3RtVGnQXmuEy8gB9LZmQg+JeM2wsc9B+6
i0gdqZnsjNAfoK+BBAG4holxptSL8/eOJsFH8ZNIoxQ+iqooyPx9tFX7yXnRTBQj
d2qWG7UvoseT+g==
=fgtS
-----END PGP SIGNATURE-----
Merge tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Thomas Gleixner:
- Revert the printk format based wchan() symbol resolution as it can
leak the raw value in case that the symbol is not resolvable.
- Make wchan() more robust and work with all kind of unwinders by
enforcing that the task stays blocked while unwinding is in progress.
- Prevent sched_fork() from accessing an invalid sched_task_group
- Improve asymmetric packing logic
- Extend scheduler statistics to RT and DL scheduling classes and add
statistics for bandwith burst to the SCHED_FAIR class.
- Properly account SCHED_IDLE entities
- Prevent a potential deadlock when initial priority is assigned to a
newly created kthread. A recent change to plug a race between cpuset
and __sched_setscheduler() introduced a new lock dependency which is
now triggered. Break the lock dependency chain by moving the priority
assignment to the thread function.
- Fix the idle time reporting in /proc/uptime for NOHZ enabled systems.
- Improve idle balancing in general and especially for NOHZ enabled
systems.
- Provide proper interfaces for live patching so it does not have to
fiddle with scheduler internals.
- Add cluster aware scheduling support.
- A small set of tweaks for RT (irqwork, wait_task_inactive(), various
scheduler options and delaying mmdrop)
- The usual small tweaks and improvements all over the place
* tag 'sched-core-2021-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (69 commits)
sched/fair: Cleanup newidle_balance
sched/fair: Remove sysctl_sched_migration_cost condition
sched/fair: Wait before decaying max_newidle_lb_cost
sched/fair: Skip update_blocked_averages if we are defering load balance
sched/fair: Account update_blocked_averages in newidle_balance cost
x86: Fix __get_wchan() for !STACKTRACE
sched,x86: Fix L2 cache mask
sched/core: Remove rq_relock()
sched: Improve wake_up_all_idle_cpus() take #2
irq_work: Also rcuwait for !IRQ_WORK_HARD_IRQ on PREEMPT_RT
irq_work: Handle some irq_work in a per-CPU thread on PREEMPT_RT
irq_work: Allow irq_work_sync() to sleep if irq_work() no IRQ support.
sched/rt: Annotate the RT balancing logic irqwork as IRQ_WORK_HARD_IRQ
sched: Add cluster scheduler level for x86
sched: Add cluster scheduler level in core and related Kconfig for ARM64
topology: Represent clusters of CPUs within a die
sched: Disable -Wunused-but-set-variable
sched: Add wrapper for get_wchan() to keep task blocked
x86: Fix get_wchan() to support the ORC unwinder
proc: Use task_is_running() for wchan in /proc/$pid/stat
...
This fixes a potential double free when handling an out of memory error
inserting a node into an rbtree regcache.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmF6lHQACgkQJNaLcl1U
h9DNugf/WXDIhnn1ToxWpe80tGZFgTcyXN1T6QfCiP5iKbOsYHFYrITDUhnbCrV+
4z5YngU8Jw2bz6nx72Yt3c4bwQRl6ZESgH6Rupr1nClQO06k8bLySLVWJPH1V3y4
ofzdSc7x6JVHaVLj09NR6N3rrOKbmMxS8Ny0RMr7Futu0cPog0/7f57zG/Mkq9Fw
vBsLvtpwt9cAOPRl7lxiU3NXKVdWYyIXxOCGW7z+e7u/HK7W5ByELrgXo1BFLXST
dUks5d5qKHmExB/7plxv8mxOQlNHVimTxXpt2sL9Z9FoQR3OCs5kfG+eaIDyx5UW
Y10gaxu61sRO6g9EqUPhfzOkiAYk5Q==
=hMOi
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"This fixes a potential double free when handling an out of memory
error inserting a node into an rbtree regcache"
* tag 'regmap-fix-v5.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Fix possible double-free in regcache_rbtree_exit()
When the dedicated wake IRQ is level trigger, and it uses the
device's low-power status as the wakeup source, that means if the
device is not in low-power state, the wake IRQ will be triggered
if enabled; For this case, need enable the wake IRQ after running
the device's ->runtime_suspend() which make it enter low-power state.
e.g.
Assume the wake IRQ is a low level trigger type, and the wakeup
signal comes from the low-power status of the device.
The wakeup signal is low level at running time (0), and becomes
high level when the device enters low-power state (runtime_suspend
(1) is called), a wakeup event at (2) make the device exit low-power
state, then the wakeup signal also becomes low level.
------------------
| ^ ^|
---------------- | | --------------
|<---(0)--->|<--(1)--| (3) (2) (4)
if enable the wake IRQ before running runtime_suspend during (0),
a wake IRQ will arise, it causes resume immediately;
it works if enable wake IRQ ( e.g. at (3) or (4)) after running
->runtime_suspend().
This patch introduces a new status WAKE_IRQ_DEDICATED_REVERSE to
optionally support enabling wake IRQ after running ->runtime_suspend().
Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
In cases when functions are called via fwnode operations,
we already know that this is software node we are dealing
with, hence no need to check if it's NULL, it can't be,
Reported-by: YE Chengfeng <cyeaa@connect.ust.hk>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211026162954.89811-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 8651f97bd9 ("PM / cpuidle: System resume hang fix with
cpuidle") that introduced cpuidle pausing during system suspend
did that to work around a platform firmware issue causing systems
to hang during resume if CPUs were allowed to enter idle states
in the system suspend and resume code paths.
However, pausing cpuidle before the last phase of suspending
devices is the source of an otherwise arbitrary difference between
the suspend-to-idle path and other system suspend variants, so it is
cleaner to do that later, before taking secondary CPUs offline (it
is still safer to take secondary CPUs offline with cpuidle paused,
though).
Modify the code accordingly, but in order to avoid code duplication,
introduce new wrapper functions, pm_sleep_disable_secondary_cpus()
and pm_sleep_enable_secondary_cpus(), to combine cpuidle_pause()
and cpuidle_resume(), respectively, with the handling of secondary
CPUs during system-wide transitions to sleep states.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
It is pointless to pause cpuidle in the suspend-to-idle path,
because it is going to be resumed in the same path later and
pausing it does not serve any particular purpose in that case.
Rework the code to avoid doing that.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
This converts users of mdiobus to mdiodev using the following semantic
patch:
@@
identifier mdiodev;
expression regnum;
@@
- mdiobus_read(mdiodev->bus, mdiodev->addr, regnum)
+ mdiodev_read(mdiodev, regnum)
@@
identifier mdiodev;
expression regnum, val;
@@
- mdiobus_write(mdiodev->bus, mdiodev->addr, regnum, val)
+ mdiodev_write(mdiodev, regnum, val)
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Set regmap raw read/write from spi max_transfer_size
so regmap_raw_read/write can split the access into chunks
Signed-off-by: Lucas Tanure <tanureal@opensource.cirrus.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
[André: fix build warning]
Signed-off-by: André Almeida <andrealmeid@collabora.com>
Link: https://lore.kernel.org/r/20211021132721.13669-1-andrealmeid@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
There is no reason to allow "syscore" devices to runtime-suspend
during system-wide PM transitions, because they are subject to the
same possible failure modes as any other devices in that respect.
Accordingly, change device_prepare() and device_complete() to call
pm_runtime_get_noresume() and pm_runtime_put(), respectively, for
"syscore" devices too.
Fixes: 057d51a126 ("Merge branch 'pm-sleep'")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: 3.10+ <stable@vger.kernel.org> # 3.10+
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Now that x86 doesn't abuse picking at internals to the firmware
loader move out the built-in firmware struct to its only user.
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20211021155843.1969401-5-mcgrof@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Formalize the built-in firmware with a proper API. This can later
be used by other callers where all they need is built-in firmware.
We export the firmware_request_builtin() call for now only
under the TEST_FIRMWARE symbol namespace as there are no
direct modular users for it. If they pop up they are free
to export it generally. Built-in code always gets access to
the callers and we'll demonstrate a hidden user which has been
lurking in the kernel for a while and the reason why using a
proper API was better long term.
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20211021155843.1969401-2-mcgrof@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In current code, the devres group for aggregate master is left open
after call to component_master_add_*(). This leads to problems when the
master does further managed allocations on its own. When any
participating driver calls component_del(), this leads to immediate
release of resources.
This came up when investigating a page fault occurring with i915 DRM
driver unbind with 5.15-rc1 kernel. The following sequence occurs:
i915_pci_remove()
-> intel_display_driver_unregister()
-> i915_audio_component_cleanup()
-> component_del()
-> component.c:take_down_master()
-> hdac_component_master_unbind() [via master->ops->unbind()]
-> devres_release_group(master->parent, NULL)
With older kernels this has not caused issues, but with audio driver
moving to use managed interfaces for more of its allocations, this no
longer works. Devres log shows following to occur:
component_master_add_with_match()
[ 126.886032] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000323ccdc5 devm_component_match_release (24 bytes)
[ 126.886045] snd_hda_intel 0000:00:1f.3: DEVRES ADD 00000000865cdb29 grp< (0 bytes)
[ 126.886049] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 grp< (0 bytes)
audio driver completes its PCI probe()
[ 126.892238] snd_hda_intel 0000:00:1f.3: DEVRES ADD 000000001b480725 pcim_iomap_release (48 bytes)
component_del() called() at DRM/i915 unbind()
[ 137.579422] i915 0000:00:02.0: DEVRES REL 00000000ef44c293 grp< (0 bytes)
[ 137.579445] snd_hda_intel 0000:00:1f.3: DEVRES REL 00000000865cdb29 grp< (0 bytes)
[ 137.579458] snd_hda_intel 0000:00:1f.3: DEVRES REL 000000001b480725 pcim_iomap_release (48 bytes)
So the "devres_release_group(master->parent, NULL)" ends up freeing the
pcim_iomap allocation. Upon next runtime resume, the audio driver will
cause a page fault as the iomap alloc was released without the driver
knowing about it.
Fix this issue by using the "struct master" pointer as identifier for
the devres group, and by closing the devres group after
the master->ops->bind() call is done. This allows devres allocations
done by the driver acting as master to be isolated from the binding state
of the aggregate driver. This modifies the logic originally introduced in
commit 9e1ccb4a77 ("drivers/base: fix devres handling for master device")
Fixes: 9e1ccb4a77 ("drivers/base: fix devres handling for master device")
Cc: stable@vger.kernel.org
Acked-by: Imre Deak <imre.deak@intel.com>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Kai Vehmanen <kai.vehmanen@linux.intel.com>
BugLink: https://gitlab.freedesktop.org/drm/intel/-/issues/4136
Link: https://lore.kernel.org/r/20211013161345.3755341-1-kai.vehmanen@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have a couple of users of this helper, make it available for them.
The prototype for the helper is specifically crafted in order to be
easily used with bus_find_device() call. That's why its location is
in the driver core rather than ACPI.
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20211014134756.39092-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here are some small driver core fixes for 5.15-rc6, all of which have
been in linux-next for a while with no reported issues.
They include:
- kernfs negative dentry bugfix
- simple pm bus fixes to resolve reported issues
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYWvzFg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymfLQCfSCP698AAvoCgG0fOfLakFkw80h0AoKIVm3lk
t0GUdnplU18CjnO5M1Zj
=+dh9
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some small driver core fixes for 5.15-rc6, all of which have
been in linux-next for a while with no reported issues.
They include:
- kernfs negative dentry bugfix
- simple pm bus fixes to resolve reported issues"
* tag 'driver-core-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
drivers: bus: Delete CONFIG_SIMPLE_PM_BUS
drivers: bus: simple-pm-bus: Add support for probing simple bus only devices
driver core: Reject pointless SYNC_STATE_ONLY device links
kernfs: don't create a negative dentry if inactive node exists
Both ACPI and DT provide the ability to describe additional layers of
topology between that of individual cores and higher level constructs
such as the level at which the last level cache is shared.
In ACPI this can be represented in PPTT as a Processor Hierarchy
Node Structure [1] that is the parent of the CPU cores and in turn
has a parent Processor Hierarchy Nodes Structure representing
a higher level of topology.
For example Kunpeng 920 has 6 or 8 clusters in each NUMA node, and each
cluster has 4 cpus. All clusters share L3 cache data, but each cluster
has local L3 tag. On the other hand, each clusters will share some
internal system bus.
+-----------------------------------+ +---------+
| +------+ +------+ +--------------------------+ |
| | CPU0 | | cpu1 | | +-----------+ | |
| +------+ +------+ | | | | |
| +----+ L3 | | |
| +------+ +------+ cluster | | tag | | |
| | CPU2 | | CPU3 | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | |
+-----------------------------------+ | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| | | L3 | | |
| +------+ +------+ +----+ tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | L3 |
| data |
+-----------------------------------+ | |
| +------+ +------+ | +-----------+ | |
| | | | | | | | | |
| +------+ +------+ +----+ L3 | | |
| | | tag | | |
| +------+ +------+ | | | | |
| | | | | | +-----------+ | |
| +------+ +------+ +--------------------------+ |
+-----------------------------------| | |
+-----------------------------------| | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| +----+ L3 | | |
| +------+ +------+ | | tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | |
+-----------------------------------+ | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| | | L3 | | |
| +------+ +------+ +---+ tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | | |
+-----------------------------------+ | |
+-----------------------------------+ | |
| +------+ +------+ +--------------------------+ |
| | | | | | +-----------+ | |
| +------+ +------+ | | | | |
| | | L3 | | |
| +------+ +------+ +--+ tag | | |
| | | | | | | | | |
| +------+ +------+ | +-----------+ | |
| | +---------+
+-----------------------------------+
That means spreading tasks among clusters will bring more bandwidth
while packing tasks within one cluster will lead to smaller cache
synchronization latency. So both kernel and userspace will have
a chance to leverage this topology to deploy tasks accordingly to
achieve either smaller cache latency within one cluster or an even
distribution of load among clusters for higher throughput.
This patch exposes cluster topology to both kernel and userspace.
Libraried like hwloc will know cluster by cluster_cpus and related
sysfs attributes. PoC of HWLOC support at [2].
Note this patch only handle the ACPI case.
Special consideration is needed for SMT processors, where it is
necessary to move 2 levels up the hierarchy from the leaf nodes
(thus skipping the processor core level).
Note that arm64 / ACPI does not provide any means of identifying
a die level in the topology but that may be unrelate to the cluster
level.
[1] ACPI Specification 6.3 - section 5.2.29.1 processor hierarchy node
structure (Type 0)
[2] https://github.com/hisilicon/hwloc/tree/linux-cluster
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210924085104.44806-2-21cnbao@gmail.com
tools/testing/selftests/net/ioam6.sh
7b1700e009 ("selftests: net: modify IOAM tests for undef bits")
bf77b1400a ("selftests: net: Test for the IOAM encapsulation with IPv6")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In regcache_rbtree_insert_to_block(), when 'present' realloc failed,
the 'blk' which is supposed to assign to 'rbnode->block' will be freed,
so 'rbnode->block' points a freed memory, in the error handling path of
regcache_rbtree_init(), 'rbnode->block' will be freed again in
regcache_rbtree_exit(), KASAN will report double-free as follows:
BUG: KASAN: double-free or invalid-free in kfree+0xce/0x390
Call Trace:
slab_free_freelist_hook+0x10d/0x240
kfree+0xce/0x390
regcache_rbtree_exit+0x15d/0x1a0
regcache_rbtree_init+0x224/0x2c0
regcache_init+0x88d/0x1310
__regmap_init+0x3151/0x4a80
__devm_regmap_init+0x7d/0x100
madera_spi_probe+0x10f/0x333 [madera_spi]
spi_probe+0x183/0x210
really_probe+0x285/0xc30
To fix this, moving up the assignment of rbnode->block to immediately after
the reallocation has succeeded so that the data structure stays valid even
if the second reallocation fails.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 3f4ff561bc ("regmap: rbtree: Make cache_present bitmap per node")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20211012023735.1632786-1-yangyingliang@huawei.com
Signed-off-by: Mark Brown <broonie@kernel.org>
This KUnit fixes update for Linux 5.15-rc6 consists of:
- Fixes to address the structleak plugin causing the stack frame size
to grow immensely when used with KUnit. Fixes include adding a new
makefile to disable structleak and using it from KUnit iio, device
property, thunderbolt, and bitfield tests to disable it.
- KUnit framework reference count leak in kfree_at_end
- KUnit tool fix to resolve conflict between --json and --raw_output
and generate correct test output in either case.
- kernel-doc warnings due to mismatched arg names
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmFkqr8ACgkQCwJExA0N
QxyziBAAn+02Iq8Kd1p9/+ajr78iQKmjIpSHH51R6wPwOBJg0RgusyuheJ/pccl1
OOt7sAIdoBWlbZjkwi9SOoFabdvtAO6i7SWpw1R0AdWT5iG4ewe17ODIJiGQ6l9E
4fNayU+XFunXSqxxEsiSJYO5ztoyDDNOqWJ1a43r+9j6ApsHPH3aKwtx4hFutaJA
TAUvSOJOMv93SWTHdQh1ihHMVBkhvdnnpSvsRzJFzkEmI5rmL+p7HcbuM0Zk/gyD
9obygEx8QsJ4pg1AJs8qAj1YbV599qbVhBYYr3EewWscyWME4RrIiXPGImcYKN3/
99S5pOVTI+wp12r5c8WOdXta+Qe+1RvT9wvTPS+msGQakQhIq0eNePtO1AwIoIor
S+36JbJdW1rOepg1WssL55F6+re//GYdluSUGwmwYCL3eUEIMpfRqeamg7BKQH3j
gfukFJvLzzdLiyeYRg2f/G3Vxwa18shlIJgz+6ADCmemura3waeelTG0zFWiNsd5
XYXXhrxV8eq+Cj5ee/iBddKIgn2+QjFtDmZtsLtiZts3aV7zlugYm2bAYCN1Yxsu
ZDvymnZUozoOCRFvBe60Eo4DSTDTnmAaRRnuX5L/x1ybpAUvwZQxzyNDIuvfZKQ9
JB8LrxAJDz7G4jj/9OY5IeacoXNKqDVZYK+Kb9L7hAS08ITk6vU=
=08Ra
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-kunit-fixes-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kunit fixes from Shuah Khan:
- Fixes to address the structleak plugin causing the stack frame size
to grow immensely when used with KUnit. Fixes include adding a new
makefile to disable structleak and using it from KUnit iio, device
property, thunderbolt, and bitfield tests to disable it.
- KUnit framework reference count leak in kfree_at_end
- KUnit tool fix to resolve conflict between --json and --raw_output
and generate correct test output in either case.
- kernel-doc warnings due to mismatched arg names
* tag 'linux-kselftest-kunit-fixes-5.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: fix kernel-doc warnings due to mismatched arg names
bitfield: build kunit tests without structleak plugin
thunderbolt: build kunit tests without structleak plugin
device property: build kunit tests without structleak plugin
iio/test-format: build kunit tests without structleak plugin
gcc-plugins/structleak: add makefile var for disabling structleak
kunit: fix reference count leak in kfree_at_end
kunit: tool: better handling of quasi-bool args (--json, --raw_output)
Move the mac address helpers out, eth.c already contains
a bunch of similar helpers.
Suggested-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The structleak plugin causes the stack frame size to grow immensely when
used with KUnit:
../drivers/base/test/property-entry-test.c:492:1: warning: the frame size of 2832 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/base/test/property-entry-test.c:322:1: warning: the frame size of 2080 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/base/test/property-entry-test.c:250:1: warning: the frame size of 4976 bytes is larger than 2048 bytes [-Wframe-larger-than=]
../drivers/base/test/property-entry-test.c:115:1: warning: the frame size of 3280 bytes is larger than 2048 bytes [-Wframe-larger-than=]
Turn it off in this file.
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
SYNC_STATE_ONLY device links intentionally allow cycles because cyclic
sync_state() dependencies are valid and necessary.
However a SYNC_STATE_ONLY device link where the consumer and the supplier
are the same device is pointless because the device link would be deleted
as soon as the device probes (because it's also the consumer) and won't
affect when the sync_state() callback is called. It's a waste of CPU cycles
and memory to create this device link. So reject any attempts to create
such a device link.
Fixes: 05ef983e0d ("driver core: Add device link support for SYNC_STATE_ONLY flag")
Cc: stable <stable@vger.kernel.org>
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210929190549.860541-1-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Right now firmware_request_builtin() is used internally only
and so we have control over the callers. But if we want to expose
that API more broadly we should ensure the firmware pointer
is valid.
This doesn't fix any known issue, it just prepares us to later
expose this API to other users.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210917182226.3532898-4-mcgrof@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are two ways the firmware_loader can use the built-in
firmware: with or without the pre-allocated buffer. We already
have one explicit use case for each of these, and so split them
up so that it is clear what the intention is on the caller side.
This also paves the way so that eventually other callers outside
of the firmware loader can uses these if and when needed.
While at it, adopt the firmware prefix for the routine names.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210917182226.3532898-3-mcgrof@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The firmware_loader can be used with a pre-allocated buffer
through the use of the API calls:
o request_firmware_into_buf()
o request_partial_firmware_into_buf()
If the firmware was built-in and present, our current check
for if the built-in firmware fits into the pre-allocated buffer
does not return any errors, and we proceed to tell the caller
that everything worked fine. It's a lie and no firmware would
end up being copied into the pre-allocated buffer. So if the
caller trust the result it may end up writing a bunch of 0's
to a device!
Fix this by making the function that checks for the pre-allocated
buffer return non-void. Since the typical use case is when no
pre-allocated buffer is provided make this return successfully
for that case. If the built-in firmware does *not* fit into the
pre-allocated buffer size return a failure as we should have
been doing before.
I'm not aware of users of the built-in firmware using the API
calls with a pre-allocated buffer, as such I doubt this fixes
any real life issue. But you never know... perhaps some oddball
private tree might use it.
In so far as upstream is concerned this just fixes our code for
correctness.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20210917182226.3532898-2-mcgrof@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
component.c hasn't use any macro or function declared in linux/kref.h.
Thus, these files can be removed from component.c safely without
affecting the compilation of the drivers/base/ module
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Link: https://lore.kernel.org/r/20210928193849.28717-1-liumh1@shanghaitech.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
arch_topology.c hasn't use any macro or function declared in linux/percpu.h,
linux/smp.h and linux/string.h.
Thus, these files can be removed from arch_topology.c safely without
affecting the compilation of the drivers/base/ module
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Link: https://lore.kernel.org/r/20210928193138.24192-1-liumh1@shanghaitech.edu.cn
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Don't use (-1) constant for setting initial device node. Instead, use
the generic NUMA_NO_NODE definition to indicate that "no node id
specified".
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20211004133453.18881-1-mgurtovoy@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
I got memory leak as follows:
unreferenced object 0xffff88801f0b2200 (size 64):
comm "i2c-lis2hh12-21", pid 5455, jiffies 4294944606 (age 15.224s)
hex dump (first 32 bytes):
72 65 67 75 6c 61 74 6f 72 3a 72 65 67 75 6c 61 regulator:regula
74 6f 72 2e 30 2d 2d 69 32 63 3a 31 2d 30 30 31 tor.0--i2c:1-001
backtrace:
[<00000000bf5b0c3b>] __kmalloc_track_caller+0x19f/0x3a0
[<0000000050da42d9>] kvasprintf+0xb5/0x150
[<000000004bbbed13>] kvasprintf_const+0x60/0x190
[<00000000cdac7480>] kobject_set_name_vargs+0x56/0x150
[<00000000bf83f8e8>] dev_set_name+0xc0/0x100
[<00000000cc1cf7e3>] device_link_add+0x6b4/0x17c0
[<000000009db9faed>] _regulator_get+0x297/0x680
[<00000000845e7f2b>] _devm_regulator_get+0x5b/0xe0
[<000000003958ee25>] st_sensors_power_enable+0x71/0x1b0 [st_sensors]
[<000000005f450f52>] st_accel_i2c_probe+0xd9/0x150 [st_accel_i2c]
[<00000000b5f2ab33>] i2c_device_probe+0x4d8/0xbe0
[<0000000070fb977b>] really_probe+0x299/0xc30
[<0000000088e226ce>] __driver_probe_device+0x357/0x500
[<00000000c21dda32>] driver_probe_device+0x4e/0x140
[<000000004e650441>] __device_attach_driver+0x257/0x340
[<00000000cf1891b8>] bus_for_each_drv+0x166/0x1e0
When device_register() returns an error, the name allocated in dev_set_name()
will be leaked, the put_device() should be used instead of kfree() to give up
the device reference, then the name will be freed in kobject_cleanup() and the
references of consumer and supplier will be decreased in device_link_release_fn().
Fixes: 287905e68d ("driver core: Expose device link details in sysfs")
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210930085714.2057460-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here are some driver core and kernfs fixes for reported issues for
5.15-rc4. These fixes include:
- kernfs positive dentry bugfix
- debugfs_create_file_size error path fix
- cpumask sysfs file bugfix to preserve the user/kernel abi (has
been reported multiple times.)
- devlink fixes for mdiobus devices as reported by the subsystem
maintainers.
Also included in here are some devlink debugging changes to make it
easier for people to report problems when asked. They have already
helped with the mdiobus and other subsystems reporting issues.
All of these have been linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYVmC1A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykfCQCgtQ0kdPoPdUG6mPCn45rrbJQLBY4AnRL5JyhO
zB60l6C2EHHnLnRxMnQq
=gygB
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are some driver core and kernfs fixes for reported issues for
5.15-rc4. These fixes include:
- kernfs positive dentry bugfix
- debugfs_create_file_size error path fix
- cpumask sysfs file bugfix to preserve the user/kernel abi (has been
reported multiple times.)
- devlink fixes for mdiobus devices as reported by the subsystem
maintainers.
Also included in here are some devlink debugging changes to make it
easier for people to report problems when asked. They have already
helped with the mdiobus and other subsystems reporting issues.
All of these have been linux-next for a while with no reported issues"
* tag 'driver-core-5.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
kernfs: also call kernfs_set_rev() for positive dentry
driver core: Add debug logs when fwnode links are added/deleted
driver core: Create __fwnode_link_del() helper function
driver core: Set deferred probe reason when deferred by driver core
net: mdiobus: Set FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD for mdiobus parents
driver core: fw_devlink: Add support for FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD
driver core: fw_devlink: Improve handling of cyclic dependencies
cpumask: Omit terminating null byte in cpumap_print_{list,bitmask}_to_buf
debugfs: debugfs_create_file_size(): use IS_ERR to check for error
The same code is repeated in multiple locations. Create a helper
function for it.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915172808.620546-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the driver core defers the probe of a device, set the deferred
probe reason so that it's easier to debug. The deferred probe reason is
available in debugfs under devices_deferred.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915172808.620546-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a parent device is also a supplier to a child device, fw_devlink=on by
design delays the probe() of the child device until the probe() of the
parent finishes successfully.
However, some drivers of such parent devices (where parent is also a
supplier) expect the child device to finish probing successfully as soon as
they are added using device_add() and before the probe() of the parent
device has completed successfully. One example of such a case is discussed
in the link mentioned below.
Add a flag to make fw_devlink=on not enforce these supplier-consumer
relationships, so these drivers can continue working.
Link: https://lore.kernel.org/netdev/CAGETcx_uj0V4DChME-gy5HGKTYnxLBX=TH2rag29f_p=UcG+Tg@mail.gmail.com/
Fixes: ea718c6990 ("Revert "Revert "driver core: Set fw_devlink=on by default""")
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-3-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is some debate about whether it's deemed acceptable to call
dev_err_probe() if you know that the error code can never be
-EPROBE_DEFER. Clarify in the function comments that this is
OK. Specifically this makes us able to transform code like this:
ret = do_something_that_cant_defer();
if (ret < 0) {
dev_err(dev, "The foo failed to bar (%pe)\n", ERR_PTR(ret));
return ret;
}
to code like this:
ret = do_something_that_cant_defer();
if (ret < 0)
return dev_err_probe(dev, ret, "The foo failed to bar\n");
It is also possible that in the future folks might want a CONFIG
option to strip out all probe error strings to save space (keeping
non-probe errors) with the argument that probe errors rarely happen
after bringup. Having probe errors reported with a consistent function
would allow that.
Cc: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20210916161931.1.I32bea713bd6c6fb419a24da73686145742b6c117@changeid
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we have a dependency of the form:
Device-A -> Device-C
Device-B
Device-C -> Device-B
Where,
* Indentation denotes "child of" parent in previous line.
* X -> Y denotes X is consumer of Y based on firmware (Eg: DT).
We have cyclic dependency: device-A -> device-C -> device-B -> device-A
fw_devlink current treats device-C -> device-B dependency as an invalid
dependency and doesn't enforce it but leaves the rest of the
dependencies as is.
While the current behavior is necessary, it is not sufficient if the
false dependency in this example is actually device-A -> device-C. When
this is the case, device-C will correctly probe defer waiting for
device-B to be added, but device-A will be incorrectly probe deferred by
fw_devlink waiting on device-C to probe successfully. Due to this, none
of the devices in the cycle will end up probing.
To fix this, we need to go relax all the dependencies in the cycle like
we already do in the other instances where fw_devlink detects cycles.
A real world example of this was reported[1] and analyzed[2].
[1] - https://lore.kernel.org/lkml/0a2c4106-7f48-2bb5-048e-8c001a7c3fda@samsung.com/
[2] - https://lore.kernel.org/lkml/CAGETcx8peaew90SWiux=TyvuGgvTQOmO4BFALz7aj0Za5QdNFQ@mail.gmail.com/
Fixes: f9aa460672 ("driver core: Refactor fw_devlink feature")
Cc: stable <stable@vger.kernel.org>
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Link: https://lore.kernel.org/r/20210915170940.617415-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYUR8DAAKCRCAXGG7T9hj
vjclAP0ekisZSabOsmzM0iz8rpuYj113/kQozokFeFD5eQlEiAD/b5LytLosic/1
5XKa5fbOyimAyRBwblWhpLLhrW6uiAg=
=WoAB
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- The first hunk of a Xen swiotlb fixup series fixing multiple minor
issues and doing some small cleanups
- Some further Xen related fixes avoiding WARN() splats when running as
Xen guests or dom0
- A Kconfig fix allowing the pvcalls frontend to be built as a module
* tag 'for-linus-5.15b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
swiotlb-xen: drop DEFAULT_NSLABS
swiotlb-xen: arrange to have buffer info logged
swiotlb-xen: drop leftover __ref
swiotlb-xen: limit init retries
swiotlb-xen: suppress certain init retries
swiotlb-xen: maintain slab count properly
swiotlb-xen: fix late init retry
swiotlb-xen: avoid double free
xen/pvcalls: backend can be a module
xen: fix usage of pmd_populate in mremap for pv guests
xen: reset legacy rtc flag for PV domU
PM: base: power: don't try to use non-existing RTC for storing data
xen/balloon: use a kernel thread instead a workqueue
The boot-time allocation interface for memblock is a mess, with
'memblock_alloc()' returning a virtual pointer, but then you are
supposed to free it with 'memblock_free()' that takes a _physical_
address.
Not only is that all kinds of strange and illogical, but it actually
causes bugs, when people then use it like a normal allocation function,
and it fails spectacularly on a NULL pointer:
https://lore.kernel.org/all/20210912140820.GD25450@xsang-OptiPlex-9020/
or just random memory corruption if the debug checks don't catch it:
https://lore.kernel.org/all/61ab2d0c-3313-aaab-514c-e15b7aa054a0@suse.cz/
I really don't want to apply patches that treat the symptoms, when the
fundamental cause is this horribly confusing interface.
I started out looking at just automating a sane replacement sequence,
but because of this mix or virtual and physical addresses, and because
people have used the "__pa()" macro that can take either a regular
kernel pointer, or just the raw "unsigned long" address, it's all quite
messy.
So this just introduces a new saner interface for freeing a virtual
address that was allocated using 'memblock_alloc()', and that was kept
as a regular kernel pointer. And then it converts a couple of users
that are obvious and easy to test, including the 'xbc_nodes' case in
lib/bootconfig.c that caused problems.
Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 40caa127f3 ("init: bootconfig: Remove all bootconfig data when the init memory is removed")
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the helper macro SET_RUNTIME_PM_OPS() instead of the verbose
operators ".runtime_suspend/.runtime_resume", because the
SET_RUNTIME_PM_OPS() is a nice helper macro that could be brought
in to make code a little clearer, a little more concise.
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210828090219.1177-1-caihuoqing@baidu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If there is no legacy RTC device, don't try to use it for storing trace
data across suspend/resume.
Cc: <stable@vger.kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20210903084937.19392-2-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
- Add new cpufreq driver for the MediaTek MT6779 platform called
mediatek-hw along with corresponding DT bindings (Hector.Yuan).
- Add DCVS interrupt support to the qcom-cpufreq-hw driver (Thara
Gopinath).
- Make the qcom-cpufreq-hw driver set the dvfs_possible_from_any_cpu
policy flag (Taniya Das).
- Blocklist more Qualcomm platforms in cpufreq-dt-platdev (Bjorn
Andersson).
- Make the vexpress cpufreq driver set the CPUFREQ_IS_COOLING_DEV
flag (Viresh Kumar).
- Add new cpufreq driver callback to allow drivers to register
with the Energy Model in a consistent way and make several
drivers use it (Viresh Kumar).
- Change the remaining users of the .ready() cpufreq driver callback
to move the code from it elsewhere and drop it from the cpufreq
core (Viresh Kumar).
- Revert recent intel_pstate change adding HWP guaranteed performance
change notification support to it that led to problems, because
the notification in question is triggered prematurely on some
systems (Rafael Wysocki).
- Convert the OPP DT bindings to DT schema and clean them up while
at it (Rob Herring).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmE41VgSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxCN4P+gMjMrrZmuU6gZsbbvpDlaBhCd2Xq3TD
xR/DMDi7znkh3TUX3uwL+xnr+k0krIH0jBIeUQUE7NeNIoT6wgbjJ4Ty5rFq76qB
AODmmZ4vO7lmnupSyqUQbHfYohDmyICSKiStf8UOEj1o+jSWNmrgUYUv0tDtDUH+
Cn0vByah8gJAnoZX8Y8BM1jmRc3YoNHWpvtTQIhIBPkVZ//+NOKvDZvwUUPZFb+M
1PzMSfX7WsIDiUrUHpdvtZsoBniaMk0WS1EqVBRvEprqUXad1eHF19yuhtLxeUPH
8xh/7o8kYzjqVJvs7blTT8DztxRDScWHeGKSVdwoEJupbCwc5R3qfGaD6PWyhI1x
9R5Swsp64nLptTCwH7ZmgdJbC9IqN3cz1Nadd5v2Q2wr21KvZnj7zI2ijkPKGnZo
kqYQHghqnDkGPFVjdls/RKUXGCQIXZFQb+FeuyCvpVlz9Ol8+DnTYtgFjjc6VcU1
kApIqsE8V8GQEyzmm/OIAf6xtsA+mEUUh3Qds16KNCwBCRglC8I6v5IWnBc8PEJz
a+wtwjx+tyKSSlAMEvcDNtrWVN+3JNrwCFG+Q+QjMrwLiAgACrtJZ6e6PmLj9sZv
FPbZM8rzbzN7Zqd7XVNY37KHjqNs7zzDScAnATTUCKThp8ijDgtmG3ZhExmOPayc
74aQLham4TBO
=WSpr
-----END PGP SIGNATURE-----
Merge tag 'pm-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are mostly ARM cpufreq driver updates, including one new
MediaTek driver that has just passed all of the reviews, with the
addition of a revert of a recent intel_pstate commit, some core
cpufreq changes and a DT-related update of the operating performance
points (OPP) support code.
Specifics:
- Add new cpufreq driver for the MediaTek MT6779 platform called
mediatek-hw along with corresponding DT bindings (Hector.Yuan).
- Add DCVS interrupt support to the qcom-cpufreq-hw driver (Thara
Gopinath).
- Make the qcom-cpufreq-hw driver set the dvfs_possible_from_any_cpu
policy flag (Taniya Das).
- Blocklist more Qualcomm platforms in cpufreq-dt-platdev (Bjorn
Andersson).
- Make the vexpress cpufreq driver set the CPUFREQ_IS_COOLING_DEV
flag (Viresh Kumar).
- Add new cpufreq driver callback to allow drivers to register with
the Energy Model in a consistent way and make several drivers use
it (Viresh Kumar).
- Change the remaining users of the .ready() cpufreq driver callback
to move the code from it elsewhere and drop it from the cpufreq
core (Viresh Kumar).
- Revert recent intel_pstate change adding HWP guaranteed performance
change notification support to it that led to problems, because the
notification in question is triggered prematurely on some systems
(Rafael Wysocki).
- Convert the OPP DT bindings to DT schema and clean them up while at
it (Rob Herring)"
* tag 'pm-5.15-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (23 commits)
Revert "cpufreq: intel_pstate: Process HWP Guaranteed change notification"
cpufreq: mediatek-hw: Add support for CPUFREQ HW
cpufreq: Add of_perf_domain_get_sharing_cpumask
dt-bindings: cpufreq: add bindings for MediaTek cpufreq HW
cpufreq: Remove ready() callback
cpufreq: sh: Remove sh_cpufreq_cpu_ready()
cpufreq: acpi: Remove acpi_cpufreq_cpu_ready()
cpufreq: qcom-hw: Set dvfs_possible_from_any_cpu cpufreq driver flag
cpufreq: blocklist more Qualcomm platforms in cpufreq-dt-platdev
cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support
cpufreq: scmi: Use .register_em() to register with energy model
cpufreq: vexpress: Use .register_em() to register with energy model
cpufreq: scpi: Use .register_em() to register with energy model
dt-bindings: opp: Convert to DT schema
dt-bindings: Clean-up OPP binding node names in examples
ARM: dts: omap: Drop references to opp.txt
cpufreq: qcom-cpufreq-hw: Use .register_em() to register with energy model
cpufreq: omap: Use .register_em() to register with energy model
cpufreq: mediatek: Use .register_em() to register with energy model
cpufreq: imx6q: Use .register_em() to register with energy model
...
Merge more updates from Andrew Morton:
"147 patches, based on 7d2a07b769.
Subsystems affected by this patch series: mm (memory-hotplug, rmap,
ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
selftests, ipc, and scripts"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
scripts: check_extable: fix typo in user error message
mm/workingset: correct kernel-doc notations
ipc: replace costly bailout check in sysvipc_find_ipc()
selftests/memfd: remove unused variable
Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
configs: remove the obsolete CONFIG_INPUT_POLLDEV
prctl: allow to setup brk for et_dyn executables
pid: cleanup the stale comment mentioning pidmap_init().
kernel/fork.c: unexport get_{mm,task}_exe_file
coredump: fix memleak in dump_vma_snapshot()
fs/coredump.c: log if a core dump is aborted due to changed file permissions
nilfs2: use refcount_dec_and_lock() to fix potential UAF
nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
nilfs2: fix NULL pointer in nilfs_##name##_attr_release
nilfs2: fix memory leak in nilfs_sysfs_create_device_group
trap: cleanup trap_init()
init: move usermodehelper_enable() to populate_rootfs()
...
Currently, the "auto-movable" online policy does not allow for hotplugged
KERNEL (ZONE_NORMAL) memory to increase the amount of MOVABLE memory we
can have, primarily, because there is no coordiantion across memory
devices and we don't want to create zone-imbalances accidentially when
unplugging memory.
However, within a single memory device it's different. Let's allow for
KERNEL memory within a dynamic memory group to allow for more MOVABLE
within the same memory group. The only thing we have to take care of is
that the managing driver avoids zone imbalances by unplugging MOVABLE
memory first, otherwise there can be corner cases where unplug of memory
could result in (accidential) zone imbalances.
virtio-mem is the only user of dynamic memory groups and recently added
support for prioritizing unplug of ZONE_MOVABLE over ZONE_NORMAL, so we
don't need a new toggle to enable it for dynamic memory groups.
We limit this handling to dynamic memory groups, because:
* We want to keep the runtime overhead for collecting stats when
onlining a single memory block small. We tend to have only a handful of
dynamic memory groups, but we can have quite some static memory groups
(e.g., 256 DIMMs).
* It doesn't make too much sense for static memory groups, as we try
onlining all applicable memory blocks either completely to ZONE_MOVABLE
or not. In ordinary operation, we won't have a mixture of zones within
a static memory group.
When adding memory to a dynamic memory group, we'll first online memory to
ZONE_MOVABLE as long as early KERNEL memory allows for it. Then, we'll
online the next unit(s) to ZONE_NORMAL, until we can online the next
unit(s) to ZONE_MOVABLE.
For a simple virtio-mem device with a MOVABLE:KERNEL ratio of 3:1, it will
result in a layout like:
[M][M][M][M][M][M][M][M][N][M][M][M][N][M][M][M]...
^ movable memory due to early kernel memory
^ allows for more movable memory ...
^-----^ ... here
^ allows for more movable memory ...
^-----^ ... here
While the created layout is sub-optimal when it comes to contiguous zones,
it gives us the maximum flexibility when dynamically growing/shrinking a
device; we can grow small VMs really big in small steps, and still shrink
reliably to e.g., 1/4 of the maximum VM size in this example, removing
full memory blocks along with meta data more reliably.
Mark dynamic memory groups in the xarray such that we can efficiently
iterate over them when collecting stats. In usual setups, we have one
virtio-mem device per NUMA node, and usually only a small number of NUMA
nodes.
Note: for now, there seems to be no compelling reason to make this
behavior configurable.
Link: https://lkml.kernel.org/r/20210806124715.17090-10-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hui Zhu <teawater@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use memory groups to improve our "auto-movable" onlining policy:
1. For static memory groups (e.g., a DIMM), online a memory block MOVABLE
only if all other memory blocks in the group are either MOVABLE or could
be onlined MOVABLE. A DIMM will either be MOVABLE or not, not a mixture.
2. For dynamic memory groups (e.g., a virtio-mem device), online a
memory block MOVABLE only if all other memory blocks inside the
current unit are either MOVABLE or could be onlined MOVABLE. For a
virtio-mem device with a device block size with 512 MiB, all 128 MiB
memory blocks wihin a 512 MiB unit will either be MOVABLE or not, not
a mixture.
We have to pass the memory group to zone_for_pfn_range() to take the
memory group into account.
Note: for now, there seems to be no compelling reason to make this
behavior configurable.
Link: https://lkml.kernel.org/r/20210806124715.17090-9-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hui Zhu <teawater@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Let's track all present pages in each memory group. Especially, track
memory present in ZONE_MOVABLE and memory present in one of the kernel
zones (which really only is ZONE_NORMAL right now as memory groups only
apply to hotplugged memory) separately within a memory group, to prepare
for making smart auto-online decision for individual memory blocks within
a memory group based on group statistics.
Link: https://lkml.kernel.org/r/20210806124715.17090-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hui Zhu <teawater@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In our "auto-movable" memory onlining policy, we want to make decisions
across memory blocks of a single memory device. Examples of memory
devices include ACPI memory devices (in the simplest case a single DIMM)
and virtio-mem. For now, we don't have a connection between a single
memory block device and the real memory device. Each memory device
consists of 1..X memory block devices.
Let's logically group memory blocks belonging to the same memory device in
"memory groups". Memory groups can span multiple physical ranges and a
memory group itself does not contain any information regarding physical
ranges, only properties (e.g., "max_pages") necessary for improved memory
onlining.
Introduce two memory group types:
1) Static memory group: E.g., a single ACPI memory device, consisting
of 1..X memory resources. A memory group consists of 1..Y memory
blocks. The whole group is added/removed in one go. If any part
cannot get offlined, the whole group cannot be removed.
2) Dynamic memory group: E.g., a single virtio-mem device. Memory is
dynamically added/removed in a fixed granularity, called a "unit",
consisting of 1..X memory blocks. A unit is added/removed in one go.
If any part of a unit cannot get offlined, the whole unit cannot be
removed.
In case of 1) we usually want either all memory managed by ZONE_MOVABLE or
none. In case of 2) we usually want to have as many units as possible
managed by ZONE_MOVABLE. We want a single unit to be of the same type.
For now, memory groups are an internal concept that is not exposed to user
space; we might want to change that in the future, though.
add_memory() users can specify a mgid instead of a nid when passing the
MHP_NID_IS_MGID flag.
Link: https://lkml.kernel.org/r/20210806124715.17090-4-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hui Zhu <teawater@gmail.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm/memory_hotplug: "auto-movable" online policy and memory groups", v3.
I. Goal
The goal of this series is improving in-kernel auto-online support. It
tackles the fundamental problems that:
1) We can create zone imbalances when onlining all memory blindly to
ZONE_MOVABLE, in the worst case crashing the system. We have to know
upfront how much memory we are going to hotplug such that we can
safely enable auto-onlining of all hotplugged memory to ZONE_MOVABLE
via "online_movable". This is far from practical and only applicable in
limited setups -- like inside VMs under the RHV/oVirt hypervisor which
will never hotplug more than 3 times the boot memory (and the
limitation is only in place due to the Linux limitation).
2) We see more setups that implement dynamic VM resizing, hot(un)plugging
memory to resize VM memory. In these setups, we might hotplug a lot of
memory, but it might happen in various small steps in both directions
(e.g., 2 GiB -> 8 GiB -> 4 GiB -> 16 GiB ...). virtio-mem is the
primary driver of this upstream right now, performing such dynamic
resizing NUMA-aware via multiple virtio-mem devices.
Onlining all hotplugged memory to ZONE_NORMAL means we basically have
no hotunplug guarantees. Onlining all to ZONE_MOVABLE means we can
easily run into zone imbalances when growing a VM. We want a mixture,
and we want as much memory as reasonable/configured in ZONE_MOVABLE.
Details regarding zone imbalances can be found at [1].
3) Memory devices consist of 1..X memory block devices, however, the
kernel doesn't really track the relationship. Consequently, also user
space has no idea. We want to make per-device decisions.
As one example, for memory hotunplug it doesn't make sense to use a
mixture of zones within a single DIMM: we want all MOVABLE if
possible, otherwise all !MOVABLE, because any !MOVABLE part will easily
block the whole DIMM from getting hotunplugged.
As another example, virtio-mem operates on individual units that span
1..X memory blocks. Similar to a DIMM, we want a unit to either be all
MOVABLE or !MOVABLE. A "unit" can be thought of like a DIMM, however,
all units of a virtio-mem device logically belong together and are
managed (added/removed) by a single driver. We want as much memory of
a virtio-mem device to be MOVABLE as possible.
4) We want memory onlining to be done right from the kernel while adding
memory, not triggered by user space via udev rules; for example, this
is reqired for fast memory hotplug for drivers that add individual
memory blocks, like virito-mem. We want a way to configure a policy in
the kernel and avoid implementing advanced policies in user space.
The auto-onlining support we have in the kernel is not sufficient. All we
have is a) online everything MOVABLE (online_movable) b) online everything
!MOVABLE (online_kernel) c) keep zones contiguous (online). This series
allows configuring c) to mean instead "online movable if possible
according to the coniguration, driven by a maximum MOVABLE:KERNEL ratio"
-- a new onlining policy.
II. Approach
This series does 3 things:
1) Introduces the "auto-movable" online policy that initially operates on
individual memory blocks only. It uses a maximum MOVABLE:KERNEL ratio
to make a decision whether a memory block will be onlined to
ZONE_MOVABLE or not. However, in the basic form, hotplugged KERNEL
memory does not allow for more MOVABLE memory (details in the
patches). CMA memory is treated like MOVABLE memory.
2) Introduces static (e.g., DIMM) and dynamic (e.g., virtio-mem) memory
groups and uses group information to make decisions in the
"auto-movable" online policy across memory blocks of a single memory
device (modeled as memory group). More details can be found in patch
#3 or in the DIMM example below.
3) Maximizes ZONE_MOVABLE memory within dynamic memory groups, by
allowing ZONE_NORMAL memory within a dynamic memory group to allow for
more ZONE_MOVABLE memory within the same memory group. The target use
case is dynamic VM resizing using virtio-mem. See the virtio-mem
example below.
I remember that the basic idea of using a ratio to implement a policy in
the kernel was once mentioned by Vitaly Kuznetsov, but I might be wrong (I
lost the pointer to that discussion).
For me, the main use case is using it along with virtio-mem (and DIMMs /
ppc64 dlpar where necessary) for dynamic resizing of VMs, increasing the
amount of memory we can hotunplug reliably again if we might eventually
hotplug a lot of memory to a VM.
III. Target Usage
The target usage will be:
1) Linux boots with "mhp_default_online_type=offline"
2) User space (e.g., systemd unit) configures memory onlining (according
to a config file and system properties), for example:
* Setting memory_hotplug.online_policy=auto-movable
* Setting memory_hotplug.auto_movable_ratio=301
* Setting memory_hotplug.auto_movable_numa_aware=true
3) User space enabled auto onlining via "echo online >
/sys/devices/system/memory/auto_online_blocks"
4) User space triggers manual onlining of all already-offline memory
blocks (go over offline memory blocks and set them to "online")
IV. Example
For DIMMs, hotplugging 4 GiB DIMMs to a 4 GiB VM with a configured ratio of
301% results in the following layout:
Memory block 0-15: DMA32 (early)
Memory block 32-47: Normal (early)
Memory block 48-79: Movable (DIMM 0)
Memory block 80-111: Movable (DIMM 1)
Memory block 112-143: Movable (DIMM 2)
Memory block 144-275: Normal (DIMM 3)
Memory block 176-207: Normal (DIMM 4)
... all Normal
(-> hotplugged Normal memory does not allow for more Movable memory)
For virtio-mem, using a simple, single virtio-mem device with a 4 GiB VM
will result in the following layout:
Memory block 0-15: DMA32 (early)
Memory block 32-47: Normal (early)
Memory block 48-143: Movable (virtio-mem, first 12 GiB)
Memory block 144: Normal (virtio-mem, next 128 MiB)
Memory block 145-147: Movable (virtio-mem, next 384 MiB)
Memory block 148: Normal (virtio-mem, next 128 MiB)
Memory block 149-151: Movable (virtio-mem, next 384 MiB)
... Normal/Movable mixture as above
(-> hotplugged Normal memory allows for more Movable memory within
the same device)
Which gives us maximum flexibility when dynamically growing/shrinking a
VM in smaller steps.
V. Doc Update
I'll update the memory-hotplug.rst documentation, once the overhaul [1] is
usptream. Until then, details can be found in patch #2.
VI. Future Work
1) Use memory groups for ppc64 dlpar
2) Being able to specify a portion of (early) kernel memory that will be
excluded from the ratio. Like "128 MiB globally/per node" are excluded.
This might be helpful when starting VMs with extremely small memory
footprint (e.g., 128 MiB) and hotplugging memory later -- not wanting
the first hotplugged units getting onlined to ZONE_MOVABLE. One
alternative would be a trigger to not consider ZONE_DMA memory
in the ratio. We'll have to see if this is really rrequired.
3) Indicate to user space that MOVABLE might be a bad idea -- especially
relevant when memory ballooning without support for balloon compaction
is active.
This patch (of 9):
For implementing a new memory onlining policy, which determines when to
online memory blocks to ZONE_MOVABLE semi-automatically, we need the
number of present early (boot) pages -- present pages excluding hotplugged
pages. Let's track these pages per zone.
Pass a page instead of the zone to adjust_present_page_count(), similar as
adjust_managed_page_count() and derive the zone from the page.
It's worth noting that a memory block to be offlined/onlined is either
completely "early" or "not early". add_memory() and friends can only add
complete memory blocks and we only online/offline complete (individual)
memory blocks.
Link: https://lkml.kernel.org/r/20210806124715.17090-1-david@redhat.com
Link: https://lkml.kernel.org/r/20210806124715.17090-2-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Marek Kedzierski <mkedzier@redhat.com>
Cc: Hui Zhu <teawater@gmail.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm: remove pfn_valid_within() and CONFIG_HOLES_IN_ZONE".
After recent updates to freeing unused parts of the memory map, no
architecture can have holes in the memory map within a pageblock. This
makes pfn_valid_within() check and CONFIG_HOLES_IN_ZONE configuration
option redundant.
The first patch removes them both in a mechanical way and the second patch
simplifies memory_hotplug::test_pages_in_a_zone() that had
pfn_valid_within() surrounded by more logic than simple if.
This patch (of 2):
After recent changes in freeing of the unused parts of the memory map and
rework of pfn_valid() in arm and arm64 there are no architectures that can
have holes in the memory map within a pageblock and so nothing can enable
CONFIG_HOLES_IN_ZONE which guards non trivial implementation of
pfn_valid_within().
With that, pfn_valid_within() is always hardwired to 1 and can be
completely removed.
Remove calls to pfn_valid_within() and CONFIG_HOLES_IN_ZONE.
Link: https://lkml.kernel.org/r/20210713080035.7464-1-rppt@kernel.org
Link: https://lkml.kernel.org/r/20210713080035.7464-2-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* pm-cpufreq:
Revert "cpufreq: intel_pstate: Process HWP Guaranteed change notification"
cpufreq: mediatek-hw: Add support for CPUFREQ HW
cpufreq: Add of_perf_domain_get_sharing_cpumask
dt-bindings: cpufreq: add bindings for MediaTek cpufreq HW
cpufreq: Remove ready() callback
cpufreq: sh: Remove sh_cpufreq_cpu_ready()
cpufreq: acpi: Remove acpi_cpufreq_cpu_ready()
cpufreq: qcom-hw: Set dvfs_possible_from_any_cpu cpufreq driver flag
cpufreq: blocklist more Qualcomm platforms in cpufreq-dt-platdev
cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support
cpufreq: scmi: Use .register_em() to register with energy model
cpufreq: vexpress: Use .register_em() to register with energy model
cpufreq: scpi: Use .register_em() to register with energy model
cpufreq: qcom-cpufreq-hw: Use .register_em() to register with energy model
cpufreq: omap: Use .register_em() to register with energy model
cpufreq: mediatek: Use .register_em() to register with energy model
cpufreq: imx6q: Use .register_em() to register with energy model
cpufreq: dt: Use .register_em() to register with energy model
cpufreq: Add callback to register with energy model
cpufreq: vexpress: Set CPUFREQ_IS_COOLING_DEV flag
There are variables(power.may_skip_resume and dev->power.must_resume)
and DPM_FLAG_MAY_SKIP_RESUME flags to control the resume of devices after
a system wide suspend transition.
Setting the DPM_FLAG_MAY_SKIP_RESUME flag means that the driver allows
its "noirq" and "early" resume callbacks to be skipped if the device
can be left in suspend after a system-wide transition into the working
state. PM core determines that the driver's "noirq" and "early" resume
callbacks should be skipped or not with dev_pm_skip_resume() function by
checking power.may_skip_resume variable.
power.must_resume variable is getting set to false in __device_suspend()
function without checking device's DPM_FLAG_MAY_SKIP_RESUME settings.
In problematic scenario, where all the devices in the suspend_late
stage are successful and some device can fail to suspend in
suspend_noirq phase. So some devices successfully suspended in suspend_late
stage are not getting chance to execute __device_suspend_noirq()
to set dev->power.must_resume variable to true and not getting
resumed in early_resume phase.
Add a check for device's DPM_FLAG_MAY_SKIP_RESUME flag before
setting power.must_resume variable in __device_suspend function.
Fixes: 6e176bf8d4 ("PM: sleep: core: Do not skip callbacks in the resume phase")
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This function has the 'irq' parameter which isn't ever used, so drop it.
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull swiotlb updates from Konrad Rzeszutek Wilk:
"A new feature called restricted DMA pools. It allows SWIOTLB to
utilize per-device (or per-platform) allocated memory pools instead of
using the global one.
The first big user of this is ARM Confidential Computing where the
memory for DMA operations can be set per platform"
* 'stable/for-linus-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: (23 commits)
swiotlb: use depends on for DMA_RESTRICTED_POOL
of: restricted dma: Don't fail device probe on rmem init failure
of: Move of_dma_set_restricted_buffer() into device.c
powerpc/svm: Don't issue ultracalls if !mem_encrypt_active()
s390/pv: fix the forcing of the swiotlb
swiotlb: Free tbl memory in swiotlb_exit()
swiotlb: Emit diagnostic in swiotlb_exit()
swiotlb: Convert io_default_tlb_mem to static allocation
of: Return success from of_dma_set_restricted_buffer() when !OF_ADDRESS
swiotlb: add overflow checks to swiotlb_bounce
swiotlb: fix implicit debugfs declarations
of: Add plumbing for restricted DMA pool
dt-bindings: of: Add restricted DMA pool
swiotlb: Add restricted DMA pool initialization
swiotlb: Add restricted DMA alloc/free support
swiotlb: Refactor swiotlb_tbl_unmap_single
swiotlb: Move alloc_size to swiotlb_find_slots
swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing
swiotlb: Update is_swiotlb_active to add a struct device argument
swiotlb: Update is_swiotlb_buffer to add a struct device argument
...
Merge misc updates from Andrew Morton:
"173 patches.
Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
oom-kill, migration, ksm, percpu, vmstat, and madvise)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
mm/madvise: add MADV_WILLNEED to process_madvise()
mm/vmstat: remove unneeded return value
mm/vmstat: simplify the array size calculation
mm/vmstat: correct some wrong comments
mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
selftests: vm: add COW time test for KSM pages
selftests: vm: add KSM merging time test
mm: KSM: fix data type
selftests: vm: add KSM merging across nodes test
selftests: vm: add KSM zero page merging test
selftests: vm: add KSM unmerge test
selftests: vm: add KSM merge test
mm/migrate: correct kernel-doc notation
mm: wire up syscall process_mrelease
mm: introduce process_mrelease system call
memblock: make memblock_find_in_range method private
mm/mempolicy.c: use in_task() in mempolicy_slab_node()
mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
mm/mempolicy: advertise new MPOL_PREFERRED_MANY
mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
...
There are a lot of uses of memblock_find_in_range() along with
memblock_reserve() from the times memblock allocation APIs did not exist.
memblock_find_in_range() is the very core of memblock allocations, so any
future changes to its internal behaviour would mandate updates of all the
users outside memblock.
Replace the calls to memblock_find_in_range() with an equivalent calls to
memblock_phys_alloc() and memblock_phys_alloc_range() and make
memblock_find_in_range() private method of memblock.
This simplifies the callers, ensures that (unlikely) errors in
memblock_reserve() are handled and improves maintainability of
memblock_find_in_range().
Link: https://lkml.kernel.org/r/20210816122622.30279-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Acked-by: Kirill A. Shutemov <kirill.shtuemov@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> [ACPI]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nick Kossifidis <mick@ics.forth.gr> [riscv]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
mem_section to section_nr could be costly since it iterates all section
roots to check if the given mem_section is in its range.
On the other hand, __nr_to_section() which converts section_nr to
mem_section can be done in O(1).
Let's pass section_nr instead of mem_section ptr to find_memory_block() in
order to reduce needless iterations.
Link: https://lkml.kernel.org/r/20210707150212.855-3-ohoono.kwon@samsung.com
Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This series consists of the usual driver updates (ufs, qla2xxx,
target, smartpqi, lpfc, mpt3sas). The core change causing the most
churn was replacing the command request field request with a macro,
allowing us to offset map to it and remove the redundant field; the
same was also done for the tag field. The most impactful change is
the final removal of scsi_ioctl, which has been deprecated for over a
decade.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYTD/TiYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdUkAQCjb3Ux
4K9438mMelHlzM4er1S1IJ0WNnvObaVMNO9LBwD+JUz+rHsrKvuEX9j3g3C3u6JH
hC3BUEW8f2LLnujWanQ=
=lC5o
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This series consists of the usual driver updates (ufs, qla2xxx,
target, smartpqi, lpfc, mpt3sas).
The core change causing the most churn was replacing the command
request field request with a macro, allowing us to offset map to it
and remove the redundant field; the same was also done for the tag
field.
The most impactful change is the final removal of scsi_ioctl, which
has been deprecated for over a decade"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (293 commits)
scsi: ufs: Fix ufshcd_request_sense_async() for Samsung KLUFG8RHDA-B2D1
scsi: ufs: ufs-exynos: Fix static checker warning
scsi: mpt3sas: Use the proper SCSI midlayer interfaces for PI
scsi: lpfc: Use the proper SCSI midlayer interfaces for PI
scsi: lpfc: Copyright updates for 14.0.0.1 patches
scsi: lpfc: Update lpfc version to 14.0.0.1
scsi: lpfc: Add bsg support for retrieving adapter cmf data
scsi: lpfc: Add cmf_info sysfs entry
scsi: lpfc: Add debugfs support for cm framework buffers
scsi: lpfc: Add support for maintaining the cm statistics buffer
scsi: lpfc: Add rx monitoring statistics
scsi: lpfc: Add support for the CM framework
scsi: lpfc: Add cmfsync WQE support
scsi: lpfc: Add support for cm enablement buffer
scsi: lpfc: Add cm statistics buffer support
scsi: lpfc: Add EDC ELS support
scsi: lpfc: Expand FPIN and RDF receive logging
scsi: lpfc: Add MIB feature enablement support
scsi: lpfc: Add SET_HOST_DATA mbox cmd to pass date/time info to firmware
scsi: fc: Add EDC ELS definition
...
some updates to the basic clk types to use determine_rate for the
divider type and add a power of two fractional divider flag though.
Otherwise, this is a collection of clk driver updates. More than half
the diffstat is in the Qualcomm clk driver where we add a bunch of data
to describe clks on various SoCs and fix bugs. The other big new thing
in here is the Mediatek MT8192 clk driver. That's been under review for
a while and it's nice to see that it's finally upstream.
Beyond that it's the usual set of minor fixes and tweaks to clk drivers.
There are some non-clk driver bits in here which have all been acked by
the respective maintainers.
New Drivers:
- Support video, gpu, display clks on qcom sc7280 SoCs
- GCC clks on qcom MSM8953, SM4250/6115, and SM6350 SoCs
- Multimedia clks (MMCC) on qcom MSM8994/MSM8992
- RPMh clks on qcom SM6350 SoCs
- Support for Mediatek MT8192 SoCs
- Add display (DU and DSI) clocks on Renesas R-Car V3U
- Add I2C, DMAC, USB, sound (SSIF-2), GPIO, CANFD, and ADC clocks and
resets on Renesas RZ/G2L
Updates:
- Support the SD/OE pin on IDT VersaClock 5 and 6 clock generators
- Add power of two flag to fractional divider clk type
- Migrate some clk drivers to clk_divider_ops.determine_rate
- Migrate to clk_parent_data in gcc-sdm660
- Fix CLKOUT clocks on i.MX8MM and i.MX8MN by using imx_clk_hw_mux2
- Switch from .round_rate to .determine_rate in clk-divider-gate
- Fix clock tree update for TF-A controlled clocks for all i.MX8M
- Add missing M7 core clock for i.MX8MN
- YAML conversion of rk3399 clock controller binding
- Removal of GRF dependency for the rk3328/rk3036 pll types
- Drop CLK_IS_CRITICAL flag from Tegra fuse clk
- Make CLK_R9A06G032 Kconfig symbol invisible
- Convert various DT bindings to YAML
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmExEooRHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSXXBhAAvhHm4fcm3fRjNdfImd+jDEl8XSvg+w43
adSnmVxbYM6ZVNOiJ4CJWHbj0hOY/PJnsQYWbV0xXvXW+zXva6p495MMHHOGSi2o
lMgZVMvj5UAwu304ZC9Xfn31dwo8XdGrltp4JqIcI2NEBMh1/PlZW22esT+jDiWN
3SWFD3M7lu88xTREyiEu11FY3z/KiGzbGlqYcbivx1X0sHVnBRbl4qcqZway+BmQ
95Ma4YWwhvDGYc+ypKH2EPxs/LikHXj05nMooigy65DOQ5wrM4L1eWkwmVUf6h+e
t4x7sAVysLnkihzdH5r2pw6CcAIom76v8w0+maSfk+jINUu1LeGVuat1eXSesFTu
49o+uTKRghkUe/Qh6r+7lbo8AZXQq+wUsLTYRuaWT/mSb+svAtJaUWAru8tJnMlH
oK6OehcQwz4nGhH0HnBK1jCVdtgckxPBw8F/GYN9rYhsccIe0XmFjX1rzMM3s8De
PLl6QO7Xzd+xb/FwAU8+S1WpKFdPU6ILTUnI2Ma3Mn/gfjZEZHvWAdTjo4oZGEsw
+N4n924ArptbeSLRrlNUtqx4BVDL5yo54xS5gefNpmD5yezO7aoUtN0aGcBq+01p
Qw0N5hKtcdsNYLBEFSvBGcZZmErMZbPwMXHWiUwNymXBDzJKgj5d+ks+1vJ3iCNW
R5r9hvATJPQ=
=Rrqg
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk updates from Stephen Boyd:
"Nothing changed in the clk framework core this time around. We did get
some updates to the basic clk types to use determine_rate for the
divider type and add a power of two fractional divider flag though.
Otherwise, this is a collection of clk driver updates. More than half
the diffstat is in the Qualcomm clk driver where we add a bunch of
data to describe clks on various SoCs and fix bugs. The other big new
thing in here is the Mediatek MT8192 clk driver. That's been under
review for a while and it's nice to see that it's finally upstream.
Beyond that it's the usual set of minor fixes and tweaks to clk
drivers. There are some non-clk driver bits in here which have all
been acked by the respective maintainers.
New Drivers:
- Support video, gpu, display clks on qcom sc7280 SoCs
- GCC clks on qcom MSM8953, SM4250/6115, and SM6350 SoCs
- Multimedia clks (MMCC) on qcom MSM8994/MSM8992
- RPMh clks on qcom SM6350 SoCs
- Support for Mediatek MT8192 SoCs
- Add display (DU and DSI) clocks on Renesas R-Car V3U
- Add I2C, DMAC, USB, sound (SSIF-2), GPIO, CANFD, and ADC clocks and
resets on Renesas RZ/G2L
Updates:
- Support the SD/OE pin on IDT VersaClock 5 and 6 clock generators
- Add power of two flag to fractional divider clk type
- Migrate some clk drivers to clk_divider_ops.determine_rate
- Migrate to clk_parent_data in gcc-sdm660
- Fix CLKOUT clocks on i.MX8MM and i.MX8MN by using imx_clk_hw_mux2
- Switch from .round_rate to .determine_rate in clk-divider-gate
- Fix clock tree update for TF-A controlled clocks for all i.MX8M
- Add missing M7 core clock for i.MX8MN
- YAML conversion of rk3399 clock controller binding
- Removal of GRF dependency for the rk3328/rk3036 pll types
- Drop CLK_IS_CRITICAL flag from Tegra fuse clk
- Make CLK_R9A06G032 Kconfig symbol invisible
- Convert various DT bindings to YAML"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (128 commits)
dt-bindings: clock: samsung: fix header path in example
clk: tegra: fix old-style declaration
clk: qcom: Add SM6350 GCC driver
MAINTAINERS: clock: include S3C and S5P in Samsung SoC clock entry
dt-bindings: clock: samsung: convert S5Pv210 AudSS to dtschema
dt-bindings: clock: samsung: convert Exynos AudSS to dtschema
dt-bindings: clock: samsung: convert Exynos4 to dtschema
dt-bindings: clock: samsung: convert Exynos3250 to dtschema
dt-bindings: clock: samsung: convert Exynos542x to dtschema
dt-bindings: clock: samsung: add bindings for Exynos external clock
dt-bindings: clock: samsung: convert Exynos5250 to dtschema
clk: vc5: Add properties for configuring SD/OE behavior
clk: vc5: Use dev_err_probe
dt-bindings: clk: vc5: Add properties for configuring the SD/OE pin
dt-bindings: clock: brcm,iproc-clocks: fix armpll properties
clk: zynqmp: Fix kernel-doc format
clk: at91: clk-generated: Limit the requested rate to our range
clk: ralink: avoid to set 'CLK_IS_CRITICAL' flag for gates
clk: zynqmp: Fix a memory leak
clk: zynqmp: Check the return type
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmEt+hwACgkQUqAMR0iA
lPLppBAAiyrUNVmqqtdww+IJajEs1uD/4FqPsysHRwroHBFymJeQG1XCwUpDZ7jj
6gXT0chxyjQE18gT/W9nf+PSmA9XvIVA1WSR+WCECTNW3YoZXqtgwiHfgnitXYku
HlmoZLthYeuoXWw2wn+hVLfTRh6VcPHYEaC21jXrs6B1pOXHbvjJ5eTLHlX9oCfL
UKSK+jFTHAJcn/GskRzviBe0Hpe8fqnkRol2XX13ltxqtQ73MjaGNu7imEH6/Pa7
/MHXWtuWJtOvuYz17aztQP4Qwh1xy+kakMy3aHucdlxRBTP4PTzzTuQI3L/RYi6l
+ttD7OHdRwqFAauBLY3bq3uJjYb5v/64ofd8DNnT2CJvtznY8wrPbTdFoSdPcL2Q
69/opRWHcUwbU/Gt4WLtyQf3Mk0vepgMbbVg1B5SSy55atRZaXMrA2QJ/JeawZTB
KK6D/mE7ccze/YFzsySunCUVKCm0veoNxEAcakCCZKXSbsvd1MYcIRC0e+2cv6e5
2NEH7gL4dD+5tqu5nzvIuKDn3NrDQpbi28iUBoFbkxRgcVyvHJ9AGSa62wtb5h3D
OgkqQMdVKBbjYNeUodPlQPzmXZDasytavyd0/BC/KENOcBvU/8gW++2UZTfsh/1A
dLjgwFBdyJncQcCS9Abn20/EKntbIMEX8NLa97XWkA3fuzMKtak=
=yEVq
-----END PGP SIGNATURE-----
Merge tag 'printk-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Optionally, provide an index of possible printk messages via
<debugfs>/printk/index/. It can be used when monitoring important
kernel messages on a farm of various hosts. The monitor has to be
updated when some messages has changed or are not longer available by
a newly deployed kernel.
- Add printk.console_no_auto_verbose boot parameter. It allows to
generate crash dump even with slow consoles in a reasonable time
frame.
- Remove printk_safe buffers. The messages are always stored directly
to the main logbuffer, even in NMI or recursive context. Also it
allows to serialize syslog operations by a mutex instead of a spin
lock.
- Misc clean up and build fixes.
* tag 'printk-for-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk/index: Fix -Wunused-function warning
lib/nmi_backtrace: Serialize even messages about idle CPUs
printk: Add printk.console_no_auto_verbose boot parameter
printk: Remove console_silent()
lib/test_scanf: Handle n_bits == 0 in random tests
printk: syslog: close window between wait and read
printk: convert @syslog_lock to mutex
printk: remove NMI tracking
printk: remove safe buffers
printk: track/limit recursion
lib/nmi_backtrace: explicitly serialize banner and regs
printk: Move the printk() kerneldoc comment to its new home
printk/index: Fix warning about missing prototypes
MIPS/asm/printk: Fix build failure caused by printk
printk: index: Add indexing support to dev_printk
printk: Userspace format indexing support
printk: Rework parse_prefix into printk_parse_prefix
printk: Straighten out log_flags into printk_info_flags
string_helpers: Escape double quotes in escape_special
printk/console: Check consistent sequence number when handling race in console_unlock()
Here is the big set of driver core patches for 5.15-rc1.
These do change a number of different things across different
subsystems, and because of that, there were 2 stable tags created that
might have already come into your tree from different pulls that did the
following
- changed the bus remove callback to return void
- sysfs iomem_get_mapping rework
The latter one will cause a tiny merge issue with your tree, as there
was a last-minute fix for this in 5.14 in your tree, but the fixup
should be "obvious". If you want me to provide a fixed merge for this,
please let me know.
Other than those two things, there's only a few small things in here:
- kernfs performance improvements for huge numbers of sysfs
users at once
- tiny api cleanups
- other minor changes
All of these have been in linux-next for a while with no reported
problems, other than the before-mentioned merge issue.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYS+FLQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylXuACfWECnysDtXNe66DdETCFs1a1RToYAoMokWeU5
s8VFP1NY2BjmxJbkebLL
=8kVu
-----END PGP SIGNATURE-----
Merge tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core patches for 5.15-rc1.
These do change a number of different things across different
subsystems, and because of that, there were 2 stable tags created that
might have already come into your tree from different pulls that did
the following
- changed the bus remove callback to return void
- sysfs iomem_get_mapping rework
Other than those two things, there's only a few small things in here:
- kernfs performance improvements for huge numbers of sysfs users at
once
- tiny api cleanups
- other minor changes
All of these have been in linux-next for a while with no reported
problems, other than the before-mentioned merge issue"
* tag 'driver-core-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (33 commits)
MAINTAINERS: Add dri-devel for component.[hc]
driver core: platform: Remove platform_device_add_properties()
ARM: tegra: paz00: Handle device properties with software node API
bitmap: extend comment to bitmap_print_bitmask/list_to_buf
drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI
topology: use bin_attribute to break the size limitation of cpumap ABI
lib: test_bitmap: add bitmap_print_bitmask/list_to_buf test cases
cpumask: introduce cpumap_print_list/bitmask_to_buf to support large bitmask and list
sysfs: Rename struct bin_attribute member to f_mapping
sysfs: Invoke iomem_get_mapping() from the sysfs open callback
debugfs: Return error during {full/open}_proxy_open() on rmmod
zorro: Drop useless (and hardly used) .driver member in struct zorro_dev
zorro: Simplify remove callback
sh: superhyway: Simplify check in remove callback
nubus: Simplify check in remove callback
nubus: Make struct nubus_driver::remove return void
kernfs: dont call d_splice_alias() under kernfs node lock
kernfs: use i_lock to protect concurrent inode updates
kernfs: switch kernfs to use an rwsem
kernfs: use VFS negative dentry caching
...
- Update ACPICA code in the kernel to upstream revision 20210730
including the following changes:
* Add support for the AEST table (data compiler) to iASL (Bob
Moore).
* Fix an if statement (add parens) (Bob Moore).
* Drop trailing semicolon from some macros (Bob Moore).
* Fix compilation of WPBT table with no command-line arguments
in iASL (Bob Moore).
* Add method name "_DIS" for use with aslmethod.c (Bob Moore).
* Add new DBG2 Serial Port Subtypes (Marcin Wojtas).
- Add new PCH FIVR methods to the DPTF code (Srinivas Pandruvada).
- Add support for the new 16550-compatible Serial Port Subtype to
the SPCR table parsing code (Marcin Wojtas).
- Add DMI quirk for Lenovo Yoga 9 (14INTL5) to the ACPI button
driver (Ulrich Huber).
- Add LoongArch support for ACPI_PROCESSOR/ACPI_NUMA (Huacai Chen).
- Add memory semantics to acpi_os_map_memory() (Lorenzo Pieralisi).
- Replace deprecated CPU-hotplug functions in the ACPI processor
driver (Sebastian Andrzej Siewior).
- Optimize I2C-bus handling in the XPower PMIC driver (Hans de Goede).
- Make platform-profile catch profile changes initiated by user space
and notify user processes of them (Hans de Goede).
- Clean up the ACPI companion binding and unbinding code and update
debug messaging in the ACPI power resources code (Rafael Wysocki).
- Clean up a couple of code pieces related to configfs (Andy
Shevchenko).
- Rearrange the FPDT table parsing code to avoid printing warning
messages for reserved record types (Adrian Huang).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmEtI2kSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxVXIP/jfi52N6F/PngBXR0f2CG1w9z02vHNej
ylf0wruJ52MlVVBzizHHTTh/OtHnAziWsFzVieMkPCc1+xZXIORbGuoEEZw3E+Pz
MUL2QjwLcSYcSqmC1D/aU51aFZLo/26R9ODAMNNzIFqMIbWq9sKCWliQXPKI+/f9
0zuiKYx3alVGEHU1Gl+qzIppnXBdyeI+irDM7mCA5W4anlmCj1tn36yK4deatx5f
NiwHpuC71ddVKHlI/UICmtIBXBCTULKYuqcHN28E1Vhn/4ieXxEmIrFoKeMd6Zhe
hTejCyejwp+vvoqRl4UPmIkC5KPUbTmpsrNWvzvOQyssZq0sorg7IAu/kM9ePJZD
VnaKT1JBWhACp4XhqfqvI8UoES9C8a39q2nXrGwLUy8/3x+F2EsOn/Awl6KHGu5f
HuVCYoQFPY0OHjz6CAwsw0iuL1Qcj4bf/ixm89bBCQmBEyX5WhpD+gEQB1YnjYYm
qctzqz60mBF7RDyGqIGWirOfgkbriJ8QnTxkdv1SYfJiOu5V0vg7d22ESOX6YPze
PmF3OWC4YOcQHsHKuMB8z3X9GW+cP7pohmcFhdaFQ8g1cqqEhkjCtcC9jSTSktuY
ck+0uy5R8gU/OkVcXWznQlVa26wdfa4VcVNSY6JR6Xy/v5a5AgkQPGbN6x92jXpS
75h9WtL17/mO
=YRjN
-----END PGP SIGNATURE-----
Merge tag 'acpi-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA kernel code to upstream revision 20210730,
clean up the ACPI companion binding code, optimize the I2C handling in
the XPower PMIC driver, add 16550-compatible Serial Port Subtype
support to the SPCR parsing code, add a few LoongArch support bits,
add a ne quirk to the button driver, add new PCH FIVR methods to the
DPTF code, replace deprecated CPU-hotplug functions in the processor
driver, improve the acpi_os_map_memory() handling on non-x86 and do
some assorted cleanups.
Specifics:
- Update ACPICA code in the kernel to upstream revision 20210730
including the following changes:
- Add support for the AEST table (data compiler) to iASL (Bob
Moore)
- Fix an if statement (add parens) (Bob Moore)
- Drop trailing semicolon from some macros (Bob Moore)
- Fix compilation of WPBT table with no command-line arguments in
iASL (Bob Moore)
- Add method name "_DIS" for use with aslmethod.c (Bob Moore)
- Add new DBG2 Serial Port Subtypes (Marcin Wojtas)
- Add new PCH FIVR methods to the DPTF code (Srinivas Pandruvada)
- Add support for the new 16550-compatible Serial Port Subtype to the
SPCR table parsing code (Marcin Wojtas)
- Add DMI quirk for Lenovo Yoga 9 (14INTL5) to the ACPI button driver
(Ulrich Huber)
- Add LoongArch support for ACPI_PROCESSOR/ACPI_NUMA (Huacai Chen)
- Add memory semantics to acpi_os_map_memory() (Lorenzo Pieralisi)
- Replace deprecated CPU-hotplug functions in the ACPI processor
driver (Sebastian Andrzej Siewior)
- Optimize I2C-bus handling in the XPower PMIC driver (Hans de Goede)
- Make platform-profile catch profile changes initiated by user space
and notify user processes of them (Hans de Goede)
- Clean up the ACPI companion binding and unbinding code and update
debug messaging in the ACPI power resources code (Rafael Wysocki)
- Clean up a couple of code pieces related to configfs (Andy
Shevchenko)
- Rearrange the FPDT table parsing code to avoid printing warning
messages for reserved record types (Adrian Huang)"
* tag 'acpi-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (27 commits)
ACPI: power: Drop name from struct acpi_power_resource
ACPI: power: Use acpi_handle_debug() to print debug messages
ACPI: tables: FPDT: Do not print FW_BUG message if record types are reserved
ACPI: button: Add DMI quirk for Lenovo Yoga 9 (14INTL5)
ACPI: Add memory semantics to acpi_os_map_memory()
ACPI: SPCR: Add support for the new 16550-compatible Serial Port Subtype
ACPI: platform-profile: call sysfs_notify() from platform_profile_store()
ACPICA: Update version to 20210730
ACPICA: Add method name "_DIS" For use with aslmethod.c
ACPICA: iASL: Fix for WPBT table with no command-line arguments
ACPICA: Headers: Add new DBG2 Serial Port Subtypes
ACPICA: Macros should not use a trailing semicolon
ACPICA: Fix an if statement (add parens)
ACPICA: iASL: Add support for the AEST table (data compiler)
ACPI: processor: Replace deprecated CPU-hotplug functions
ACPI: DPTF: Add new PCH FIVR methods
ACPI: configfs: Make get_header() to return error pointer
ACPI: configfs: Use sysfs_emit() in "show" functions
driver core: Split device_platform_notify()
software nodes: Split software_node_notify()
...
- Address 3 PCI device power management issues (Rafael Wysocki).
- Add Power Limit4 support for Alder Lake to the Intel RAPL power
capping driver (Sumeet Pawnikar).
- Add HWP guaranteed performance change notification support to
the intel_pstate driver (Srinivas Pandruvada).
- Replace deprecated CPU-hotplug functions in code related to power
management (Sebastian Andrzej Siewior).
- Update CPU PM notifiers to use raw spinlocks (Valentin Schneider).
- Add support for 'required-opps' DT property to the generic power
domains (genpd) framework and use this property for I2C on ARM64
sc7180 (Rajendra Nayak).
- Fix Kconfig issue related to genpd (Geert Uytterhoeven).
- Increase energy calculation precision in the Energy Model (Lukasz
Luba).
- Fix kobject deletion in the exit code of the schedutil cpufreq
governor (Kevin Hao).
- Unmark some functions as kernel-doc in the PM core to avoid
false-positive documentation build warnings (Randy Dunlap).
- Check RTC features instead of ops in suspend_test Alexandre
Belloni).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmEtIvYSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxnQ8QAK2QCyfFAdrPE3zn+XdTcVC8rpYhL0d2
3YpbS2obZ4lOHsrNy7SrrU5NdgvCFPHl14Py5rxu4QZW8EF5km6SqJYj9MhrDsEm
wJ/ZM3Zbaj98MAls7pulHxabMo8hvGfMSmcNOFfgh7yqCm5eu8gNnt/LbLaB9IvI
v46ZGNvnpbtkWgk4Eaq+v3Y5/+4CoUfuAFn7K21SHNcgzDbhE3Ii6cam/rwPeCdn
a+LcDLzeDQpnnYKukx7Zyk7Z+FblXWHWZ/ClLcR8JgYYSrbrXxSLtRdM5EtLje2J
ZnqFKqmlMcq2OhTjBSoHSJjiOkxyz5eWWrEf7d1/oH19WSW+rv5VfZUXAkYtfGKo
OJChuIomrLNqJhAwTnxG3fnVh2NMhLwDVEjkFgvej4shCZXhoVgWu39mWnbyTxV3
57adiuVUv6AboH6Lyvi/yzV7sWmZbL/U/YVvt+Z9SEh0syy6YBb87bW3Mr1qnLeG
QF4AmF553qzHwd9uxUoFfiUoBxH1GvNmOWtx6lYKhy0lm3AwvW1XHKqsDG4KbVJg
zzok7J2yv1/1rd8dJ+8tCwyyC3WJjNjtq/ULOltugwQpqVK4PCwWAUmjJaTP3Gpp
ZH/bGuLSQWxxiinwKCkFSxwbewJu+AvtoXYKBfk/Gf5O6GyUyryBbBMO9dsqP27A
Zg3P4+ejS7RN
=sx6G
-----END PGP SIGNATURE-----
Merge tag 'pm-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These address some PCI device power management issues, add new
hardware support to the RAPL power capping driver, add HWP guaranteed
performance change notification support to the intel_pstate driver,
replace deprecated CPU-hotplug functions in a few places, update CPU
PM notifiers to use raw spinlocks, update the PM domains framework
(new DT property support, Kconfig fix), do a couple of cleanups in
code related to system sleep, and improve the energy model and the
schedutil cpufreq governor.
Specifics:
- Address 3 PCI device power management issues (Rafael Wysocki).
- Add Power Limit4 support for Alder Lake to the Intel RAPL power
capping driver (Sumeet Pawnikar).
- Add HWP guaranteed performance change notification support to the
intel_pstate driver (Srinivas Pandruvada).
- Replace deprecated CPU-hotplug functions in code related to power
management (Sebastian Andrzej Siewior).
- Update CPU PM notifiers to use raw spinlocks (Valentin Schneider).
- Add support for 'required-opps' DT property to the generic power
domains (genpd) framework and use this property for I2C on ARM64
sc7180 (Rajendra Nayak).
- Fix Kconfig issue related to genpd (Geert Uytterhoeven).
- Increase energy calculation precision in the Energy Model (Lukasz
Luba).
- Fix kobject deletion in the exit code of the schedutil cpufreq
governor (Kevin Hao).
- Unmark some functions as kernel-doc in the PM core to avoid
false-positive documentation build warnings (Randy Dunlap).
- Check RTC features instead of ops in suspend_test Alexandre
Belloni)"
* tag 'pm-5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: domains: Fix domain attach for CONFIG_PM_OPP=n
powercap: Add Power Limit4 support for Alder Lake SoC
cpufreq: intel_pstate: Process HWP Guaranteed change notification
thermal: intel: Allow processing of HWP interrupt
notifier: Remove atomic_notifier_call_chain_robust()
PM: cpu: Make notifier chain use a raw_spinlock_t
PM: sleep: unmark 'state' functions as kernel-doc
arm64: dts: sc7180: Add required-opps for i2c
PM: domains: Add support for 'required-opps' to set default perf state
opp: Don't print an error if required-opps is missing
cpufreq: schedutil: Use kobject release() method to free sugov_tunables
PM: EM: Increase energy calculation precision
PM: sleep: check RTC features instead of ops in suspend_test
PM: sleep: s2idle: Replace deprecated CPU-hotplug functions
cpufreq: Replace deprecated CPU-hotplug functions
powercap: intel_rapl: Replace deprecated CPU-hotplug functions
PCI: PM: Enable PME if it can be signaled from D3cold
PCI: PM: Avoid forcing PCI_D0 for wakeup reasons inconsistently
PCI: Use pci_update_current_state() in pci_enable_device_flags()
Pull ARM cpufreq driver changes for v5.15 from Viresh Kumar:
"This contains:
- Update cpufreq-dt blocklist with more platforms (Bjorn Andersson).
- Allow freq changes from any CPU for qcom-hw driver (Taniya Das).
- Add DSVS interrupt's support for qcom-hw driver (Thara Gopinath).
- A new callback (->register_em()) to register EM at a more convenient
point of time."
* 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: qcom-hw: Set dvfs_possible_from_any_cpu cpufreq driver flag
cpufreq: blocklist more Qualcomm platforms in cpufreq-dt-platdev
cpufreq: qcom-cpufreq-hw: Add dcvs interrupt support
cpufreq: scmi: Use .register_em() to register with energy model
cpufreq: vexpress: Use .register_em() to register with energy model
cpufreq: scpi: Use .register_em() to register with energy model
cpufreq: qcom-cpufreq-hw: Use .register_em() to register with energy model
cpufreq: omap: Use .register_em() to register with energy model
cpufreq: mediatek: Use .register_em() to register with energy model
cpufreq: imx6q: Use .register_em() to register with energy model
cpufreq: dt: Use .register_em() to register with energy model
cpufreq: Add callback to register with energy model
cpufreq: vexpress: Set CPUFREQ_IS_COOLING_DEV flag
Core changes:
- The usual set of small fixes and improvements all over the place, but nothing
outstanding
MSI changes:
- Further consolidation of the PCI/MSI interrupt chip code
- Make MSI sysfs code independent of PCI/MSI and expose the MSI interrupts
of platform devices in the same way as PCI exposes them.
Driver changes:
- Support for ARM GICv3 EPPI partitions
- Treewide conversion to generic_handle_domain_irq() for all chained
interrupt controllers
- Conversion to bitmap_zalloc() throughout the irq chip drivers
- The usual set of small fixes and improvements
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmEsnpsTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoS+/EACQdpRkzl3IDIYqThxVZ8KQzp2rKKVn
qisAQiWg/6koNJx/yYy62KNAUyKjCIObNtRnWi7OAOx6OvNtQTD2WOLAwkh3Pgw1
8ePYYl55k+yCs8VoITsZM9jYeO+Tk878pU2A6R943zR+g6G7bskGJrxEyZ9TbzIe
qKfusNKnRY9/jMQaRALUAAtA9VIVR867GqORX5X8hKz8yE2rqlpb4y+1CFba5BTV
Vlxw7cIXvXBn7BKAom5diRqEGDNJEbX+56jJ7yDZshgLo7m11D7QLw72kmb6TNVC
g7PchvFi4afpc1ifEAAp0tk4RiSIAQ91nS3n0+jLcLbodOjIkl14eY02ZCJGAP29
uslyzUbmy1wgejG6CA63JtZ4MYdrf/OSMGuoN78qnOKYcIsWFzOvlJmBWWNW34qW
LCaUF9QdJ/slXu6B4vIx30GfN9q4myml8bFUobE5q9mBRrEk4R0B7iyBvPu1xKYr
ZEan67prI5VEu+afJGpp4r294m4HNVkMLfl3nYmE5+y4MoLeMNKDY3IPTvI9iP4G
kaFgoPvQo23WnuclNYpJ+CaA4aRASlB2nTY+oAXIYfehbey9EW5vq4/EK864ek6w
oyUTepxxNhE81tG2jpQbf2tR4COsEHy986clxqPP4AvsZXcbypCw8O2FcflpQbHO
5DLEAfTmp7cziQ==
=qyll
-----END PGP SIGNATURE-----
Merge tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates to the interrupt core and driver subsystems:
Core changes:
- The usual set of small fixes and improvements all over the place,
but nothing stands out
MSI changes:
- Further consolidation of the PCI/MSI interrupt chip code
- Make MSI sysfs code independent of PCI/MSI and expose the MSI
interrupts of platform devices in the same way as PCI exposes them.
Driver changes:
- Support for ARM GICv3 EPPI partitions
- Treewide conversion to generic_handle_domain_irq() for all chained
interrupt controllers
- Conversion to bitmap_zalloc() throughout the irq chip drivers
- The usual set of small fixes and improvements"
* tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
platform-msi: Add ABI to show msi_irqs of platform devices
genirq/msi: Move MSI sysfs handling from PCI to MSI core
genirq/cpuhotplug: Demote debug printk to KERN_DEBUG
irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy
irqdomain: Export irq_domain_disconnect_hierarchy()
irqchip/gic-v3: Fix priority comparison when non-secure priorities are used
irqchip/apple-aic: Fix irq_disable from within irq handlers
pinctrl/rockchip: drop the gpio related codes
gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type
gpio/rockchip: support next version gpio controller
gpio/rockchip: use struct rockchip_gpio_regs for gpio controller
gpio/rockchip: add driver for rockchip gpio
dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank
pinctrl/rockchip: add pinctrl device to gpio bank struct
pinctrl/rockchip: separate struct rockchip_pin_bank to a head file
pinctrl/rockchip: always enable clock for gpio controller
genirq: Fix kernel doc indentation
EDAC/altera: Convert to generic_handle_domain_irq()
powerpc: Bulk conversion to generic_handle_domain_irq()
nios2: Bulk conversion to generic_handle_domain_irq()
...
A few small fixes for regmaps this time, plus support for
allowing drivers to select raw spinlocks for the locks in order
to allow usage in interrutpt controllers.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmEsyXYTHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0DigB/92NOc4pBVTGIRKL+zaM2YMVaH2wCzp
+9B6A+W/9ZZEjaHPNHKeQJjcSGcC7bo8YmXgcH+eNUUG5PsGcNurxOpp4+hmIUKn
QrhP+vqk9/fMh2cLsekkpcsYOmbRNaapNzhSCPW5MxQuVHohoE46dNjf74pqvO4d
3SC9iJnTf78DCJsaDZRVDXf9d8mqaoEIdAVSOFwzT6afIrjG+rCyuzgPNV5G9DMC
0haTCIoIVsWmhC9wc7r7KFyY6k/kyg+BzMX+opHujGo1uZZKWbnv7ca7bwKay2Jk
Hvrf3dHWiLmHCH6hTwA2qFXMZu2qr1iJuBJ8KfvWlFULGpJ9ZlOhsvro
=1pX9
-----END PGP SIGNATURE-----
Merge tag 'regmap-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap updates from Mark Brown:
"A few small fixes for regmaps this time, plus support for allowing
drivers to select raw spinlocks for the locks in order to allow usage
in interrutpt controllers"
* tag 'regmap-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: teach regmap to use raw spinlocks if requested in the config
regmap: allow const array for {devm_,}regmap_field_bulk_alloc reg_fields
regmap: Prefer unsigned int to bare use of unsigned
regmap: fix the offset of register error log