When I implemented the storage layer bio splitting, I was under the
assumption that we'll never split metadata bios. But Qu reminded me that
this can actually happen with very old file systems with unaligned
metadata chunks and RAID0.
I still haven't seen such a case in practice, but we better handled this
case, especially as it is fairly easily to do not calling the ->end_іo
method directly in btrfs_end_io_work, and using the proper
btrfs_orig_bbio_end_io helper instead.
In addition to the old file system with unaligned metadata chunks case
documented in the commit log, the combination of the new scrub code
with Johannes pending raid-stripe-tree also triggers this case. We
spent some time debugging it and found that this patch solves
the problem.
Fixes: 103c19723c ("btrfs: split the bio submission path into a separate file")
CC: stable@vger.kernel.org # 6.3+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Syzkaller got a lot of crashes like:
KASAN: use-after-free Write in *_timers*
All of these crashes point to the same memory area:
The buggy address belongs to the object at ffff88801f870000
which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 5320 bytes inside of
8192-byte region [ffff88801f870000, ffff88801f872000)
This area belongs to :
batadv_priv->batadv_priv_dat->delayed_work->timer_list
The reason for these issues is the lack of synchronization. Delayed
work (batadv_dat_purge) schedules new timer/work while the device
is being deleted. As the result new timer/delayed work is set after
cancel_delayed_work_sync() was called. So after the device is freed
the timer list contains pointer to already freed memory.
Found by Linux Verification Center (linuxtesting.org) with syzkaller.
Cc: stable@kernel.org
Fixes: 2f1dfbe185 ("batman-adv: Distributed ARP Table - implement local storage")
Signed-off-by: Vladislav Efanov <VEfanov@ispras.ru>
Acked-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Fix a regression introduced inadvertently during the 6.3 cycle by a
commit making the Intel int340x thermal driver use sysfs_emit_at()
instead of scnprintf() (Srinivas Pandruvada).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmRw3GoSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxd/8P/3/0nPD8CaK5Oa6fBQFwiR8unCTfQSh7
7Tdut0HV47W9m8gj58S3okCXT4iR3Sv9M8wWYbH81oSXrrRKCS1BOVSK65jvHq8Q
3qbnft0xOBaYzcP0zlKQ6D0YkJKTjS8c6uqdiv0xMyvYd7kdbcPACaW+9HutIp9o
k92sSWo/sbxi4aKr9BjoGULwT2AcwbW+4gchrokHh01SFqIDDWFV/lR/hjPkwMkG
HX8LjjOboy1bzy8Vdam5Q41MNKY3jdKz2qabd9kgBgXVd/rTH9D1cfYud07Hdi7A
qpkvQSaPA08jt+Kh+rAZWj4GUU058EEm2hKFg0oqc4oRMxgRcB9MlRu61XTjzhDY
XrPFboOHHD1XWU8hCV0yh3QmjxyVh3njaoSPeosPJLtZ7RFCp90T0SHxSKYCvo7C
d0TEW1JDdVWvoqZl2JB8rjSixKIZX1neKLDjuZc9Fdgx0Y2N70eR7SZTiu3wohSO
muYDHyeJ6SV9DwKWPrdVQGvjNfmMdFay7Z7WdaugTFVI475rLRF05iZyijrie3b0
ntvy4LtqEB4Lj3mdl5I6j7b72YcHMzZCuwl7DHMWwoPo4AsePKYxslpJI/wrf0CD
oUqhz8H/5ftA2VMdWjaIDB10u1pS1qVtNgLahguVc2HM5cCk7TPVw3ZjDlenXofw
GTGRPMJnGtxP
=Qxds
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Fix a regression introduced inadvertently during the 6.3 cycle by a
commit making the Intel int340x thermal driver use sysfs_emit_at()
instead of scnprintf() (Srinivas Pandruvada)"
* tag 'thermal-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: intel: int340x: Add new line for UUID display
Fix 3 issues related to the ->fast_switch callback in the AMD
P-state cpufreq driver (Gautham R. Shenoy and Wyes Karny).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmRw2+cSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxW68P/2W9mcDxlepkfN9KEUgZ6ianmoRdHqSU
KpL8uB4Snavd0QQYnRUbNPeDht5cAMZT1cEzmcdkGAzGz2SpD/44LH0T8zFO24aM
SAdlo6j72VWiRmR5POvEZbKPRczCWFPlpqKkQyh227Msb5tGayo7fRa1VpRqo/AH
YwYWaW5Ur5XfU0c8E5K+P3Rj2dz+RsFzZ4chb5yd14ae2Qr8f6EnOVte0ja0bFUF
uPneqJ27U4MoDW1s/bkud32lJTeGLi8BwgIIyP6L7a8m5QdWxv+pTZMsa/NW7a+v
3+3GrXoLZp6xeBGdt0z6/04BwpcRUh6ZpVlloGeYoSMpnQpGleqeJaVepWZCAKDm
K0Gc7nLTLQUMO877oOz17cDLdDgXbIDmngw9ubEW0N7g5yOGzSCTfQyYYItN0dky
mnTMXU5TM2LktH3V1y34HTTZTZU5TDZkLzoFNAgHqmlFhEqEdaN6l5AJJ6rPt9IM
ZwJNpLz3R4rQ+XwGofebQGSlwpQVAzUrigiHJ2VJAbFxWqw4iohYkgD+1e8B/VzG
e5DygskN1FFTJ7oMXlPSDWqulCRghOillqtUWocdHJAbnkglkLpexxqUAsoNahDt
qA553sM5/lp29uNlpuz6AbqHscjOto94aC3uK3kxNUGsDy6CRcbn5VVIN5ceDVk+
SnzKHfnh65vv
=6XJv
-----END PGP SIGNATURE-----
Merge tag 'pm-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Fix three issues related to the ->fast_switch callback in the AMD
P-state cpufreq driver (Gautham R. Shenoy and Wyes Karny)"
* tag 'pm-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()
cpufreq: amd-pstate: Remove fast_switch_possible flag from active driver
cpufreq: amd-pstate: Add ->fast_switch() callback
When media is not ready do not assume that the capacity information from
the identify command is valid, i.e. ->total_bytes
->partition_align_bytes ->{volatile,persistent}_only_bytes. Explicitly
zero out the capacity resources and exit early.
Given zero-init of those fields this patch is functionally equivalent to
the prior state, but it improves readability and robustness going
forward.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168506118166.3004974.13523455340007852589.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
- fix incorrect output in in-tree gpio tools
- fix a shell coding issue in gpio-sim selftests
- correctly set the permissions for debugfs attributes exposed by gpio-mockup
- fix chip name and pin count in gpio-f7188x for one of the supported models
- fix numberspace pollution when using dynamically and statically allocated
GPIOs together
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmRws3YACgkQEacuoBRx
13JHlhAAkyUNjtvkcTdO9buB2AXB3e/JH31MgSEEzoAi9kr54n70D+mtmSz5iX0X
4a1KjZrd5Gd7nSGA3AHll7jxL1Rfa4vJsTID/khfhIPB0arGl7brZ0J2fItRUGqz
oxAvpH3OpSWI+QWQMzp/NEZV0F5rfWBme93w6OqYItzGNr4LfP+3x49zPDUngOjp
d3EsDzQbgxezYvrmWjPvZuZvk3RYud+dfU8bjvtnI6TJZqIxzePmcqEAFgqsnDJF
VLEJQFPGpLaX7U+ZepcAQAs2s5ggk6Mb0JdGKp1w9hQ9w91TClBUgUURG6hzdLmZ
w6/X3iGaFGaHugA4DmWhYd4FWE7H58dWazw/RVK4RM2ZGZ9cT8R8OVQlDsBhesyG
tOTY4RJlU1DgMt66nlUvf8r3ECdm2ayvFq7qSN9xRDQ0W4Ti3+QU2+/re3s8jevR
VYAo1BHtVai28+lCMjopK+fYBe8G4DqsGx9Uh3y0RxYIFvMKdo4EQF9Ku3yTEcq3
3FNFO00KU5ktKgnx7z6bnO7+F3OvHUeSRh44U8O6pm/odfdkUn7CHjFVi2dM6yO5
763UKwMZq9rDT2owdwljolIRECb8IAN0siBhhHYYW4oAFTD2w2vTQ/y2KYrxbntv
CptwauQMM9vjzg9bxMP+4R5YVZhZhtJ4I1oa2kQx7eXIPBy/6OM=
=R4dV
-----END PGP SIGNATURE-----
Merge tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix incorrect output in in-tree gpio tools
- fix a shell coding issue in gpio-sim selftests
- correctly set the permissions for debugfs attributes exposed by
gpio-mockup
- fix chip name and pin count in gpio-f7188x for one of the supported
models
- fix numberspace pollution when using dynamically and statically
allocated GPIOs together
* tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio-f7188x: fix chip name and pin count on Nuvoton chip
gpiolib: fix allocation of mixed dynamic/static GPIOs
gpio: mockup: Fix mode of debugfs files
selftests: gpio: gpio-sim: Fix BUG: test FAILED due to recent change
tools: gpio: fix debounce_period_us output of lsgpio
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmRwqRsACgkQxWXV+ddt
WDuCPQ//T8JVY6usnGF/Fw/3zbtDNvrdQLDfp3HovIg7gmLIBda0bT05w4Q46FUU
l4BV0bHyTUNWPlXUmrrSmt8HipRe2z4Wjwc16azdLmSs5zf0FO1LbsCKDmM8Ncid
LTi2jzyyb3E44ZzC/i7RCaBt+vYRb2ZmtZ/glh3K4H0GgTAYl1GxZoAoYgBnvmlG
nvmlWWDaM2cRKaUREm75il37LKLIlW5jvdUFQrqwWNgUH72ay5/7SZxHywlk8x6b
qwhhp+s6bMUNzi6CqE2SLnESjI9yl0l/0gLebhDXVulo0BiCrti+YLpueP4eQs1B
yYXX3PvHOXhoN4tUQ4yDF9G57To4Gw1aiQOnWOOLcbyGG1ZgyekpoRRXh6r74LKt
FDyWT+u/xd78by1km3VzqmvKtqHnRFNMYfP+MMDIhyhy5prKCWeVo7bC+2FP+89o
kv9+0Z0w0lkLycFfLaewZkEv0/WY8GMuT7kptHQ2Ao6ulAvG+j97sgVBFGXJjeCr
B1OAGdeTF79IV139bCxPA62cat87Zrh15mZN+y7U32Vs2JkOqbT0LTQGKoVs/TCI
AyHCDb8oOfGiebibnEDrDNtubz7NFCq4ntZRmuv5FJ+l2d1wl6ZvsI+DoYP7Zide
DLR7ZtPs1Yvm27xDjs+fVmMx4nuNGikEbPZPxJro1CjLVzCEt7k=
=elHB
-----END PGP SIGNATURE-----
Merge tag 'for-6.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- handle memory allocation error in checksumming helper (reported by
syzbot)
- fix lockdep splat when aborting a transaction, add NOFS protection
around invalidate_inode_pages2 that could allocate with GFP_KERNEL
- reduce chances to hit an ENOSPC during scrub with RAID56 profiles
* tag 'for-6.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: use nofs when cleaning up aborted transactions
btrfs: handle memory allocation failure in btrfs_csum_one_bio
btrfs: scrub: try harder to mark RAID56 block groups read-only
core:
- fix drmm_mutex_init lock class
mgag200:
- fix gamma lut initialisation
pl111:
- fix FB depth on IMPD-1 framebuffer
amdgpu:
- Fix missing BO unlocking in KIQ error path
- Avoid spurious secure display error messages
- SMU13 fix
- Fix an OD regression
- GPU reset display IRQ warning fix
- MST fix
radeon:
- Fix a DP regression
i915:
- PIPEDMC disabling fix for bigjoiner config
panel:
- fix aya neo air plus quirk
sched:
- remove redundant NULL check
qaic:
- fix NNC message corruption
- Grab ch_lock during QAIC_ATTACH_SLICE_BO
- Flush the transfer list again
- Validate if BO is sliced before slicing
- Validate user data before grabbing any lock
- initialize ret variable to 0
- silence some uninitialized variable warnings
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmRwSkgACgkQDHTzWXnE
hr7biw//U0n2qc0rBv+G2qdCu7zF6kuJZ83gStBRiybTm0qynUJA+wppwSyqUEDg
EQqoIy0hMiKpcAxFQuXpLA3tk2F7UGHxEV2aaKEMhwrU+HBCDrpUqVCZjhe6DQbT
bwj+SMvPCfrHhGfPbiGw7Vsw/qPYbC5ulrsn2th9vvwDuxL1wwtYIpJzA/xhV7eO
sWjUKYVtuXSrI/oqZLzoZU6YCjf7G2wShkkSHTna/zjmHdWbYEiyqZ6huL70LJBD
fZWI5w9HM8ABMMrE4W95TubMZOwDvsG7mb6G0WnS7P4Lq6Aab5CT1I6IkBB86pYz
kSW1VMUpPIE9+E7m3OBL66L/i6xCmBZ5nEHqjsK9DYvpHX/L2IG26SdE/e6Ai29Y
aMLxi4hM45FNfAOIB9Z36JJ9gaYCXf2hL/KAmrWovVlpT1vLkH/6DZ3nWNiNuhvT
5PKKTM4w3zOHzv0R/xrb3dfg+0idSPlEd4+0bsjM5STAMc1pyp6PhEHtEI2HqFwM
vkJ+RqK/lVXnDXyjgZnjbGsDaUQ3Rcw8/G3l+79j8mJcfeRYknGw/698BGPDBcpG
9hMxxIk8ttXYqhwmTCGTy1D/xf3lPOLmdTTACokvOEyd6VxijaeI2apbV0HYVnca
mPvSdxpKs+xx6nACSuxOG02E6Z4k85/kdNZeCmHEV9KGJACLIpw=
=IDNm
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2023-05-26' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"This week's collection is pretty spread out, accel/qaic has a bunch of
fixes, amdgpu, then lots of single fixes across a bunch of places.
core:
- fix drmm_mutex_init lock class
mgag200:
- fix gamma lut initialisation
pl111:
- fix FB depth on IMPD-1 framebuffer
amdgpu:
- Fix missing BO unlocking in KIQ error path
- Avoid spurious secure display error messages
- SMU13 fix
- Fix an OD regression
- GPU reset display IRQ warning fix
- MST fix
radeon:
- Fix a DP regression
i915:
- PIPEDMC disabling fix for bigjoiner config
panel:
- fix aya neo air plus quirk
sched:
- remove redundant NULL check
qaic:
- fix NNC message corruption
- Grab ch_lock during QAIC_ATTACH_SLICE_BO
- Flush the transfer list again
- Validate if BO is sliced before slicing
- Validate user data before grabbing any lock
- initialize ret variable to 0
- silence some uninitialized variable warnings"
* tag 'drm-fixes-2023-05-26' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Have Payload Properly Created After Resume
drm/amd/display: Fix warning in disabling vblank irq
drm/amd/pm: Fix output of pp_od_clk_voltage
drm/amd/pm: add missing NotifyPowerSource message mapping for SMU13.0.7
drm/radeon: reintroduce radeon_dp_work_func content
drm/amdgpu: don't enable secure display on incompatible platforms
drm:amd:amdgpu: Fix missing buffer object unlock in failure path
accel/qaic: Fix NNC message corruption
accel/qaic: Grab ch_lock during QAIC_ATTACH_SLICE_BO
accel/qaic: Flush the transfer list again
accel/qaic: Validate if BO is sliced before slicing
accel/qaic: Validate user data before grabbing any lock
accel/qaic: initialize ret variable to 0
drm/i915: Fix PIPEDMC disabling for a bigjoiner configuration
drm: fix drmm_mutex_init()
drm/sched: Remove redundant check
drm: panel-orientation-quirks: Change Air's quirk to support Air Plus
accel/qaic: silence some uninitialized variable warnings
drm/pl111: Fix FB depth on IMPD-1 framebuffer
drm/mgag200: Fix gamma lut not initialized.
I tried to streamline our user memory copy code fairly aggressively in
commit adfcf4231b ("x86: don't use REP_GOOD or ERMS for user memory
copies"), in order to then be able to clean up the code and inline the
modern FSRM case in commit 577e6a7fd5 ("x86: inline the 'rep movs' in
user copies for the FSRM case").
We had reports [1] of that causing regressions earlier with blogbench,
but that turned out to be a horrible benchmark for that case, and not a
sufficient reason for re-instating "rep movsb" on older machines.
However, now Eric Dumazet reported [2] a regression in performance that
seems to be a rather more real benchmark, where due to the removal of
"rep movs" a TCP stream over a 100Gbps network no longer reaches line
speed.
And it turns out that with the simplified the calling convention for the
non-FSRM case in commit 427fda2c8a ("x86: improve on the non-rep
'copy_user' function"), re-introducing the ERMS case is actually fairly
simple.
Of course, that "fairly simple" is glossing over several missteps due to
having to fight our assembler alternative code. This code really wanted
to rewrite a conditional branch to have two different targets, but that
made objtool sufficiently unhappy that this instead just ended up doing
a choice between "jump to the unrolled loop, or use 'rep movsb'
directly".
Let's see if somebody finds a case where the kernel memory copies also
care (see commit 68674f94ff: "x86: don't use REP_GOOD or ERMS for
small memory copies"). But Eric does argue that the user copies are
special because networking tries to copy up to 32KB at a time, if
order-3 pages allocations are possible.
In-kernel memory copies are typically small, unless they are the special
"copy pages at a time" kind that still use "rep movs".
Link: https://lore.kernel.org/lkml/202305041446.71d46724-yujie.liu@intel.com/ [1]
Link: https://lore.kernel.org/lkml/CANn89iKUbyrJ=r2+_kK+sb2ZSSHifFZ7QkPLDpAtkJ8v4WUumA@mail.gmail.com/ [2]
Reported-and-tested-by: Eric Dumazet <edumazet@google.com>
Fixes: adfcf4231b ("x86: don't use REP_GOOD or ERMS for user memory copies")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
'struct evsel' uses a union for the two lists. This turned out to be
error prone.
For example:
[root@quaco ~]# perf stat --bpf-prog 5246
Error: cpu-clock event does not have sample flags 66c660
failed to set filter "(null)" on event cpu-clock with 2 (No such file or directory)
[root@quaco ~]# perf stat --bpf-prog 5246
Fix this issue by separating the two lists.
Fixes: 56ec9457a4 ("perf bpf filter: Implement event sample filtering")
Signed-off-by: Song Liu <song@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kernel-team@meta.com
Link: https://lore.kernel.org/r/20230519235757.3613719-1-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Picking the changes from:
3632679d9e ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
That includes a define (IP_PROTOCOL) that isn't being used in generating
any id -> string table used by 'perf trace'.
Addresses this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/lkml/ZHD/Ms0DMq7viaq+@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Copy the kernel version of the header to fix the header diff build
warning. Some new definitions were only added to the tools side header,
but these are only used in Perf so move them to a different header.
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Link: https://lore.kernel.org/r/20230522102604.1081416-1-james.clark@arm.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: acme@kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Cc: siyanteng@loongson.cn
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
llvm-strip is not really required. Remove this dependency to make it
easier to build perf with BUILD_BPF_SKEL=1.
Committer notes:
This removes the need for the 'llvm' package just to get llvm-strip.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
demangle-cxx.cpp requires a C++ compiler, but feature checks may fail
because of the absence of this. Add a CONFIG_CXX_DEMANGLE so that the
source isn't built if not supported. Copy libbfd and cplus demangle
variants to a weak symbol-elf.c version so they aren't dependent on
C++. These variants are only built with the build option
BUILD_NONDISTRO=1.
Committer note:
This also handles this build break when a C++ compiler isn't available:
CXX /tmp/build/perf/util/demangle-cxx.o
/bin/sh: g++: command not found
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20230417192546.99923-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Change "../cs-etm.h" to just "../../../util/cs-etm.h" as ../cs-etm.h
doesn't exist.
Suggested-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230515165039.544045-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
BPF CO-RE requires 3 underscores for the ignored suffix rule but it
mistakenly used only 2. Let's fix it.
Fixes: 3a8b8fc317 ("perf bpf filter: Support pre-5.16 kernels where 'mem_hops' isn't in 'union perf_mem_data_src'")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230525000307.3202449-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We failed to initialize n_banks for spi-nor-generic flashes, which
caused a devide by zero when computing the bank_size.
By default we consider that all chips have a single bank. Initialize
the default number of banks for spi-nor-generic flashes. Even if the
bug is fixed with this simple initialization, check the n_banks value
before dividing so that we make sure this kind of bug won't occur again
if some other struct instance is created uninitialized.
Suggested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217448
Fixes: 9d6c5d64f0 ("mtd: spi-nor: Introduce the concept of bank")
Link: https://lore.kernel.org/all/20230516225108.29194-1-todd.e.brandt@intel.com/
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230518085440.2363676-1-tudor.ambarus@linaro.org
A few functions provide an empty interface definition when
CONFIG_MTD_NAND_INGENIC_ECC is disabled, but they are accidentally
defined as global functions in the header:
drivers/mtd/nand/raw/ingenic/ingenic_ecc.h:39:5: error: no previous prototype for 'ingenic_ecc_calculate'
drivers/mtd/nand/raw/ingenic/ingenic_ecc.h:46:5: error: no previous prototype for 'ingenic_ecc_correct'
drivers/mtd/nand/raw/ingenic/ingenic_ecc.h:53:6: error: no previous prototype for 'ingenic_ecc_release'
drivers/mtd/nand/raw/ingenic/ingenic_ecc.h:57:21: error: no previous prototype for 'of_ingenic_ecc_get'
Turn them into 'static inline' definitions instead.
Fixes: 15de8c6efd ("mtd: rawnand: ingenic: Separate top-level and SoC specific code")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230516202133.559488-1-arnd@kernel.org
Following errors were seen with um-x86_64-gcc12/um-allyesconfig:
+ /kisskb/src/drivers/mtd/spi-nor/spansion.c: error: 'op' is used uninitialized [-Werror=uninitialized]: => 495:27, 364:27
Initialise local struct spi_mem_op with all zeros at declaration in
order to avoid using garbage data for fields that are not explicitly
set afterwards.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: c87c9b11c5 ("mtd: spi-nor: spansion: Determine current address mode")
Fixes: 6afcc84080 ("mtd: spi-nor: spansion: Add support for Infineon S25FS256T")
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lore.kernel.org/linux-mtd/20230509193900.948753-1-tudor.ambarus@linaro.org
Pull NVMe fix from Keith:
"nvme fixes for 6.4
One nvme quirk (Tatsuki)"
* tag 'nvme-6.4-2023-05-26' of git://git.infradead.org/nvme:
NVMe: Add MAXIO 1602 to bogus nid list.
In the error path, a of_node_put() for platform is missing.
Just add it.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Acked-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://lore.kernel.org/r/20230523151223.109551-9-herve.codina@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Quite a few fixes to address set of assorted issues:
1. NULL pointer dereference if the ffa driver doesn't provide remove()
callback as it is currently executed unconditionally
2. FF-A core probe failure on systems with v1.0 firmware as the new
partition info get count flag is used unconditionally
3. Failure to register more than one logical partition or service within
the same physical partition as the device name contains only VM ID
which will be same for all but each will have unique UUID.
4. Rejection of certain memory interface transmissions by the receivers
(secure partitions) as few MBZ fields are non-zero due to lack of
explicit re-initialization of those fields
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEunHlEgbzHrJD3ZPhAEG6vDF+4pgFAmRaI48ACgkQAEG6vDF+
4phsfhAAq8HGnwUXzO2sOgDE+dB1QIc+tXNgThXbjxfeusG94rfQAqqWCPBDjfV7
baqOL494nuEXukzGKRGpkgNVTisyUISLQQpqxfOxY1gpR5ENIip1iZZlaszeXAQ2
XbQiMZisIo2xrPaBwfWdqs+h3Dat2FWfLVdi6rZ8QGXQUO5onNbO6Q4OrvanOd3/
C0pY+vjGwo9qWTH+y1VLg6D3lXMYZ1v+lSK9Xqr2p2F2+3xrPUd2W0Gwl3wc92fT
2jngCv/b9baTp8VS8Kveusv0pDas8l/zaLaxjSIOZKejqGX7l4LdKWFdUAGJ+iKm
zLiAqgNFWKJ+st/BkXhoLfSJ4Z66IkkJToNGFr15TSPLh76H6uUu1S9SKLYuFQA+
WV5y3oGIZc41GFnRsIaX7E+qAp5++IuIzu3oZ7K4SmH52LJuw1BP/NeDO+weuB5N
BU7pST7zLlmdIEiIf+mrATFpQf2c2+MHLoGATb8xtIJ+AtlAcC9ggXPiFZAliyLp
CvnBJBz6DEftBK2wMJU7nL0qOIxvdNlU9+yOFpQwAInS0W2scF0m8+NLb73lPvke
mWvf10si6s2L+uY5HJO3sew6XWtqU6LcT/KkDeE34ZBzomVHNsCbhVt2UjB/a9SD
bOu1FFAs3fAz/baisqyfsS6+pTD+NCz+PHLuu2LFjUzFG7/c/sc=
=h9Ch
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmRwxusACgkQYKtH/8kJ
UicxNxAA5L1mHCBKhi4DGDrHJWXtzlFfiYdaMeixckLpMObjV3G4KvZuhoTPAn7l
JRaOi7CzFC+YkN/Ee++JFdcUcLmFHG5ff/Os2CJ+lY+LmfdDLl6PhfiIdVdU1Qa2
zmXI45x2ih1rK2SionS8IGd2tqkeNxe/mZq7t5+9xIHZgVEFZfNmyRNqeU74IPoV
pwes+A0DHLnN0HuoKOnmH98o5v6PXWPTSgLCq6Gv6haip/os+0qECyl7hz1zCkqr
Kk7XK2ZrVkft9zPSAVfNyxtx2sCYTtuErec9vBFNnd5rQQLfU36FRFeiRqmroriy
6C4oBrDyM5HNEXy2vzvRWw8fLsLn7Xb4Epu/QwbKZ7EyaIsXiRON8m3a/KtODLQw
0ums73fJT0QPrBo/0t2cIIu1FNqjEfRm+wi4J/YrBk9fx4pb4B56MZS2rd42Swoj
XOiN4RDq9Y7aBiKxDeld6YaJdl3NLd3lukQClp4IYj8kWFk/UGdUjQRDkznfQ4OA
erWoErPknvFwPOJAMh8B3byPNNpZ1dyt1sISLyX7OMfMDKp/QEle1dP6oBwC5DhR
kxPh30Yy/DHpxKOqHRkYpu5CNb66g18RD7PKopi59POF6qQrOLUJAqztQPgST4zT
YX3a/b5PDbee+MyoZqY/jNEsmZ6vFOxMA51K2Kt8ZNjdDZi55ws=
=p+fd
-----END PGP SIGNATURE-----
Merge tag 'ffa-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux into arm/fixes
Arm FF-A fixes for v6.4
Quite a few fixes to address set of assorted issues:
1. NULL pointer dereference if the ffa driver doesn't provide remove()
callback as it is currently executed unconditionally
2. FF-A core probe failure on systems with v1.0 firmware as the new
partition info get count flag is used unconditionally
3. Failure to register more than one logical partition or service within
the same physical partition as the device name contains only VM ID
which will be same for all but each will have unique UUID.
4. Rejection of certain memory interface transmissions by the receivers
(secure partitions) as few MBZ fields are non-zero due to lack of
explicit re-initialization of those fields
* tag 'ffa-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_ffa: Set reserved/MBZ fields to zero in the memory descriptors
firmware: arm_ffa: Fix FFA device names for logical partitions
firmware: arm_ffa: Fix usage of partition info get count flag
firmware: arm_ffa: Check if ffa_driver remove is present before executing
Link: https://lore.kernel.org/r/20230509143453.1188753-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The option name should not have the dashes. Current version shows four
dashes for the option.
$ perf ftrace latency -h
Usage: perf ftrace [<options>] [<command>]
or: perf ftrace [<options>] -- [<command>] [<options>]
or: perf ftrace {trace|latency} [<options>] [<command>]
or: perf ftrace {trace|latency} [<options>] -- [<command>] [<options>]
-b, --use-bpf Use BPF to measure function latency
-n, ----use-nsec Use nano-second histogram
-T, --trace-funcs <func>
Show latency of given function
Fixes: 84005bb614 ("perf ftrace latency: Add -n/--use-nsec option")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230525212038.3535851-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch fixes the error checking in nfcsim.c.
The DebugFS kernel API is developed in
a way that the caller can safely ignore the errors that
occur during the creation of DebugFS nodes.
Signed-off-by: Osama Muhammad <osmtendev@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a compiler warning:
include/linux/dev_printk.h: In function 'ov2680_probe':
include/linux/dev_printk.h:144:31: warning: 'high' may be used uninitialized [-Wmaybe-uninitialized]
144 | dev_printk_index_wrap(_dev_err, KERN_ERR, dev, dev_fmt(fmt), ##__VA_ARGS__)
| ^~~~~~~~
In function 'ov2680_detect',
inlined from 'ov2680_s_config' at drivers/staging/media/atomisp/i2c/atomisp-ov2680.c:468:8,
inlined from 'ov2680_probe' at drivers/staging/media/atomisp/i2c/atomisp-ov2680.c:647:8:
drivers/staging/media/atomisp/i2c/atomisp-ov2680.c:376:13: note: 'high' was declared here
376 | u32 high, low;
| ^~~~
'high' is indeed uninitialized after the ov_read_reg8() call failed, so there is no
point showing the value. Just say that the read failed. But low can also be used
uninitialized later, so just make it more robust and properly zero the high and low
variables.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
When a message was received the last_initiator is set to 0xff.
This will force the signal free time for the next transmit
to that for a new initiator. However, if a new transmit is
already in progress, then don't set last_initiator, since
that's the initiator of the current transmit. Overwriting
this would cause the signal free time of a following transmit
to be that of the new initiator instead of a next transmit.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Explicitly disable the CEC adapter in cec_devnode_unregister()
Usually this does not really do anything important, but for drivers
that use the CEC pin framework this is needed to properly stop the
hrtimer. Without this a crash would happen when such a driver is
unloaded with rmmod.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
When VCODEC_CAPABILITY_4K_DISABLED is not set in dec_capability, skip
formats that are not MTK_FMT_DEC so only decoder formats is updated in
mtk_init_vdec_params.
Fixes: e25528e1db ("media: mediatek: vcodec: Use 4K frame size when supported by stateful decoder")
Signed-off-by: Pin-yen Lin <treapking@chromium.org>
Reviewed-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: Yunfei Dong <yunfei.dong@mediatek.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
In an earlier commit, setting the which field of the subdev format struct
in video_get_subdev_format was moved to a designated initializer that also
zeroes all other fields. However, the memset call that was zeroing the
fields earlier was left in place, causing the which field to be cleared
after being set in the initializer.
Remove the memset call from video_get_subdev_format to avoid clearing the
initialized which field.
Fixes: ecefa105cc ("media: Zero-initialize all structures passed to subdev pad operations")
Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com>
Acked-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Tested-by: Andrey Konovalov <andrey.konovalov@linaro.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
In the event of a change in XGBE mode, the current auto-negotiation
needs to be reset and the AN cycle needs to be re-triggerred. However,
the current code ignores the return value of xgbe_set_mode(), leading to
false information as the link is declared without checking the status
register.
Fix this by propagating the mode switch status information to
xgbe_phy_status().
Fixes: e57f7a3fea ("amd-xgbe: Prepare for working with more than one type of phy")
Co-developed-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most protos' poll() methods insert a memory barrier between
writes to sk_err and sk_error_report(). This dates back to
commit a4d258036e ("tcp: Fix race in tcp_poll").
I guess we should do the same thing in TLS, tcp_poll() does
not hold the socket lock.
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drm-misc-fixes for v6.4-rc4:
- A few non-trivial fixes to qaic.
- Fix drmm_mutex_init always using same lock class.
- Fix pl111 fb depth.
- Fix uninitialised gamma lut in mgag200.
- Add Aya Neo Air Plus quirk.
- Trivial null check removal in scheduler.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/d19f748c-2c5b-8140-5b05-a8282dfef73e@linux.intel.com
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmRu2ZQACgkQSD+KveBX
+j4sQAgAtCLJfYADpU7yWPZ3SMAnTelc65PfawdjEyBlRjE2CxsQ/bnRiWUXuR5i
twsGvAJxQfpDBV0z/TxcdYznpLxHSIhosg0pRH+Tc4Lk6PjdFK7Sviqyw9JTrCJX
rVaLJ5N877Pu5gH++OqFs8C+2gkrppQKWJtrmUaAPPdXixkr2q2pwlJiCOLpGYTN
9yGKBO29wtvD0AJsssS8VJgy6uHMaOsTTACR4fYuV/huTKa5QfYBdJY1HqEs7Fe2
LhGyLjVx6nHD8kt+fL9J2LJ6CLnG9a50reKuZnCOBa42obahUBCAQ5mr1mQlzNr5
cTOmAZM+OWOa7bXJAY9AbRorlr1A4w==
=DRJZ
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2023-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2023-05-24
This series includes bug fixes for the mlx5 driver.
* tag 'mlx5-fixes-2023-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
Documentation: net/mlx5: Wrap notes in admonition blocks
Documentation: net/mlx5: Add blank line separator before numbered lists
Documentation: net/mlx5: Use bullet and definition lists for vnic counters description
Documentation: net/mlx5: Wrap vnic reporter devlink commands in code blocks
net/mlx5: Fix check for allocation failure in comp_irqs_request_pci()
net/mlx5: DR, Add missing mutex init/destroy in pattern manager
net/mlx5e: Move Ethernet driver debugfs to profile init callback
net/mlx5e: Don't attach netdev profile while handling internal error
net/mlx5: Fix post parse infra to only parse every action once
net/mlx5e: Use query_special_contexts cmd only once per mdev
net/mlx5: fw_tracer, Fix event handling
net/mlx5: SF, Drain health before removing device
net/mlx5: Drain health before unregistering devlink
net/mlx5e: Do not update SBCM when prio2buffer command is invalid
net/mlx5e: Consider internal buffers size in port buffer calculations
net/mlx5e: Prevent encap offload when neigh update is running
net/mlx5e: Extract remaining tunnel encap code to dedicated file
====================
Link: https://lore.kernel.org/r/20230525034847.99268-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
syzkaller found a data race of pkt_sk(sk)->num.
The value is changed under lock_sock() and po->bind_lock, so we
need READ_ONCE() to access pkt_sk(sk)->num without these locks in
packet_bind_spkt(), packet_bind(), and sk_diag_fill().
Note that WRITE_ONCE() is already added by commit c7d2ef5dd4
("net/packet: annotate accesses to po->bind").
BUG: KCSAN: data-race in packet_bind / packet_do_bind
write (marked) to 0xffff88802ffd1cee of 2 bytes by task 7322 on cpu 0:
packet_do_bind+0x446/0x640 net/packet/af_packet.c:3236
packet_bind+0x99/0xe0 net/packet/af_packet.c:3321
__sys_bind+0x19b/0x1e0 net/socket.c:1803
__do_sys_bind net/socket.c:1814 [inline]
__se_sys_bind net/socket.c:1812 [inline]
__x64_sys_bind+0x40/0x50 net/socket.c:1812
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
read to 0xffff88802ffd1cee of 2 bytes by task 7318 on cpu 1:
packet_bind+0xbf/0xe0 net/packet/af_packet.c:3322
__sys_bind+0x19b/0x1e0 net/socket.c:1803
__do_sys_bind net/socket.c:1814 [inline]
__se_sys_bind net/socket.c:1812 [inline]
__x64_sys_bind+0x40/0x50 net/socket.c:1812
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
value changed: 0x0300 -> 0x0000
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7318 Comm: syz-executor.4 Not tainted 6.3.0-13380-g7fddb5b5300c #4
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Fixes: 96ec632714 ("packet: Diag core and basic socket info dumping")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230524232934.50950-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Python 3.9.0 or newer supports combining dicts() with |,
but older versions of Python are still used in the wild
(e.g. on CentOS 8, which goes EoL May 31, 2024).
With Python 3.6.8 we get:
TypeError: unsupported operand type(s) for |: 'dict' and 'dict'
Use older syntax. Tested with non-legacy families only.
Fixes: f036d936ca ("tools: ynl: Add fixed-header support to ynl")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Tested-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230524170712.2036128-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Simon Kapadia reported the following issue:
<quote>
The Online Amateur Radio Community (OARC) has recently been experimenting
with building a nationwide packet network in the UK.
As part of our experimentation, we have been testing out packet on 300bps HF,
and playing with net/rom. For HF packet at this baud rate you really need
to make sure that your MTU is relatively low; AX.25 suggests a PACLEN of 60,
and a net/rom PACLEN of 40 to go with that.
However the Linux net/rom support didn't work with a low PACLEN;
the mkiss module would truncate packets if you set the PACLEN below about 200 or so, e.g.:
Apr 19 14:00:51 radio kernel: [12985.747310] mkiss: ax1: truncating oversized transmit packet!
This didn't make any sense to me (if the packets are smaller why would they
be truncated?) so I started investigating.
I looked at the packets using ethereal, and found that many were just huge
compared to what I would expect.
A simple net/rom connection request packet had the request and then a bunch
of what appeared to be random data following it:
</quote>
Simon provided a patch that I slightly revised:
Not only we must not use skb_tailroom(), we also do
not want to count NR_NETWORK_LEN twice.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Co-Developed-by: Simon Kapadia <szymon@kapadia.pl>
Signed-off-by: Simon Kapadia <szymon@kapadia.pl>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Simon Kapadia <szymon@kapadia.pl>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230524141456.1045467-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We encountered a kernel call trace issue which was related to
ndo_xdp_xmit callback on our i.MX8MP platform. The reproduce
steps show as follows.
1. The FEC port (eth0) connects to a PC port, and the PC uses
pktgen_sample03_burst_single_flow.sh to generate packets and
send these packets to the FEC port. Notice that the script must
be executed before step 2.
2. Run the "./xdp_redirect eth0 eth1" command on i.MX8MP, the
eth1 interface is the dwmac. Then there will be a call trace
issue soon. Please see the log for more details.
The root cause is that the NETDEV_XDP_ACT_NDO_XMIT feature is
enabled by default, so when the step 2 command is exexcuted
and packets have already been sent to eth0, the stmmac_xdp_xmit()
starts running before the stmmac_xdp_set_prog() finishes. To
resolve this issue, we disable the NETDEV_XDP_ACT_NDO_XMIT
feature by default and turn on/off this feature when the bpf
program is installed/uninstalled which just like the other
ethernet drivers.
Call Trace log:
[ 306.311271] ------------[ cut here ]------------
[ 306.315910] WARNING: CPU: 0 PID: 15 at lib/timerqueue.c:55 timerqueue_del+0x68/0x70
[ 306.323590] Modules linked in:
[ 306.326654] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 6.4.0-rc1+ #37
[ 306.333277] Hardware name: NXP i.MX8MPlus EVK board (DT)
[ 306.338591] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 306.345561] pc : timerqueue_del+0x68/0x70
[ 306.349577] lr : __remove_hrtimer+0x5c/0xa0
[ 306.353777] sp : ffff80000b7c3920
[ 306.357094] x29: ffff80000b7c3920 x28: 0000000000000000 x27: 0000000000000001
[ 306.364244] x26: ffff80000a763a40 x25: ffff0000d0285a00 x24: 0000000000000001
[ 306.371390] x23: 0000000000000001 x22: ffff000179389a40 x21: 0000000000000000
[ 306.378537] x20: ffff000179389aa0 x19: ffff0000d2951308 x18: 0000000000001000
[ 306.385686] x17: f1d3000000000000 x16: 00000000c39c1000 x15: 55e99bbe00001a00
[ 306.392835] x14: 09000900120aa8c0 x13: e49af1d300000000 x12: 000000000000c39c
[ 306.399987] x11: 100055e99bbe0000 x10: ffff8000090b1048 x9 : ffff8000081603fc
[ 306.407133] x8 : 000000000000003c x7 : 000000000000003c x6 : 0000000000000001
[ 306.414284] x5 : ffff0000d2950980 x4 : 0000000000000000 x3 : 0000000000000000
[ 306.421432] x2 : 0000000000000001 x1 : ffff0000d2951308 x0 : ffff0000d2951308
[ 306.428585] Call trace:
[ 306.431035] timerqueue_del+0x68/0x70
[ 306.434706] __remove_hrtimer+0x5c/0xa0
[ 306.438549] hrtimer_start_range_ns+0x2bc/0x370
[ 306.443089] stmmac_xdp_xmit+0x174/0x1b0
[ 306.447021] bq_xmit_all+0x194/0x4b0
[ 306.450612] __dev_flush+0x4c/0x98
[ 306.454024] xdp_do_flush+0x18/0x38
[ 306.457522] fec_enet_rx_napi+0x6c8/0xc68
[ 306.461539] __napi_poll+0x40/0x220
[ 306.465038] net_rx_action+0xf8/0x240
[ 306.468707] __do_softirq+0x128/0x3a8
[ 306.472378] run_ksoftirqd+0x40/0x58
[ 306.475961] smpboot_thread_fn+0x1c4/0x288
[ 306.480068] kthread+0x124/0x138
[ 306.483305] ret_from_fork+0x10/0x20
[ 306.486889] ---[ end trace 0000000000000000 ]---
Fixes: 66c0e13ad2 ("drivers: net: turn on XDP features")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230524125714.357337-1-wei.fang@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Do skb_put() after a new skb has been successfully allocated otherwise
the reused skb leads to skb_panics or incorrect packet sizes.
Fixes: f92e1869d7 ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230524194908.147145-1-tbogendoerfer@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If plen is null when passed in, we only checked for null
in one of the two places where it could be used. Although
plen is always valid (not null) for current callers of the
SMB2_change_notify function, this change makes it more consistent.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/all/202305251831.3V1gbbFs-lkp@intel.com/
Signed-off-by: Steve French <stfrench@microsoft.com>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmRv7UIACgkQiiy9cAdy
T1GzUQv+KF/IDyb5wzxamh35hSDLzQo1KKaGPdumN+xyQXUhiaml3XqWfQPWC3EO
vDF4a5zvi5Wm0TNwwYqUFwgMBKFqlUHw64qEkIv6MW/IHOv8/CYepBIeTLwIQCyr
REJgfU1oJJLa0U4DsPYpwgEVqnuFdb20oaKMPVTgCAHnnpKsBTtKa7ZDbCBZtHOV
URg7at6c/Dc6uWGOWRif++llmq5a5b6sBxtZ+C99dQGDKvSqbFTOf6If1u6HAaO0
m75DPcb9o2IA2lLjxALbbIeofPeEphkcH2WBUNHC2tfFo91EcndVxfKB/jo8wzP7
/5MGHlFEjmupPsPJq6bbMNj+jyPa3UM/CGqsw8ij4SmmIIt0FbBsFPuEMIPmtAsW
GJL0/Nf1cDiJJIeMahaW936VRK66VLkEGvhKFCxVpPA93IN0eNh1E0HSXsGrpYJ+
lp4edXJam/2rHngbPgB+LUaPHoVTj0xZwTDGDzNlyI6S5HWcBv43CgzlFF1sWtpD
4CNjpqzS
=sHwf
-----END PGP SIGNATURE-----
Merge tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb directory moves and client fixes from Steve French:
"Four smb3 client fixes (three of which marked for stable) and three
patches to move of fs/cifs and fs/ksmbd to a new common "fs/smb"
parent directory
- Move the client and server source directories to a common parent
directory:
fs/cifs -> fs/smb/client
fs/ksmbd -> fs/smb/server
fs/smbfs_common -> fs/smb/common
- important readahead fix
- important fix for SMB1 regression
- fix for missing mount option ("mapchars") in mount API conversion
- minor debugging improvement"
* tag '6.4-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb3: move Documentation/filesystems/cifs to Documentation/filesystems/smb
cifs: correct references in Documentation to old fs/cifs path
smb: move client and server files to common directory fs/smb
cifs: mapchars mount option ignored
smb3: display debug information better for encryption
cifs: fix smb1 mount regression
cifs: Fix cifs_limit_bvec_subset() to correctly check the maxmimum size
- Fix flush_dcache_page() for usage from irq context
- Handle kprobes breakpoints only in kernel context
- Handle kgdb breakpoints only in kernel context
- Use num_present_cpus() in alternative patching code
- Enable LOCKDEP support
- Add lightweight spinlock checks
- Flush AGP gatt writes and adjust gatt mask in parisc_agp_mask_memory()
- Allow to reboot machine after system halt
- Improve cache flushing for PCXL in arch_sync_dma_for_cpu()
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZG/QkgAKCRD3ErUQojoP
X9MBAQCe2rJxNPmdUVZ8okvKTE/qukbJxa+pSlYm/+w5s4Q6nAD/TCBM5Meil0Q2
IOYAadCt4qaZfCOcIHBOgRqrnLXUbQQ=
=e2J6
-----END PGP SIGNATURE-----
Merge tag 'parisc-for-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
"Quite a bunch of real bugfixes in here and most of them are tagged for
backporting: A fix for cache flushing from irq context, a kprobes &
kgdb breakpoint handling fix, and a fix in the alternative code
patching function to take care of CPU hotplugging.
parisc now provides LOCKDEP support and comes with a lightweight
spinlock check. Both features helped me to find the cache flush bug.
Additionally writing the AGP gatt has been fixed, the machine allows
the user to reboot after a system halt and arch_sync_dma_for_cpu() has
been optimized for PCXL PCUs.
Summary:
- Fix flush_dcache_page() for usage from irq context
- Handle kprobes breakpoints only in kernel context
- Handle kgdb breakpoints only in kernel context
- Use num_present_cpus() in alternative patching code
- Enable LOCKDEP support
- Add lightweight spinlock checks
- Flush AGP gatt writes and adjust gatt mask in parisc_agp_mask_memory()
- Allow to reboot machine after system halt
- Improve cache flushing for PCXL in arch_sync_dma_for_cpu()"
* tag 'parisc-for-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix flush_dcache_page() for usage from irq context
parisc: Handle kgdb breakpoints only in kernel context
parisc: Handle kprobes breakpoints only in kernel context
parisc: Allow to reboot machine after system halt
parisc: Enable LOCKDEP support
parisc: Add lightweight spinlock checks
parisc: Use num_present_cpus() in alternative patching code
parisc: Flush gatt writes and adjust gatt mask in parisc_agp_mask_memory()
parisc: Improve cache flushing for PCXL in arch_sync_dma_for_cpu()
It turns out that udev under certain circumstances will concurrently try
to load the same modules over-and-over excessively. This isn't a kernel
bug, but it ends up affecting the kernel, to the point that under
certain circumstances we can fail to boot, because the kernel uses a lot
of memory to read all the module data all at once.
Note that it isn't a memory leak, it's just basically a thundering herd
problem happening at bootup with a lot of CPUs, with the worst cases
then being pretty bad.
Admittedly the worst situations are somewhat contrived: lots and lots of
CPUs, not a lot of memory, and KASAN enabled to make it all slower and
as such (unintentionally) exacerbate the problem.
Luis explains: [1]
"My best assessment of the situation is that each CPU in udev ends up
triggering a load of duplicate set of modules, not just one, but *a
lot*. Not sure what heuristics udev uses to load a set of modules per
CPU."
Petr Pavlu chimes in: [2]
"My understanding is that udev workers are forked. An initial kmod
context is created by the main udevd process but no sharing happens
after the fork. It means that the mentioned memory pool logic doesn't
really kick in.
Multiple parallel load requests come from multiple udev workers, for
instance, each handling an udev event for one CPU device and making
the exactly same requests as all others are doing at the same time.
The optimization idea would be to recognize these duplicate requests
at the udevd/kmod level and converge them"
Note that module loading has tried to mitigate this issue before, see
for example commit 064f4536d1 ("module: avoid allocation if module is
already present and ready"), which has a few ASCII graphs on memory use
due to this same issue.
However, while that noticed that the module was already loaded, and
exited with an error early before spending any more time on setting up
the module, it didn't handle the case of multiple concurrent module
loads all being active - but not complete - at the same time.
Yes, one of them will eventually win the race and finalize its copy, and
the others will then notice that the module already exists and error
out, but while this all happens, we have tons of unnecessary concurrent
work being done.
Again, the real fix is for udev to not do that (maybe it should use
threads instead of fork, and have actual shared data structures and not
cause duplicate work). That real fix is apparently not trivial.
But it turns out that the kernel already has a pretty good model for
dealing with concurrent access to the same file: the i_writecount of the
inode.
In fact, the module loading already indirectly uses 'i_writecount' ,
because 'kernel_file_read()' will in fact do
ret = deny_write_access(file);
if (ret)
return ret;
...
allow_write_access(file);
around the read of the file data. We do not allow concurrent writes to
the file, and return -ETXTBUSY if the file was open for writing at the
same time as the module data is loaded from it.
And the solution to the reader concurrency problem is to simply extend
this "no concurrent writers" logic to simply be "exclusive access".
Note that "exclusive" in this context isn't really some absolute thing:
it's only exclusion from writers and from other "special readers" that
do this writer denial. So we simply introduce a variation of that
"deny_write_access()" logic that not only denies write access, but also
requires that this is the _only_ such access that denies write access.
Which means that you can't start loading a module that is already being
loaded as a module by somebody else, or you will get the same -ETXTBSY
error that you would get if there were writers around.
[ It also means that you can't try to load a currently executing
executable as a module, for the same reason: executables do that same
"deny_write_access()" thing, and that's obviously where the whole
ETXTBSY logic traditionally came from.
This is not a problem for kernel modules, since the set of normal
executable files and kernel module files is entirely disjoint. ]
This new function is called "exclusive_deny_write_access()", and the
implementation is trivial, in that it's just an atomic decrement of
i_writecount if it was 0 before.
To use that new exclusivity check, all we then do is wrap the module
loading with that exclusive_deny_write_access()() / allow_write_access()
pair. The actual patch is a bit bigger than that, because we want to
surround not just the "load file data" part, but the whole module setup,
to get maximum exclusion.
So this ends up splitting up "finit_module()" into a few helper
functions to make it all very clear and legible.
In Luis' test-case (bringing up 255 vcpu's in a virtual machine [3]),
the "wasted vmalloc" space (ie module data read into a vmalloc'ed area
in order to be loaded as a module, but then discarded because somebody
else loaded the same module instead) dropped from 1.8GiB to 474kB. Yes,
that's gigabytes to kilobytes.
It doesn't drop completely to zero, because even with this change, you
can still end up having completely serial pointless module loads, where
one udev process has loaded a module fully (and thus the kernel has
released that exclusive lock on the module file), and then another udev
process tries to load the same module again.
So while we cannot fully get rid of the fundamental bug in user space,
we _can_ get rid of the excessive concurrent thundering herd effect.
A couple of final side notes on this all:
- This tweak only affects the "finit_module()" system call, which gives
the kernel a file descriptor with the module data.
You can also just feed the module data as raw data from user space
with "init_module()" (note the lack of 'f' at the beginning), and
obviously for that case we do _not_ have any "exclusive read" logic.
So if you absolutely want to do things wrong in user space, and try
to load the same module multiple times, and error out only later when
the kernel ends up saying "you can't load the same module name
twice", you can still do that.
And in fact, some distros will do exactly that, because they will
uncompress the kernel module data in user space before feeding it to
the kernel (mainly because they haven't started using the new kernel
side decompression yet).
So this is not some absolute "you can't do concurrent loads of the
same module". It's literally just a very simple heuristic that will
catch it early in case you try to load the exact same module file at
the same time, and in that case avoid a potentially nasty situation.
- There is another user of "deny_write_access()": the verity code that
enables fs-verity on a file (the FS_IOC_ENABLE_VERITY ioctl).
If you use fs-verity and you care about verifying the kernel modules
(which does make sense), you should do it *before* loading said
kernel module. That may sound obvious, but now the implementation
basically requires it. Because if you try to do it concurrently, the
kernel may refuse to load the module file that is being set up by the
fs-verity code.
- This all will obviously mean that if you insist on loading the same
module in parallel, only one module load will succeed, and the others
will return with an error.
That was true before too, but what is different is that the -ETXTBSY
error can be returned *before* the success case of another process
fully loading and instantiating the module.
Again, that might sound obvious, and it is indeed the whole point of
the whole change: we are much quicker to notice the whole "you're
already in the process of loading this module".
So it's very much intentional, but it does mean that if you just
spray the kernel with "finit_module()", and expect that the module is
immediately loaded afterwards without checking the return value, you
are doing something horribly horribly wrong.
I'd like to say that that would never happen, but the whole _reason_
for this commit is that udev is currently doing something horribly
horribly wrong, so ...
Link: https://lore.kernel.org/all/ZEGopJ8VAYnE7LQ2@bombadil.infradead.org/ [1]
Link: https://lore.kernel.org/all/23bd0ce6-ef78-1cd8-1f21-0e706a00424a@suse.com/ [2]
Link: https://lore.kernel.org/lkml/ZG%2Fa+nrt4%2FAAUi5z@bombadil.infradead.org/ [3]
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Tested-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZG9CygAKCRCRxhvAZXjc
opSUAP94up0d2bhB4CDRGkszpBefogBXyEylT8v+1EPtzs8K6QEA9OEbn4wWsIlh
vYLUjejArgUGuxDl7iiZzAx8p6n9qws=
=lEs3
-----END PGP SIGNATURE-----
Merge tag 'vfs/v6.4-rc3/misc.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- During the acl rework we merged this cycle the generic_listxattr()
helper had to be modified in a way that in principle it would allow
for POSIX ACLs to be reported. At least that was the impression we
had initially. Because before the acl rework POSIX ACLs would be
reported if the filesystem did have POSIX ACL xattr handlers in
sb->s_xattr. That logic changed and now we can simply check whether
the superblock has SB_POSIXACL set and if the inode has
inode->i_{default_}acl set report the appropriate POSIX ACL name.
However, we didn't realize that generic_listxattr() was only ever
used by two filesystems. Both of them don't support POSIX ACLs via
sb->s_xattr handlers and so never reported POSIX ACLs via
generic_listxattr() even if they raised SB_POSIXACL and did contain
inodes which had acls set. The example here is nfs4.
As a result, generic_listxattr() suddenly started reporting POSIX
ACLs when it wouldn't have before. Since SB_POSIXACL implies that the
umask isn't stripped in the VFS nfs4 can't just drop SB_POSIXACL from
the superblock as it would also alter umask handling for them.
So just have generic_listxattr() not report POSIX ACLs as it never
did anyway. It's documented as such.
- Our SB_* flags currently use a signed integer and we shift the last
bit causing UBSAN to complain about undefined behavior. Switch to
using unsigned. While the original patch used an explicit unsigned
bitshift it's now pretty common to rely on the BIT() macro in a lot
of headers nowadays. So the patch has been adjusted to use that.
- Add Namjae as ntfs reviewer. They're already active this cycle so
let's make it explicit right now.
* tag 'vfs/v6.4-rc3/misc.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ntfs: Add myself as a reviewer
fs: don't call posix_acl_listxattr in generic_listxattr
fs: fix undefined behavior in bit shift for SB_NOUSER
Current release - regressions:
- net: fix skb leak in __skb_tstamp_tx()
- eth: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs
Current release - new code bugs:
- handshake:
- fix sock->file allocation
- fix handshake_dup() ref counting
- bluetooth:
- fix potential double free caused by hci_conn_unlink
- fix UAF in hci_conn_hash_flush
Previous releases - regressions:
- core: fix stack overflow when LRO is disabled for virtual interfaces
- tls: fix strparser rx issues
- bpf:
- fix many sockmap/TCP related issues
- fix a memory leak in the LRU and LRU_PERCPU hash maps
- init the offload table earlier
- eth: mlx5e:
- do as little as possible in napi poll when budget is 0
- fix using eswitch mapping in nic mode
- fix deadlock in tc route query code
Previous releases - always broken:
- udplite: fix NULL pointer dereference in __sk_mem_raise_allocated()
- raw: fix output xfrm lookup wrt protocol
- smc: reset connection when trying to use SMCRv2 fails
- phy: mscc: enable VSC8501/2 RGMII RX clock
- eth: octeontx2-pf: fix TSOv6 offload
- eth: cdc_ncm: deal with too low values of dwNtbOutMaxSize
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRvOisSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkMW8P/3rZy4Yy2bIWFCkxKD/aPvqG60ZZfvV/
sB7Qu3X0OLiDNAmdDsXjCFeMYnV4cxDvwxjFUVQX0ZZEilEbGQ2XlOaFTpXS3jeW
UQup55DW7VG6BkuNJipwtLkLSQ498Z+qinRPsmNPVADkItHHbyrSnKNjh34ruhly
P5edWJ/3PuzoK2hN/izgBpk0i1UC1+tSKKANV5dlIWb6CXY9C8pvr0CScuGb5rKv
xAs40Rp1eaFmkYkhbAn3H2fvSOoCr2aSDeS2SvRAxca9OUcrUAjnnsLTVq5WI22/
PxSESy6wfE2e5+q1AwskwBdFO3LLKheVYJF2KzSlRk4FuWk50GbwbpueRSOYEU7b
2w0MveYggr4m3B06/2esrsr6bEPsb4QFKE+hubX5FmIPECOz+dOA0RW4mOysvzqM
q+xEuR9uWFsrMO7WVU7/4oF02HqAfAtaEn/87aniGz5o7bzPbmyyyBKfmb4s2c13
TU828rEBNGkmqxSwsZHUOt21IJoOa646W99zsmGpRo/m47pFx093HVR22Hr1dH0B
BllhsmtvJZ2XsWkR2Q9aAyyluc3/b3yI24OM125y7bIBWte2MF908xaStx/al+AF
jPL/ioEQKNsOJKHan9EzhbyH98RCfEotLb+ha/qNQ9GGjKROHsTn9EgP7h7367oo
yS8QLmvng01f
=hz3D
-----END PGP SIGNATURE-----
Merge tag 'net-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth and bpf.
Current release - regressions:
- net: fix skb leak in __skb_tstamp_tx()
- eth: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs
Current release - new code bugs:
- handshake:
- fix sock->file allocation
- fix handshake_dup() ref counting
- bluetooth:
- fix potential double free caused by hci_conn_unlink
- fix UAF in hci_conn_hash_flush
Previous releases - regressions:
- core: fix stack overflow when LRO is disabled for virtual
interfaces
- tls: fix strparser rx issues
- bpf:
- fix many sockmap/TCP related issues
- fix a memory leak in the LRU and LRU_PERCPU hash maps
- init the offload table earlier
- eth: mlx5e:
- do as little as possible in napi poll when budget is 0
- fix using eswitch mapping in nic mode
- fix deadlock in tc route query code
Previous releases - always broken:
- udplite: fix NULL pointer dereference in __sk_mem_raise_allocated()
- raw: fix output xfrm lookup wrt protocol
- smc: reset connection when trying to use SMCRv2 fails
- phy: mscc: enable VSC8501/2 RGMII RX clock
- eth: octeontx2-pf: fix TSOv6 offload
- eth: cdc_ncm: deal with too low values of dwNtbOutMaxSize"
* tag 'net-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits)
udplite: Fix NULL pointer dereference in __sk_mem_raise_allocated().
net: phy: mscc: enable VSC8501/2 RGMII RX clock
net: phy: mscc: remove unnecessary phydev locking
net: phy: mscc: add support for VSC8501
net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE
net/handshake: Enable the SNI extension to work properly
net/handshake: Unpin sock->file if a handshake is cancelled
net/handshake: handshake_genl_notify() shouldn't ignore @flags
net/handshake: Fix uninitialized local variable
net/handshake: Fix handshake_dup() ref counting
net/handshake: Remove unneeded check from handshake_dup()
ipv6: Fix out-of-bounds access in ipv6_find_tlv()
net: ethernet: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCs
docs: netdev: document the existence of the mail bot
net: fix skb leak in __skb_tstamp_tx()
r8169: Use a raw_spinlock_t for the register locks.
page_pool: fix inconsistency for page_pool_ring_[un]lock()
bpf, sockmap: Test progs verifier error with latest clang
bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with drops
bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer
...
Traditionally, all CPUs in a system have identical numbers of SMT
siblings. That changes with hybrid processors where some logical CPUs
have a sibling and others have none.
Today, the CPU boot code sets the global variable smp_num_siblings when
every CPU thread is brought up. The last thread to boot will overwrite
it with the number of siblings of *that* thread. That last thread to
boot will "win". If the thread is a Pcore, smp_num_siblings == 2. If it
is an Ecore, smp_num_siblings == 1.
smp_num_siblings describes if the *system* supports SMT. It should
specify the maximum number of SMT threads among all cores.
Ensure that smp_num_siblings represents the system-wide maximum number
of siblings by always increasing its value. Never allow it to decrease.
On MeteorLake-P platform, this fixes a problem that the Ecore CPUs are
not updated in any cpu sibling map because the system is treated as an
UP system when probing Ecore CPUs.
Below shows part of the CPU topology information before and after the
fix, for both Pcore and Ecore CPU (cpu0 is Pcore, cpu 12 is Ecore).
...
-/sys/devices/system/cpu/cpu0/topology/package_cpus:000fff
-/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-11
+/sys/devices/system/cpu/cpu0/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-21
...
-/sys/devices/system/cpu/cpu12/topology/package_cpus:001000
-/sys/devices/system/cpu/cpu12/topology/package_cpus_list:12
+/sys/devices/system/cpu/cpu12/topology/package_cpus:3fffff
+/sys/devices/system/cpu/cpu12/topology/package_cpus_list:0-21
Notice that the "before" 'package_cpus_list' has only one CPU. This
means that userspace tools like lscpu will see a little laptop like
an 11-socket system:
-Core(s) per socket: 1
-Socket(s): 11
+Core(s) per socket: 16
+Socket(s): 1
This is also expected to make the scheduler do rather wonky things
too.
[ dhansen: remove CPUID detail from changelog, add end user effects ]
CC: stable@kernel.org
Fixes: bbb65d2d36 ("x86: use cpuid vector 0xb when available for detecting cpu topology")
Fixes: 95f3d39ccf ("x86/cpu/topology: Provide detect_extended_topology_early()")
Suggested-by: Len Brown <len.brown@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20230323015640.27906-1-rui.zhang%40intel.com