qemu/system
Pierrick Bouvier 2865bf1c57 system/physmem: fix use-after-free with dispatch
A use-after-free bug was reported when booting a Linux kernel during the
pci setup phase. It's quite hard to reproduce (needs smp, and favored by
having several pci devices with BAR and specific Linux config, which
is Debian default one in this case).

After investigation (see the associated bug ticket), it appears that,
under specific conditions, we might access a cached AddressSpaceDispatch
that was reclaimed by RCU thread meanwhile.
In the Linux boot scenario, during the pci phase, memory region are
destroyed/recreated, resulting in exposition of the bug.

The core of the issue is that we cache the dispatch associated to
current cpu in cpu->cpu_ases[asidx].memory_dispatch. It is updated with
tcg_commit, which runs asynchronously on a given cpu.
At some point, we leave the rcu critial section, and the RCU thread
starts reclaiming it, but tcg_commit is not yet invoked, resulting in
the use-after-free.

It's not the first problem around this area, and commit 0d58c66068 [1]
("softmmu: Use async_run_on_cpu in tcg_commit") already tried to
address it. It did a good job, but it seems that we found a specific
situation where it's not enough.

This patch takes a simple approach: remove the cached value creating the
issue, and make sure we always get the current mapping for address
space, using address_space_to_dispatch(cpu->cpu_ases[asidx].as).
It's equivalent to qatomic_rcu_read(&as->current_map)->dispatch;
This is not really costly, we just need two dereferences,
including one atomic (rcu) read, which is negligible considering we are
already on mmu slow path anyway.

Note that tcg_commit is still needed, as it's taking care of flushing
TLB, removing previously mapped entries.

Another solution would be to cache directly values under the dispatch
(dispatch themselves are not ref counted), keep an active reference on
associated memory section, and release it when appropriate (tricky).
Given the time already spent debugging this area now and previously, I
strongly prefer eliminating the root of the issue, instead of adding
more complexity for a hypothetical performance gain. RCU is precisely
used to ensure good performance when reading data, so caching is not as
beneficial as it might seem IMHO.

[1] 0d58c66068

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3040
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Tested-by: Michael Tokarev <mjt@tls.msk.ru>
Message-ID: <20250724161142.2803091-1-pierrick.bouvier@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2025-07-29 13:56:39 +02:00
..
arch_init.c system: Replace arch_type global by qemu_arch_available() helper 2025-03-11 20:03:26 +01:00
async-teardown.c qemu/osdep: Add excluded fd parameter to qemu_close_all_open_fd() 2024-08-05 08:21:59 +10:00
balloon.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
bootdevice.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
cpu-timers.c include/exec: Split out icount.h 2025-04-23 14:08:44 -07:00
cpus.c accel: Rename 'system/accel-ops.h' -> 'accel/accel-cpu-ops.h' 2025-07-15 19:34:33 +02:00
datadir.c pc-bios: Move device tree files in their own subdir 2025-04-25 17:09:58 +02:00
device_tree-stub.c hw/core/machine.c: Make -machine dumpdtb=file.dtb with no DTB an error 2025-02-25 15:32:57 +00:00
device_tree.c hw/core/machine.c: Make -machine dumpdtb=file.dtb with no DTB an error 2025-02-25 15:32:57 +00:00
dirtylimit.c Miscellaneous patches for 2025-04-24 2025-04-24 13:44:57 -04:00
dma-helpers.c include/exec: Split out icount.h 2025-04-23 14:08:44 -07:00
globals-target.c system: Extract target-specific globals to their own compilation unit 2025-03-11 20:03:26 +01:00
globals.c accel/tcg: Restrict 'icount_align_option' global to TCG 2025-03-06 15:46:17 +01:00
ioport.c include/system: Move exec/ioport.h to system/ioport.h 2025-04-23 14:08:21 -07:00
main.c system/main: comment lock rationale 2025-05-28 08:07:59 +01:00
memory_ldst.c.inc memory: pass MemTxAttrs to memory_access_is_direct() 2025-02-12 11:33:05 -05:00
memory_mapping.c include/system: Move exec/address-spaces.h to system/address-spaces.h 2025-04-23 14:08:21 -07:00
memory-internal.h system/memory: Remove DEVICE_HOST_ENDIAN definition 2025-04-25 17:09:58 +02:00
memory.c Accelerators patches 2025-07-16 07:13:40 -04:00
meson.build ram-block-attributes: Introduce RamBlockAttributes to manage RAMBlock with guest_memfd 2025-06-23 16:03:59 -04:00
physmem.c system/physmem: fix use-after-free with dispatch 2025-07-29 13:56:39 +02:00
qdev-monitor.c system/qdev: Remove pointless NULL check in qdev_device_add_from_qdict 2025-07-10 16:18:43 +01:00
qemu-seccomp.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
qtest.c qemu: Convert target_words_bigendian() to TargetInfo API 2025-07-15 02:56:39 -04:00
ram-block-attributes.c ram-block-attributes: Introduce RamBlockAttributes to manage RAMBlock with guest_memfd 2025-06-23 16:03:59 -04:00
rtc.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
runstate-action.c include: Rename sysemu/ -> system/ 2024-12-20 17:44:56 +01:00
runstate-hmp-cmds.c qapi: Move include/qapi/qmp/ to include/qobject/ 2025-02-10 15:33:16 +01:00
runstate.c Accelerators patches 2025-07-16 07:13:40 -04:00
tpm-hmp-cmds.c
tpm.c tpm: "qemu -tpmdev help" should return success 2025-07-15 10:22:33 +04:00
trace-events ram-block-attributes: Introduce RamBlockAttributes to manage RAMBlock with guest_memfd 2025-06-23 16:03:59 -04:00
trace.h system: Rename softmmu/ directory as system/ 2023-10-08 21:08:08 +02:00
vl.c hw/nvram/fw_cfg: Remove legacy FW_CFG_ORDER_OVERRIDE 2025-05-30 09:52:08 +02:00
watchpoint.c include/exec: Split out watchpoint.h 2025-04-23 14:08:36 -07:00