mirror of
https://git.proxmox.com/git/mirror_ubuntu-kernels.git
synced 2025-11-20 16:34:02 +00:00
Using per-cpu storage for @x86_cpu_to_logical_apicid is not optimal.

Broadcast IPI will need at least one cache line per cpu to access this field.

__x2apic_send_IPI_mask() is using standard bitmask operators.

By converting x86_cpu_to_logical_apicid to an array, we divide by 16x number of needed cache lines, because we find 16 values per cache line. CPU prefetcher can kick nicely.

Also move @cluster_masks to READ_MOSTLY section to avoid false sharing.

Tested on a dual socket host with 256 cpus, cost for a full broadcast is now 11 usec instead of 33 usec.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20211007143556.574911-1-eric.dumazet@gmail.com
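The cache-line arithmetic in the commit message above can be sketched in plain user-space C. This is a rough illustration, not the kernel code: the 64-byte cache line, the 32-bit value width, the cache-line-aligned array start, and every identifier below (`sketch_cpu_to_logical_apicid`, `lines_touched_percpu`, `lines_touched_array`, `or_logical_ids`) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only: 64-byte cache lines, 32-bit IDs. */
#define CACHE_LINE_BYTES 64u
#define SKETCH_MAX_CPUS  256u

/* Hypothetical stand-in for a contiguous logical APIC ID array. */
static uint32_t sketch_cpu_to_logical_apicid[SKETCH_MAX_CPUS];

/*
 * Per-CPU layout: each CPU's value lives in that CPU's own per-CPU
 * area, so reading it for every CPU touches at least one cache line
 * per CPU.
 */
static size_t lines_touched_percpu(size_t ncpus)
{
    return ncpus;
}

/*
 * Contiguous array layout: 64 / 4 = 16 values share one cache line.
 * Assumes the array starts on a cache-line boundary.
 */
static size_t lines_touched_array(size_t ncpus)
{
    size_t bytes = ncpus * sizeof(uint32_t);

    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}

/*
 * OR together the IDs of CPUs [first, first + count). The sequential
 * walk over one array is the access pattern a hardware prefetcher
 * handles well.
 */
static uint32_t or_logical_ids(size_t first, size_t count)
{
    uint32_t dest = 0;

    assert(first + count <= SKETCH_MAX_CPUS);
    for (size_t i = first; i < first + count; i++)
        dest |= sketch_cpu_to_logical_apicid[i];
    return dest;
}
```

Under these assumptions, a 256-CPU broadcast reads 16 cache lines from the array instead of at least 256 with per-cpu storage, which is the 16x reduction the message describes.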
| Name | | |
|---|---|---|
| .. | ||
| apic_common.c | ||
| apic_flat_64.c | ||
| apic_noop.c | ||
| apic_numachip.c | ||
| apic.c | ||
| bigsmp_32.c | ||
| hw_nmi.c | ||
| io_apic.c | ||
| ipi.c | ||
| local.h | ||
| Makefile | ||
| msi.c | ||
| probe_32.c | ||
| probe_64.c | ||
| vector.c | ||
| x2apic_cluster.c | ||
| x2apic_phys.c | ||
| x2apic_uv_x.c | ||