mirror of
https://git.proxmox.com/git/mirror_ubuntu-kernels.git
synced 2025-11-23 16:24:15 +00:00
So the problem this patch is trying to address is as follows:
CPU0 CPU1
context_switch(A, B)
ttwu(A)
LOCK A->pi_lock
A->on_cpu == 0
finish_task_switch(A)
prev_state = A->state <-.
WMB |
A->on_cpu = 0; |
UNLOCK rq0->lock |
| context_switch(C, A)
`-- A->state = TASK_DEAD
prev_state == TASK_DEAD
put_task_struct(A)
context_switch(A, C)
finish_task_switch(A)
A->state == TASK_DEAD
put_task_struct(A)
The argument being that the WMB will allow the load of A->state on CPU0
to cross over and observe CPU1's store of A->state, which will then
result in a double-drop and use-after-free.
Now the comment states (and this was true once upon a long time ago)
that we need to observe A->state while holding rq->lock because that
will order us against the wakeup; however the wakeup will not in fact
acquire (that) rq->lock; it takes A->pi_lock these days.
We can obviously fix this by upgrading the WMB to an MB, but that is
expensive, so we'd rather avoid that.
The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
which avoids the MB on some archs, but not important ones like ARM.
Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org> # v3.1+
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manfred@colorfullife.com
Cc: will.deacon@arm.com
Fixes:
|
||
|---|---|---|
| .. | ||
| auto_group.c | ||
| auto_group.h | ||
| clock.c | ||
| completion.c | ||
| core.c | ||
| cpuacct.c | ||
| cpuacct.h | ||
| cpudeadline.c | ||
| cpudeadline.h | ||
| cpupri.c | ||
| cpupri.h | ||
| cputime.c | ||
| deadline.c | ||
| debug.c | ||
| fair.c | ||
| features.h | ||
| idle_task.c | ||
| idle.c | ||
| loadavg.c | ||
| Makefile | ||
| rt.c | ||
| sched.h | ||
| stats.c | ||
| stats.h | ||
| stop_task.c | ||
| wait.c | ||