* [PATCH bpf-next v3] bpf: Check map->usercnt after timer->timer is assigned
@ 2023-10-30 6:36 Hou Tao
2023-11-02 6:00 ` patchwork-bot+netdevbpf
0 siblings, 1 reply; 2+ messages in thread
From: Hou Tao @ 2023-10-30 6:36 UTC (permalink / raw)
To: bpf
Cc: Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko, Song Liu,
Hao Luo, Yonghong Song, Daniel Borkmann, KP Singh,
Stanislav Fomichev, Jiri Olsa, John Fastabend, Hsin-Wei Hung,
houtao1
From: Hou Tao <houtao1@huawei.com>
When a uref release and bpf_timer_init() run concurrently, the sequence
shown in the diagram below is possible. It breaks the guarantee provided
by bpf_timer that a timer does not outlive the last user reference: the
bpf_timer is still alive after the userspace application releases or
unpins the map. It also leads to a kmemleak report on older kernel
versions that do not release the bpf_timer when the map is released.

bpf program X:

bpf_timer_init()
  lock timer->lock
    read timer->timer as NULL
    read map->usercnt != 0

                process Y:

                close(map_fd)
                  // put last uref
                  bpf_map_put_uref()
                    atomic_dec_and_test(map->usercnt)
                      array_map_free_timers()
                        bpf_timer_cancel_and_free()
                          // just return
                          read timer->timer is NULL

    t = bpf_map_kmalloc_node()
    timer->timer = t
  unlock timer->lock

Fix the problem by checking map->usercnt after timer->timer is assigned,
so that when a uref release and a bpf timer init run concurrently, either
bpf_timer_cancel_and_free() called from the uref release reads a non-NULL
timer, or the newly added atomic64_read() returns a zero usercnt.

Because atomic_dec_and_test(map->usercnt) and READ_ONCE(timer->timer)
in bpf_timer_cancel_and_free() are not protected by a lock, add a memory
barrier to guarantee the ordering between map->usercnt and timer->timer.
Also use WRITE_ONCE(timer->timer, x) to match the lockless read of
timer->timer in bpf_timer_cancel_and_free().
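
The ordering argument above can be tried out in userspace. What follows
is only a rough sketch under stated assumptions, not kernel code: the
model_* names, the pthread mutex standing in for timer->lock, and the
malloc(64) standing in for bpf_map_kmalloc_node() are all hypothetical.
C11 relaxed atomics play the role of READ_ONCE()/WRITE_ONCE(), and a
seq_cst fence plays the role of smp_mb(). The kernel's
atomic_dec_and_test() is fully ordered; C11 read-modify-write operations
do not order a later relaxed load in the same way, so the model adds an
explicit fence after the decrement.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for struct bpf_map / bpf_timer_kern. */
struct model_map {
	atomic_long usercnt;
};

struct model_timer {
	pthread_mutex_t lock;	/* stands in for timer->lock */
	_Atomic(void *) timer;	/* stands in for timer->timer */
	struct model_map *map;
};

/* Models bpf_timer_init() with the usercnt check moved after the store. */
static int model_timer_init(struct model_timer *timer)
{
	int ret = 0;
	void *t;

	pthread_mutex_lock(&timer->lock);
	if (atomic_load_explicit(&timer->timer, memory_order_relaxed)) {
		ret = -EBUSY;
		goto out;
	}
	t = malloc(64);
	if (!t) {
		ret = -ENOMEM;
		goto out;
	}
	/* WRITE_ONCE(timer->timer, t), then smp_mb() */
	atomic_store_explicit(&timer->timer, t, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	if (!atomic_load_explicit(&timer->map->usercnt, memory_order_relaxed)) {
		/* The last uref is already gone: undo the publication. */
		atomic_store_explicit(&timer->timer, NULL, memory_order_relaxed);
		free(t);
		ret = -EPERM;
	}
out:
	pthread_mutex_unlock(&timer->lock);
	return ret;
}

/* Models bpf_map_put_uref() followed by bpf_timer_cancel_and_free(). */
static void model_put_uref(struct model_timer *timer)
{
	void *t;

	if (atomic_fetch_sub(&timer->map->usercnt, 1) != 1)
		return;
	/* Explicit fence: models the full ordering of atomic_dec_and_test(). */
	atomic_thread_fence(memory_order_seq_cst);
	/* Lockless check, like READ_ONCE(timer->timer): "just return". */
	if (!atomic_load_explicit(&timer->timer, memory_order_relaxed))
		return;
	pthread_mutex_lock(&timer->lock);
	t = atomic_load_explicit(&timer->timer, memory_order_relaxed);
	atomic_store_explicit(&timer->timer, NULL, memory_order_relaxed);
	pthread_mutex_unlock(&timer->lock);
	free(t);	/* free(NULL) is a no-op if init rolled back first */
}

static void *init_thread(void *arg)
{
	model_timer_init(arg);
	return NULL;
}

static void *put_thread(void *arg)
{
	model_put_uref(arg);
	return NULL;
}

/* Races one init against the last uref put; returns 1 if no timer survives. */
static int model_race_once(void)
{
	struct model_map map;
	struct model_timer timer;
	pthread_t x, y;
	int rc;

	atomic_init(&map.usercnt, 1);
	atomic_init(&timer.timer, NULL);
	pthread_mutex_init(&timer.lock, NULL);
	timer.map = &map;
	rc = pthread_create(&x, NULL, init_thread, &timer);
	assert(rc == 0);
	rc = pthread_create(&y, NULL, put_thread, &timer);
	assert(rc == 0);
	(void)rc;
	pthread_join(x, NULL);
	pthread_join(y, NULL);
	pthread_mutex_destroy(&timer.lock);
	return atomic_load(&timer.timer) == NULL;
}
```

Calling model_race_once() in a loop should always return 1: either the
init side observes usercnt == 0 and rolls back, or the release side
observes a non-NULL timer and frees it. Moving the usercnt check back
in front of the allocation in model_timer_init() reintroduces the
window shown in the diagram, although a stress loop may not hit it on
every machine.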
Reported-by: Hsin-Wei Hung <hsinweih@uci.edu>
Closes: https://lore.kernel.org/bpf/CABcoxUaT2k9hWsS1tNgXyoU3E-=PuOgMn737qK984fbFmfYixQ@mail.gmail.com
Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
v3:
  * patch #1: only check map->usercnt once and call kfree() with the
              spin-lock held in the error handling path (Alexei).
              Update the commit message to explain that the patch only
              fixes the broken-guarantee problem for bpf_timer. The
              kmemleak problem will be fixed by another patchset.
  * patch #2: remove the selftest patch because it demonstrates the
              use-after-free problem for map-in-map, and the kmemleak
              problem is only a symptom of it. It will be re-added and
              refined in another patchset.
v2: https://lore.kernel.org/bpf/20231020014214.2471419-1-houtao@huaweicloud.com
  * patch #1: use smp_mb() instead of smp_mb__before_atomic()
  * patch #2: use WRITE_ONCE(timer->timer, x) to match the lockless read
              of timer->timer
v1: https://lore.kernel.org/bpf/20231017125717.241101-1-houtao@huaweicloud.com
kernel/bpf/helpers.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index e46ac288a1080..aed93df5c8aa0 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1177,13 +1177,6 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map
 		ret = -EBUSY;
 		goto out;
 	}
-	if (!atomic64_read(&map->usercnt)) {
-		/* maps with timers must be either held by user space
-		 * or pinned in bpffs.
-		 */
-		ret = -EPERM;
-		goto out;
-	}
 	/* allocate hrtimer via map_kmalloc to use memcg accounting */
 	t = bpf_map_kmalloc_node(map, sizeof(*t), GFP_ATOMIC, map->numa_node);
 	if (!t) {
@@ -1196,7 +1189,21 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map
 	rcu_assign_pointer(t->callback_fn, NULL);
 	hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT);
 	t->timer.function = bpf_timer_cb;
-	timer->timer = t;
+	WRITE_ONCE(timer->timer, t);
+	/* Guarantee the order between timer->timer and map->usercnt. So
+	 * when there are concurrent uref release and bpf timer init, either
+	 * bpf_timer_cancel_and_free() called by uref release reads a no-NULL
+	 * timer or atomic64_read() below returns a zero usercnt.
+	 */
+	smp_mb();
+	if (!atomic64_read(&map->usercnt)) {
+		/* maps with timers must be either held by user space
+		 * or pinned in bpffs.
+		 */
+		WRITE_ONCE(timer->timer, NULL);
+		kfree(t);
+		ret = -EPERM;
+	}
 out:
 	__bpf_spin_unlock_irqrestore(&timer->lock);
 	return ret;
@@ -1374,7 +1381,7 @@ void bpf_timer_cancel_and_free(void *val)
 	/* The subsequent bpf_timer_start/cancel() helpers won't be able to use
 	 * this timer, since it won't be initialized.
 	 */
-	timer->timer = NULL;
+	WRITE_ONCE(timer->timer, NULL);
 out:
 	__bpf_spin_unlock_irqrestore(&timer->lock);
 	if (!t)
--
2.29.2
* Re: [PATCH bpf-next v3] bpf: Check map->usercnt after timer->timer is assigned
From: patchwork-bot+netdevbpf @ 2023-11-02 6:00 UTC (permalink / raw)
To: Hou Tao
Cc: bpf, martin.lau, alexei.starovoitov, andrii, song, haoluo,
yonghong.song, daniel, kpsingh, sdf, jolsa, john.fastabend,
hsinweih, houtao1
Hello:
This patch was applied to bpf/bpf.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Mon, 30 Oct 2023 14:36:16 +0800 you wrote:
> From: Hou Tao <houtao1@huawei.com>
>
> When there are concurrent uref release and bpf timer init operations,
> the following sequence diagram is possible. It will break the guarantee
> provided by bpf_timer: bpf_timer will still be alive after userspace
> application releases or unpins the map. It also will lead to kmemleak
> for old kernel version which doesn't release bpf_timer when map is
> released.
>
> [...]
Here is the summary with links:
- [bpf-next,v3] bpf: Check map->usercnt after timer->timer is assigned
https://git.kernel.org/bpf/bpf/c/fd381ce60a2d
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html