* [PATCH bpf-next v3 0/4] bpf: Fix tailcall infinite loop caused by freplace
@ 2024-09-23 13:40 Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map Leon Hwang
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Leon Hwang @ 2024-09-23 13:40 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, eddyz87, iii, leon.hwang, kernel-patches-bot
Previously, I fixed a tailcall infinite loop issue caused by trampoline[0].
At this time, I fix a tailcall infinite loop issue caused by tailcall and
freplace combination by preventing updating extended prog to prog_array map
and preventing extending tail callee prog with freplace.
v2 -> v3:
* Address comments from Alexei:
* Stop hacking JIT.
* Prevent the specific use case at attach/update time.
v1 -> v2:
* Address comment from Eduard:
* Explain why nop5 and xor/nop3 are swapped at prologue.
* Address comment from Alexei:
* Disallow attaching tail_call_reachable freplace prog to
not-tail_call_reachable target in verifier.
* Update "bpf, arm64: Fix tailcall infinite loop caused by freplace" with
latest arm64 JIT code.
Links:
[0] https://lore.kernel.org/bpf/20230912150442.2009-1-hffilwlqm@gmail.com/
Leon Hwang (4):
bpf: Prevent updating extended prog to prog_array map
bpf: Prevent extending tail callee prog with freplace
selftests/bpf: Add a test case to confirm a tailcall infinite loop
issue has been prevented
selftests/bpf: Add cases to test tailcall in freplace
include/linux/bpf.h | 2 +
kernel/bpf/arraymap.c | 13 +-
kernel/bpf/syscall.c | 19 +-
.../selftests/bpf/prog_tests/tailcalls.c | 196 +++++++++++++++++-
.../tailcall_bpf2bpf_hierarchy_freplace.c | 30 +++
.../testing/selftests/bpf/progs/tc_bpf2bpf.c | 37 +++-
6 files changed, 290 insertions(+), 7 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c
--
2.44.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map
2024-09-23 13:40 [PATCH bpf-next v3 0/4] bpf: Fix tailcall infinite loop caused by freplace Leon Hwang
@ 2024-09-23 13:40 ` Leon Hwang
2024-09-25 1:24 ` Eduard Zingerman
2024-09-23 13:40 ` [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace Leon Hwang
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Leon Hwang @ 2024-09-23 13:40 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, eddyz87, iii, leon.hwang, kernel-patches-bot
This patch prevents a potential infinite loop issue caused by combination
of tailcal and freplace partially.
For example:
tc_bpf2bpf.c:
// SPDX-License-Identifier: GPL-2.0
\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
__noinline
int subprog_tc(struct __sk_buff *skb)
{
return skb->len * 2;
}
SEC("tc")
int entry_tc(struct __sk_buff *skb)
{
return subprog_tc(skb);
}
char __license[] SEC("license") = "GPL";
tailcall_bpf2bpf_hierarchy_freplace.c:
// SPDX-License-Identifier: GPL-2.0
\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");
int count = 0;
static __noinline
int subprog_tail(struct __sk_buff *skb)
{
bpf_tail_call_static(skb, &jmp_table, 0);
return 0;
}
SEC("freplace")
int entry_freplace(struct __sk_buff *skb)
{
count++;
subprog_tail(skb);
subprog_tail(skb);
return count;
}
char __license[] SEC("license") = "GPL";
The attach target of entry_freplace is subprog_tc, and the tail callee
in subprog_tail is entry_tc.
Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace
-> subprog_tail --tailcall-> entry_tc, because tail_call_cnt in
entry_freplace will count from zero for every time of entry_freplace
execution. Kernel will panic:
[ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____)
(stack is (____ptrval____)..(____ptrval____))
[ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI
[ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE
6.10.0-rc6-g026dcdae8d3e-dirty #72
[ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX,
1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
[ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
[ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
0000000000008cb5
[ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
ffff9c98808b7e00
[ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
0000000000000000
[ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffb500c01af000
[ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
0000000000000000
[ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
knlGS:0000000000000000
[ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
00000000000006f0
[ 15.310490] Call Trace:
[ 15.310490] <#DF>
[ 15.310490] ? die+0x36/0x90
[ 15.310490] ? handle_stack_overflow+0x4d/0x60
[ 15.310490] ? exc_double_fault+0x117/0x1a0
[ 15.310490] ? asm_exc_double_fault+0x23/0x30
[ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] </#DF>
[ 15.310490] <TASK>
[ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64
[ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
[ 15.310490] ...
[ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64
[ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
[ 15.310490] bpf_test_run+0x210/0x370
[ 15.310490] ? bpf_test_run+0x128/0x370
[ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0
[ 15.310490] __sys_bpf+0xdbf/0x2c40
[ 15.310490] ? clockevents_program_event+0x52/0xf0
[ 15.310490] ? lock_release+0xbf/0x290
[ 15.310490] __x64_sys_bpf+0x1e/0x30
[ 15.310490] do_syscall_64+0x68/0x140
[ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 15.310490] RIP: 0033:0x7f133b52725d
[ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa
48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48
[ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX:
0000000000000141
[ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX:
00007f133b52725d
[ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI:
000000000000000a
[ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09:
00007ffddbc102a0
[ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12:
0000000000000004
[ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15:
00007f133b6ed000
[ 15.310490] </TASK>
[ 15.310490] Modules linked in: bpf_testmod(OE)
[ 15.310490] ---[ end trace 0000000000000000 ]---
[ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
[ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
[ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
0000000000008cb5
[ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
ffff9c98808b7e00
[ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
0000000000000000
[ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffb500c01af000
[ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
0000000000000000
[ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
knlGS:0000000000000000
[ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
00000000000006f0
[ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt
[ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
This patch partially prevents this panic by preventing updating extended
prog to prog_array map.
If a prog or its subprog has been extended by freplace prog, the prog
can not be updated to prog_array map.
Alongside next patch, the panic will be prevented completely.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/bpf.h | 1 +
kernel/bpf/arraymap.c | 7 ++++++-
kernel/bpf/syscall.c | 7 ++++++-
3 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0c3893c471711..048aa2625cbef 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1483,6 +1483,7 @@ struct bpf_prog_aux {
bool xdp_has_frags;
bool exception_cb;
bool exception_boundary;
+ bool is_extended; /* true if extended by freplace program */
struct bpf_arena *arena;
/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
const struct btf_type *attach_func_proto;
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 79660e3fca4c1..8d97bae98fa70 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -951,7 +951,12 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
if (IS_ERR(prog))
return prog;
- if (!bpf_prog_map_compatible(map, prog)) {
+ if (!bpf_prog_map_compatible(map, prog) || prog->aux->is_extended) {
+ /* Extended prog can not be tail callee. It's to prevent a
+ * potential infinite loop like:
+ * tail callee prog entry -> tail callee prog subprog ->
+ * freplace prog entry --tailcall-> tail callee prog entry.
+ */
bpf_prog_put(prog);
return ERR_PTR(-EINVAL);
}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 8a4117f6d7610..18b3f9216b050 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3292,8 +3292,11 @@ static void bpf_tracing_link_release(struct bpf_link *link)
bpf_trampoline_put(tr_link->trampoline);
/* tgt_prog is NULL if target is a kernel function */
- if (tr_link->tgt_prog)
+ if (tr_link->tgt_prog) {
+ if (link->prog->type == BPF_PROG_TYPE_EXT)
+ tr_link->tgt_prog->aux->is_extended = false;
bpf_prog_put(tr_link->tgt_prog);
+ }
}
static void bpf_tracing_link_dealloc(struct bpf_link *link)
@@ -3523,6 +3526,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
if (prog->aux->dst_trampoline && tr != prog->aux->dst_trampoline)
/* we allocated a new trampoline, so free the old one */
bpf_trampoline_put(prog->aux->dst_trampoline);
+ if (prog->type == BPF_PROG_TYPE_EXT)
+ tgt_prog->aux->is_extended = true;
prog->aux->dst_prog = NULL;
prog->aux->dst_trampoline = NULL;
--
2.44.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace
2024-09-23 13:40 [PATCH bpf-next v3 0/4] bpf: Fix tailcall infinite loop caused by freplace Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map Leon Hwang
@ 2024-09-23 13:40 ` Leon Hwang
2024-09-25 5:32 ` Eduard Zingerman
2024-09-23 13:40 ` [PATCH bpf-next v3 3/4] selftests/bpf: Add a test case to confirm a tailcall infinite loop issue has been prevented Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 4/4] selftests/bpf: Add cases to test tailcall in freplace Leon Hwang
3 siblings, 1 reply; 11+ messages in thread
From: Leon Hwang @ 2024-09-23 13:40 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, eddyz87, iii, leon.hwang, kernel-patches-bot
Alongside previous patch, the infinite loop issue caused by combination of
tailcal and freplace can be prevented completely.
The previous patch can not prevent the use case that updates a prog to
prog_array map and then extends the prog with freplace prog.
This patch fixes the case by preventing extending a prog, which has been
updated to prog_array map, with freplace prog.
If a prog has been updated to prog_array map, it or its subprog can not
be extended by freplace prog.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/bpf.h | 1 +
kernel/bpf/arraymap.c | 6 +++++-
kernel/bpf/syscall.c | 12 ++++++++++++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 048aa2625cbef..b864b37e67c17 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1484,6 +1484,7 @@ struct bpf_prog_aux {
bool exception_cb;
bool exception_boundary;
bool is_extended; /* true if extended by freplace program */
+ atomic_t tail_callee_cnt;
struct bpf_arena *arena;
/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
const struct btf_type *attach_func_proto;
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 8d97bae98fa70..c12e0e3bf6ad0 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -961,13 +961,17 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
return ERR_PTR(-EINVAL);
}
+ atomic_inc(&prog->aux->tail_callee_cnt);
return prog;
}
static void prog_fd_array_put_ptr(struct bpf_map *map, void *ptr, bool need_defer)
{
+ struct bpf_prog *prog = ptr;
+
/* bpf_prog is freed after one RCU or tasks trace grace period */
- bpf_prog_put(ptr);
+ atomic_dec(&prog->aux->tail_callee_cnt);
+ bpf_prog_put(prog);
}
static u32 prog_fd_array_sys_lookup_elem(void *ptr)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 18b3f9216b050..be829016d8182 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3501,6 +3501,18 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
tgt_prog = prog->aux->dst_prog;
}
+ if (prog->type == BPF_PROG_TYPE_EXT &&
+ atomic_read(&tgt_prog->aux->tail_callee_cnt)) {
+ /* Program extensions can not extend target prog when the target
+ * prog has been updated to any prog_array map as tail callee.
+ * It's to prevent a potential infinite loop like:
+ * tgt prog entry -> tgt prog subprog -> freplace prog entry
+ * --tailcall-> tgt prog entry.
+ */
+ err = -EINVAL;
+ goto out_unlock;
+ }
+
err = bpf_link_prime(&link->link.link, &link_primer);
if (err)
goto out_unlock;
--
2.44.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH bpf-next v3 3/4] selftests/bpf: Add a test case to confirm a tailcall infinite loop issue has been prevented
2024-09-23 13:40 [PATCH bpf-next v3 0/4] bpf: Fix tailcall infinite loop caused by freplace Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace Leon Hwang
@ 2024-09-23 13:40 ` Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 4/4] selftests/bpf: Add cases to test tailcall in freplace Leon Hwang
3 siblings, 0 replies; 11+ messages in thread
From: Leon Hwang @ 2024-09-23 13:40 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, eddyz87, iii, leon.hwang, kernel-patches-bot
cd tools/testing/selftests/bpf; ./test_progs -t tailcalls
335/26 tailcalls/tailcall_bpf2bpf_freplace:OK
335 tailcalls:OK
Summary: 1/26 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
.../selftests/bpf/prog_tests/tailcalls.c | 99 ++++++++++++++++++-
.../testing/selftests/bpf/progs/tc_bpf2bpf.c | 24 ++++-
2 files changed, 119 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/tailcalls.c b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
index 21c5a37846ade..fa3f3bb11b098 100644
--- a/tools/testing/selftests/bpf/prog_tests/tailcalls.c
+++ b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
@@ -1525,7 +1525,8 @@ static void test_tailcall_freplace(void)
prog_fd = bpf_program__fd(tc_skel->progs.entry_tc);
freplace_prog = freplace_skel->progs.entry_freplace;
- err = bpf_program__set_attach_target(freplace_prog, prog_fd, "subprog");
+ err = bpf_program__set_attach_target(freplace_prog, prog_fd,
+ "subprog_tailcall_tc");
if (!ASSERT_OK(err, "set_attach_target"))
goto out;
@@ -1534,7 +1535,7 @@ static void test_tailcall_freplace(void)
goto out;
freplace_link = bpf_program__attach_freplace(freplace_prog, prog_fd,
- "subprog");
+ "subprog_tailcall_tc");
if (!ASSERT_OK_PTR(freplace_link, "attach_freplace"))
goto out;
@@ -1556,6 +1557,98 @@ static void test_tailcall_freplace(void)
tailcall_freplace__destroy(freplace_skel);
}
+/* test_tailcall_bpf2bpf_freplace checks the failure that fails to attach a tail
+ * callee prog with freplace prog or fails to update an extended prog to
+ * prog_array map.
+ */
+static void test_tailcall_bpf2bpf_freplace(void)
+{
+ struct tailcall_freplace *freplace_skel = NULL;
+ struct bpf_link *freplace_link = NULL;
+ struct tc_bpf2bpf *tc_skel = NULL;
+ char buff[128] = {};
+ int prog_fd, map_fd;
+ int err, key;
+
+ LIBBPF_OPTS(bpf_test_run_opts, topts,
+ .data_in = buff,
+ .data_size_in = sizeof(buff),
+ .repeat = 1,
+ );
+
+ tc_skel = tc_bpf2bpf__open_and_load();
+ if (!ASSERT_OK_PTR(tc_skel, "tc_bpf2bpf__open_and_load"))
+ goto out;
+
+ prog_fd = bpf_program__fd(tc_skel->progs.entry_tc);
+ freplace_skel = tailcall_freplace__open();
+ if (!ASSERT_OK_PTR(freplace_skel, "tailcall_freplace__open"))
+ goto out;
+
+ err = bpf_program__set_attach_target(freplace_skel->progs.entry_freplace,
+ prog_fd, "subprog_tc");
+ if (!ASSERT_OK(err, "set_attach_target"))
+ goto out;
+
+ err = tailcall_freplace__load(freplace_skel);
+ if (!ASSERT_OK(err, "tailcall_freplace__load"))
+ goto out;
+
+ /* OK to attach then detach freplace prog. */
+
+ freplace_link = bpf_program__attach_freplace(freplace_skel->progs.entry_freplace,
+ prog_fd, "subprog_tc");
+ if (!ASSERT_OK_PTR(freplace_link, "attach_freplace"))
+ goto out;
+
+ err = bpf_link__destroy(freplace_link);
+ if (!ASSERT_OK(err, "destroy link"))
+ goto out;
+
+ /* OK to update prog_array map then delete element from the map. */
+
+ key = 0;
+ map_fd = bpf_map__fd(freplace_skel->maps.jmp_table);
+ err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);
+ if (!ASSERT_OK(err, "update jmp_table"))
+ goto out;
+
+ err = bpf_map_delete_elem(map_fd, &key);
+ if (!ASSERT_OK(err, "delete_elem from jmp_table"))
+ goto out;
+
+ /* Fail to attach a tail callee prog with freplace prog. */
+
+ err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);
+ if (!ASSERT_OK(err, "update jmp_table"))
+ goto out;
+
+ freplace_link = bpf_program__attach_freplace(freplace_skel->progs.entry_freplace,
+ prog_fd, "subprog_tc");
+ if (!ASSERT_ERR_PTR(freplace_link, "attach_freplace failure"))
+ goto out;
+
+ err = bpf_map_delete_elem(map_fd, &key);
+ if (!ASSERT_OK(err, "delete_elem from jmp_table"))
+ goto out;
+
+ /* Fail to update an extended prog to prog_array map. */
+
+ freplace_link = bpf_program__attach_freplace(freplace_skel->progs.entry_freplace,
+ prog_fd, "subprog_tc");
+ if (!ASSERT_OK_PTR(freplace_link, "attach_freplace"))
+ goto out;
+
+ err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);
+ if (!ASSERT_ERR(err, "update jmp_table failure"))
+ goto out;
+
+out:
+ bpf_link__destroy(freplace_link);
+ tailcall_freplace__destroy(freplace_skel);
+ tc_bpf2bpf__destroy(tc_skel);
+}
+
void test_tailcalls(void)
{
if (test__start_subtest("tailcall_1"))
@@ -1606,4 +1699,6 @@ void test_tailcalls(void)
test_tailcall_bpf2bpf_hierarchy_3();
if (test__start_subtest("tailcall_freplace"))
test_tailcall_freplace();
+ if (test__start_subtest("tailcall_bpf2bpf_freplace"))
+ test_tailcall_bpf2bpf_freplace();
}
diff --git a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
index 8a0632c37839a..34f3c780194e4 100644
--- a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
+++ b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
@@ -4,11 +4,30 @@
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
+struct {
+ __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+ __uint(max_entries, 1);
+ __uint(key_size, sizeof(__u32));
+ __uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+__noinline
+int subprog_tailcall_tc(struct __sk_buff *skb)
+{
+ int ret = 1;
+
+ bpf_tail_call_static(skb, &jmp_table, 0);
+ __sink(skb);
+ __sink(ret);
+ return ret;
+}
+
__noinline
-int subprog(struct __sk_buff *skb)
+int subprog_tc(struct __sk_buff *skb)
{
int ret = 1;
+ __sink(skb);
__sink(ret);
return ret;
}
@@ -16,7 +35,8 @@ int subprog(struct __sk_buff *skb)
SEC("tc")
int entry_tc(struct __sk_buff *skb)
{
- return subprog(skb);
+ subprog_tc(skb);
+ return subprog_tailcall_tc(skb);
}
char __license[] SEC("license") = "GPL";
--
2.44.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH bpf-next v3 4/4] selftests/bpf: Add cases to test tailcall in freplace
2024-09-23 13:40 [PATCH bpf-next v3 0/4] bpf: Fix tailcall infinite loop caused by freplace Leon Hwang
` (2 preceding siblings ...)
2024-09-23 13:40 ` [PATCH bpf-next v3 3/4] selftests/bpf: Add a test case to confirm a tailcall infinite loop issue has been prevented Leon Hwang
@ 2024-09-23 13:40 ` Leon Hwang
3 siblings, 0 replies; 11+ messages in thread
From: Leon Hwang @ 2024-09-23 13:40 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, eddyz87, iii, leon.hwang, kernel-patches-bot
cd tools/testing/selftests/bpf; ./test_progs -t tailcalls
335/27 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_1:OK
335/28 tailcalls/tailcall_bpf2bpf_hierarchy_freplace_2:OK
335 tailcalls:OK
Summary: 1/28 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
.../selftests/bpf/prog_tests/tailcalls.c | 97 +++++++++++++++++++
.../tailcall_bpf2bpf_hierarchy_freplace.c | 30 ++++++
.../testing/selftests/bpf/progs/tc_bpf2bpf.c | 13 +++
3 files changed, 140 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c
diff --git a/tools/testing/selftests/bpf/prog_tests/tailcalls.c b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
index fa3f3bb11b098..0564ad6c9b288 100644
--- a/tools/testing/selftests/bpf/prog_tests/tailcalls.c
+++ b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
@@ -5,6 +5,7 @@
#include "tailcall_poke.skel.h"
#include "tailcall_bpf2bpf_hierarchy2.skel.h"
#include "tailcall_bpf2bpf_hierarchy3.skel.h"
+#include "tailcall_bpf2bpf_hierarchy_freplace.skel.h"
#include "tailcall_freplace.skel.h"
#include "tc_bpf2bpf.skel.h"
@@ -1649,6 +1650,98 @@ static void test_tailcall_bpf2bpf_freplace(void)
tc_bpf2bpf__destroy(tc_skel);
}
+static void test_tailcall_bpf2bpf_hierarchy_freplace(bool freplace_subprog,
+ bool target_entry2,
+ int count1, int count2)
+{
+ struct tailcall_bpf2bpf_hierarchy_freplace *freplace_skel = NULL;
+ struct bpf_link *freplace_link = NULL;
+ struct tc_bpf2bpf *tc_skel = NULL;
+ int prog_fd, map_fd;
+ char buff[128] = {};
+ int err, key, val;
+
+ LIBBPF_OPTS(bpf_test_run_opts, topts,
+ .data_in = buff,
+ .data_size_in = sizeof(buff),
+ .repeat = 1,
+ );
+
+ tc_skel = tc_bpf2bpf__open_and_load();
+ if (!ASSERT_OK_PTR(tc_skel, "tc_bpf2bpf__open_and_load"))
+ goto out;
+
+ prog_fd = bpf_program__fd(target_entry2 ? tc_skel->progs.entry_tc_2 :
+ tc_skel->progs.entry_tc);
+ freplace_skel = tailcall_bpf2bpf_hierarchy_freplace__open();
+ if (!ASSERT_OK_PTR(freplace_skel, "tailcall_bpf2bpf_hierarchy_freplace__open"))
+ goto out;
+
+ err = bpf_program__set_attach_target(freplace_skel->progs.entry_freplace,
+ prog_fd, freplace_subprog ?
+ "subprog_tailcall_tc" : "entry_tc");
+ if (!ASSERT_OK(err, "set_attach_target"))
+ goto out;
+
+ err = tailcall_bpf2bpf_hierarchy_freplace__load(freplace_skel);
+ if (!ASSERT_OK(err, "tailcall_bpf2bpf_hierarchy_freplace__load"))
+ goto out;
+
+ val = bpf_program__fd(freplace_skel->progs.entry_freplace);
+ map_fd = bpf_map__fd(freplace_skel->maps.jmp_table);
+ key = 0;
+ err = bpf_map_update_elem(map_fd, &key, &val, BPF_ANY);
+ if (!ASSERT_OK(err, "update jmp_table"))
+ goto out;
+
+ freplace_link = bpf_program__attach_freplace(freplace_skel->progs.entry_freplace,
+ prog_fd, freplace_subprog ?
+ "subprog_tailcall_tc" : "entry_tc");
+ if (!ASSERT_OK_PTR(freplace_link, "attach_freplace"))
+ goto out;
+
+ err = bpf_prog_test_run_opts(prog_fd, &topts);
+ ASSERT_OK(err, "test_run_opts");
+ ASSERT_EQ(topts.retval, count1, "test_run_opts retval");
+
+ err = bpf_map_delete_elem(map_fd, &key);
+ if (!ASSERT_OK(err, "delete_elem from jmp_table"))
+ goto out;
+
+ err = bpf_prog_test_run_opts(prog_fd, &topts);
+ ASSERT_OK(err, "test_run_opts again");
+ ASSERT_EQ(topts.retval, count2, "test_run_opts retval again");
+
+out:
+ bpf_link__destroy(freplace_link);
+ tailcall_bpf2bpf_hierarchy_freplace__destroy(freplace_skel);
+ tc_bpf2bpf__destroy(tc_skel);
+}
+
+/* test_tailcall_bpf2bpf_hierarchy_freplace_1 checks the count value of tail
+ * call limit enforcement matches with expectation for the case:
+ *
+ * subprog_tail --tailcall-> entry_freplace
+ * entry_tc --jump-> entry_freplace <
+ * subprog_tail --tailcall-> entry_freplace
+ */
+static void test_tailcall_bpf2bpf_hierarchy_freplace_1(void)
+{
+ test_tailcall_bpf2bpf_hierarchy_freplace(false, false, 34, 35);
+}
+
+/* test_tailcall_bpf2bpf_hierarchy_freplace_2 checks the count value of tail
+ * call limit enforcement matches with expectation for the case:
+ *
+ * subprog_tail --tailcall-> entry_freplace
+ * entry_tc_2 --> subprog_tailcall_tc (call 10 times) --jump-> entry_freplace <
+ * subprog_tail --tailcall-> entry_freplace
+ */
+static void test_tailcall_bpf2bpf_hierarchy_freplace_2(void)
+{
+ test_tailcall_bpf2bpf_hierarchy_freplace(true, true, 340, 350);
+}
+
void test_tailcalls(void)
{
if (test__start_subtest("tailcall_1"))
@@ -1701,4 +1794,8 @@ void test_tailcalls(void)
test_tailcall_freplace();
if (test__start_subtest("tailcall_bpf2bpf_freplace"))
test_tailcall_bpf2bpf_freplace();
+ if (test__start_subtest("tailcall_bpf2bpf_hierarchy_freplace_1"))
+ test_tailcall_bpf2bpf_hierarchy_freplace_1();
+ if (test__start_subtest("tailcall_bpf2bpf_hierarchy_freplace_2"))
+ test_tailcall_bpf2bpf_hierarchy_freplace_2();
}
diff --git a/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c
new file mode 100644
index 0000000000000..6f7c1fac9ddb7
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_freplace.c
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+struct {
+ __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+ __uint(max_entries, 1);
+ __uint(key_size, sizeof(__u32));
+ __uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+int count = 0;
+
+static __noinline
+int subprog_tail(struct __sk_buff *skb)
+{
+ bpf_tail_call_static(skb, &jmp_table, 0);
+ return 0;
+}
+
+SEC("freplace")
+int entry_freplace(struct __sk_buff *skb)
+{
+ count++;
+ subprog_tail(skb);
+ subprog_tail(skb);
+ return count;
+}
+
+char __license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
index 34f3c780194e4..beacf60a52677 100644
--- a/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
+++ b/tools/testing/selftests/bpf/progs/tc_bpf2bpf.c
@@ -39,4 +39,17 @@ int entry_tc(struct __sk_buff *skb)
return subprog_tailcall_tc(skb);
}
+SEC("tc")
+int entry_tc_2(struct __sk_buff *skb)
+{
+ int ret, i;
+
+ for (i = 0; i < 10; i++) {
+ ret = subprog_tailcall_tc(skb);
+ __sink(ret);
+ }
+
+ return ret;
+}
+
char __license[] SEC("license") = "GPL";
--
2.44.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map
2024-09-23 13:40 ` [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map Leon Hwang
@ 2024-09-25 1:24 ` Eduard Zingerman
2024-09-26 7:16 ` Leon Hwang
0 siblings, 1 reply; 11+ messages in thread
From: Eduard Zingerman @ 2024-09-25 1:24 UTC (permalink / raw)
To: Leon Hwang, bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, iii, kernel-patches-bot
On Mon, 2024-09-23 at 21:40 +0800, Leon Hwang wrote:
[...]
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 8a4117f6d7610..18b3f9216b050 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3292,8 +3292,11 @@ static void bpf_tracing_link_release(struct bpf_link *link)
> bpf_trampoline_put(tr_link->trampoline);
>
> /* tgt_prog is NULL if target is a kernel function */
> - if (tr_link->tgt_prog)
> + if (tr_link->tgt_prog) {
> + if (link->prog->type == BPF_PROG_TYPE_EXT)
> + tr_link->tgt_prog->aux->is_extended = false;
> bpf_prog_put(tr_link->tgt_prog);
> + }
> }
>
> static void bpf_tracing_link_dealloc(struct bpf_link *link)
> @@ -3523,6 +3526,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
> if (prog->aux->dst_trampoline && tr != prog->aux->dst_trampoline)
> /* we allocated a new trampoline, so free the old one */
> bpf_trampoline_put(prog->aux->dst_trampoline);
> + if (prog->type == BPF_PROG_TYPE_EXT)
> + tgt_prog->aux->is_extended = true;
>
> prog->aux->dst_prog = NULL;
> prog->aux->dst_trampoline = NULL;
Sorry, this might be a silly question, I do not fully understand how
programs and trampolines are protected against concurrent update.
Sequence of actions in bpf_tracing_prog_attach():
a. call bpf_trampoline_link_prog(&link->link, tr)
this returns success if `tr->extension_prog` is NULL,
meaning trampoline is "free";
b. update tgt_prog->aux->is_extended = true.
Sequence of actions in bpf_tracing_link_release():
c. call bpf_trampoline_unlink_prog(&tr_link->link, tr_link->trampoline)
this sets `tr->extension_prog` to NULL;
d. update tr_link->tgt_prog->aux->is_extended = false.
In a concurrent environment, is it possible to have actions ordered as:
- thread #1: does bpf_tracing_link_release(link pointing to tgt_prog)
- thread #2: does bpf_tracing_prog_attach(some_prog, tgt_prog)
- thread #1: (c) tr->extension_prog is set to NULL
- thread #2: (a) tr->extension_prog is set to some_prog
- thread #2: (b) tgt_prog->aux->is_extended = true
- thread #1: (d) tgt_prog->aux->is_extended = false
Thus, loosing the is_extended mark?
(As far as I understand bpf_trampoline_compute_key() call in
bpf_tracing_prog_attach() it is possible for threads #1 and #2 to
operate on a same trampoline object).
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace
2024-09-23 13:40 ` [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace Leon Hwang
@ 2024-09-25 5:32 ` Eduard Zingerman
2024-09-26 7:19 ` Leon Hwang
0 siblings, 1 reply; 11+ messages in thread
From: Eduard Zingerman @ 2024-09-25 5:32 UTC (permalink / raw)
To: Leon Hwang, bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, iii, kernel-patches-bot
On Mon, 2024-09-23 at 21:40 +0800, Leon Hwang wrote:
[...]
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 048aa2625cbef..b864b37e67c17 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1484,6 +1484,7 @@ struct bpf_prog_aux {
> bool exception_cb;
> bool exception_boundary;
> bool is_extended; /* true if extended by freplace program */
> + atomic_t tail_callee_cnt;
Nit: the name is a bit misleading, this counts how many times the
program resides it prog maps. Confusing w/o additional comments.
Maybe something like 'member_of_prog_array_cnt'?
> struct bpf_arena *arena;
> /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
> const struct btf_type *attach_func_proto;
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 8d97bae98fa70..c12e0e3bf6ad0 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -961,13 +961,17 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
> return ERR_PTR(-EINVAL);
> }
>
> + atomic_inc(&prog->aux->tail_callee_cnt);
> return prog;
> }
[...]
> static u32 prog_fd_array_sys_lookup_elem(void *ptr)
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 18b3f9216b050..be829016d8182 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3501,6 +3501,18 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
> tgt_prog = prog->aux->dst_prog;
> }
>
> + if (prog->type == BPF_PROG_TYPE_EXT &&
> + atomic_read(&tgt_prog->aux->tail_callee_cnt)) {
> + /* Program extensions can not extend target prog when the target
> + * prog has been updated to any prog_array map as tail callee.
> + * It's to prevent a potential infinite loop like:
> + * tgt prog entry -> tgt prog subprog -> freplace prog entry
> + * --tailcall-> tgt prog entry.
> + */
> + err = -EINVAL;
> + goto out_unlock;
> + }
> +
> err = bpf_link_prime(&link->link.link, &link_primer);
> if (err)
> goto out_unlock;
Is it possible there is a race between map update and prog attach?
E.g. suppose the following sequence of events:
- thread #1 enters prog_fd_array_get_ptr()
- thread #1 successfully completes prog->aux->is_extended check (not extended)
- thread #2 enters bpf_tracing_prog_attach()
- thread #2 does atomic_read() for tgt_prog and it returns 0
- thread #2 proceeds attaching freplace to tgt_prog
- thread #1 does atomic_inc(&prog->aux->tail_callee_cnt)
Thus arriving to a state when tgt_prog is both a member of a map and
is freplaced. Is this a valid scenario?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map
2024-09-25 1:24 ` Eduard Zingerman
@ 2024-09-26 7:16 ` Leon Hwang
2024-09-27 12:23 ` Eduard Zingerman
0 siblings, 1 reply; 11+ messages in thread
From: Leon Hwang @ 2024-09-26 7:16 UTC (permalink / raw)
To: Eduard Zingerman, bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, iii, kernel-patches-bot
Hi Eduard,
Thank you for your review.
On 25/9/24 09:24, Eduard Zingerman wrote:
> On Mon, 2024-09-23 at 21:40 +0800, Leon Hwang wrote:
>
> [...]
>
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 8a4117f6d7610..18b3f9216b050 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -3292,8 +3292,11 @@ static void bpf_tracing_link_release(struct bpf_link *link)
>> bpf_trampoline_put(tr_link->trampoline);
>>
>> /* tgt_prog is NULL if target is a kernel function */
>> - if (tr_link->tgt_prog)
>> + if (tr_link->tgt_prog) {
>> + if (link->prog->type == BPF_PROG_TYPE_EXT)
>> + tr_link->tgt_prog->aux->is_extended = false;
>> bpf_prog_put(tr_link->tgt_prog);
>> + }
>> }
>>
>> static void bpf_tracing_link_dealloc(struct bpf_link *link)
>> @@ -3523,6 +3526,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
>> if (prog->aux->dst_trampoline && tr != prog->aux->dst_trampoline)
>> /* we allocated a new trampoline, so free the old one */
>> bpf_trampoline_put(prog->aux->dst_trampoline);
>> + if (prog->type == BPF_PROG_TYPE_EXT)
>> + tgt_prog->aux->is_extended = true;
>>
>> prog->aux->dst_prog = NULL;
>> prog->aux->dst_trampoline = NULL;
>
> Sorry, this might be a silly question, I do not fully understand how
> programs and trampolines are protected against concurrent update.
>
There's no protection against concurrent update.
> Sequence of actions in bpf_tracing_prog_attach():
> a. call bpf_trampoline_link_prog(&link->link, tr)
> this returns success if `tr->extension_prog` is NULL,
> meaning trampoline is "free";
> b. update tgt_prog->aux->is_extended = true.
>
> Sequence of actions in bpf_tracing_link_release():
> c. call bpf_trampoline_unlink_prog(&tr_link->link, tr_link->trampoline)
> this sets `tr->extension_prog` to NULL;
> d. update tr_link->tgt_prog->aux->is_extended = false.
>
> In a concurrent environment, is it possible to have actions ordered as:
> - thread #1: does bpf_tracing_link_release(link pointing to tgt_prog)
> - thread #2: does bpf_tracing_prog_attach(some_prog, tgt_prog)
> - thread #1: (c) tr->extension_prog is set to NULL
> - thread #2: (a) tr->extension_prog is set to some_prog
> - thread #2: (b) tgt_prog->aux->is_extended = true
> - thread #1: (d) tgt_prog->aux->is_extended = false
>
> Thus, loosing the is_extended mark?
Yes, you are correct.
>
> (As far as I understand bpf_trampoline_compute_key() call in
> bpf_tracing_prog_attach() it is possible for threads #1 and #2 to
> operate on a same trampoline object).
>
In order to avoid the above case:
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 6988e432fc3d..1f19b754623c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3281,6 +3281,9 @@ static void bpf_tracing_link_release(struct
bpf_link *link)
struct bpf_tracing_link *tr_link =
container_of(link, struct bpf_tracing_link, link.link);
+ if (link->prog->type == BPF_PROG_TYPE_EXT)
+ tr_link->tgt_prog->aux->is_extended = false;
+
WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
tr_link->trampoline));
@@ -3518,6 +3521,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog
*prog,
if (prog->aux->dst_trampoline && tr != prog->aux->dst_trampoline)
/* we allocated a new trampoline, so free the old one */
bpf_trampoline_put(prog->aux->dst_trampoline);
+ if (prog->type == BPF_PROG_TYPE_EXT)
+ tgt_prog->aux->is_extended = true;
prog->aux->dst_prog = NULL;
prog->aux->dst_trampoline = NULL;
In bpf_tracing_link_release():
1. update tr_link->tgt_prog->aux->is_extended = false.
2. bpf_trampoline_unlink_prog().
In bpf_tracing_prog_attach():
1. bpf_trampoline_link_prog().
2. update tgt_prog->aux->is_extended = true.
Then, it is able to avoid losing the is_extended mark.
Thanks,
Leon
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace
2024-09-25 5:32 ` Eduard Zingerman
@ 2024-09-26 7:19 ` Leon Hwang
2024-09-27 10:58 ` Eduard Zingerman
0 siblings, 1 reply; 11+ messages in thread
From: Leon Hwang @ 2024-09-26 7:19 UTC (permalink / raw)
To: Eduard Zingerman, bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, iii, kernel-patches-bot
On 25/9/24 13:32, Eduard Zingerman wrote:
> On Mon, 2024-09-23 at 21:40 +0800, Leon Hwang wrote:
>
> [...]
>
>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>> index 048aa2625cbef..b864b37e67c17 100644
>> --- a/include/linux/bpf.h
>> +++ b/include/linux/bpf.h
>> @@ -1484,6 +1484,7 @@ struct bpf_prog_aux {
>> bool exception_cb;
>> bool exception_boundary;
>> bool is_extended; /* true if extended by freplace program */
>> + atomic_t tail_callee_cnt;
>
> Nit: the name is a bit misleading, this counts how many times the
> program resides it prog maps. Confusing w/o additional comments.
> Maybe something like 'member_of_prog_array_cnt'?
>
'member_of_prog_array_cnt' is not accurate enough.
'prog_array_member_cnt' is better, and should alongside comment /*
counts how many times as member of prog_array */.
>> struct bpf_arena *arena;
>> /* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
>> const struct btf_type *attach_func_proto;
>> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
>> index 8d97bae98fa70..c12e0e3bf6ad0 100644
>> --- a/kernel/bpf/arraymap.c
>> +++ b/kernel/bpf/arraymap.c
>> @@ -961,13 +961,17 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
>> return ERR_PTR(-EINVAL);
>> }
>>
>> + atomic_inc(&prog->aux->tail_callee_cnt);
>> return prog;
>> }
>
> [...]
>
>> static u32 prog_fd_array_sys_lookup_elem(void *ptr)
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 18b3f9216b050..be829016d8182 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -3501,6 +3501,18 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
>> tgt_prog = prog->aux->dst_prog;
>> }
>>
>> + if (prog->type == BPF_PROG_TYPE_EXT &&
>> + atomic_read(&tgt_prog->aux->tail_callee_cnt)) {
>> + /* Program extensions can not extend target prog when the target
>> + * prog has been updated to any prog_array map as tail callee.
>> + * It's to prevent a potential infinite loop like:
>> + * tgt prog entry -> tgt prog subprog -> freplace prog entry
>> + * --tailcall-> tgt prog entry.
>> + */
>> + err = -EINVAL;
>> + goto out_unlock;
>> + }
>> +
>> err = bpf_link_prime(&link->link.link, &link_primer);
>> if (err)
>> goto out_unlock;
>
> Is it possible there is a race between map update and prog attach?
Yes, it is possible.
> E.g. suppose the following sequence of events:
> - thread #1 enters prog_fd_array_get_ptr()
> - thread #1 successfully completes prog->aux->is_extended check (not extended)
> - thread #2 enters bpf_tracing_prog_attach()
> - thread #2 does atomic_read() for tgt_prog and it returns 0
> - thread #2 proceeds attaching freplace to tgt_prog
> - thread #1 does atomic_inc(&prog->aux->tail_callee_cnt)
>
> Thus arriving to a state when tgt_prog is both a member of a map and
> is freplaced. Is this a valid scenario?
>
This patch series aims to prevent such case that tgt_prog is a member of
prog_array and is freplaced at the same time.
Without this patch series, a prog can be extended by freplace prog and then
be updated to prog_array, or can be updated to prog_array and then be
extended by freplace prog, in order to construct such case.
This patch aims to prevent "be updated to prog_array and then be extended
by freplace prog".
The previous patch aims to prevent "be extended by freplace prog and then
be updated to prog_array".
So, in order to avoid the above case:
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index a43e62e2a8bb..da4e26029a33 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -948,7 +948,9 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
if (IS_ERR(prog))
return prog;
- if (!bpf_prog_map_compatible(map, prog)) {
+ atomic_inc(&prog->aux->tail_callee_cnt);
+ if (!bpf_prog_map_compatible(map, prog) || prog->aux->is_extended) {
+ atomic_dec(&prog->aux->tail_callee_cnt);
bpf_prog_put(prog);
return ERR_PTR(-EINVAL);
}
1. Increment tail_callee_cnt.
2. Decrement tail_callee_cnt, if prog->aux->is_extended.
Then, thread #2 does atomic_read() for tgt_prog, and it won't return 0.
Thanks,
Leon
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace
2024-09-26 7:19 ` Leon Hwang
@ 2024-09-27 10:58 ` Eduard Zingerman
0 siblings, 0 replies; 11+ messages in thread
From: Eduard Zingerman @ 2024-09-27 10:58 UTC (permalink / raw)
To: Leon Hwang, bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, iii, kernel-patches-bot
On Thu, 2024-09-26 at 15:19 +0800, Leon Hwang wrote:
[...]
> > E.g. suppose the following sequence of events:
> > - thread #1 enters prog_fd_array_get_ptr()
> > - thread #1 successfully completes prog->aux->is_extended check (not extended)
> > - thread #2 enters bpf_tracing_prog_attach()
> > - thread #2 does atomic_read() for tgt_prog and it returns 0
> > - thread #2 proceeds attaching freplace to tgt_prog
> > - thread #1 does atomic_inc(&prog->aux->tail_callee_cnt)
> >
> > Thus arriving to a state when tgt_prog is both a member of a map and
> > is freplaced. Is this a valid scenario?
> >
>
> This patch series aims to prevent such case that tgt_prog is a member of
> prog_array and is freplaced at the same time.
>
> Without this patch series, a prog can be extended by freplace prog and then
> be updated to prog_array, or can be updated to prog_array and then be
> extended by freplace prog, in order to construct such case.
>
> This patch aims to prevent "be updated to prog_array and then be extended
> by freplace prog".
> The previous patch aims to prevent "be extended by freplace prog and then
> be updated to prog_array".
>
> So, in order to avoid the above case:
>
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index a43e62e2a8bb..da4e26029a33 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -948,7 +948,9 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
> if (IS_ERR(prog))
> return prog;
>
> - if (!bpf_prog_map_compatible(map, prog)) {
> + atomic_inc(&prog->aux->tail_callee_cnt);
> + if (!bpf_prog_map_compatible(map, prog) || prog->aux->is_extended) {
> + atomic_dec(&prog->aux->tail_callee_cnt);
> bpf_prog_put(prog);
> return ERR_PTR(-EINVAL);
> }
I'm not sure this really solves the issue.
Documentation for both 'atomic_inc' and 'atomic_read'
(used in bpf_tracing_prog_attach()) says that these are operations with
relaxed memory ordering. Meaning that e.g. 'atomic_inc' executed
inside prog_fd_array_get_ptr() is not necessarily immediately visible
for other thread executing 'atomic_read' in bpf_tracing_prog_attach().
I think that some memory barrier is needed (non-relaxed func variant).
But all this gets unnecessarily complicated, neither
prog_fd_array_get_ptr() nor bpf_tracing_prog_attach() are executed
often, I think that 'tail_callee_cnt' and 'is_extended' should be
protected by a mutex.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map
2024-09-26 7:16 ` Leon Hwang
@ 2024-09-27 12:23 ` Eduard Zingerman
0 siblings, 0 replies; 11+ messages in thread
From: Eduard Zingerman @ 2024-09-27 12:23 UTC (permalink / raw)
To: Leon Hwang, bpf
Cc: ast, daniel, andrii, toke, martin.lau, yonghong.song, puranjay,
xukuohai, iii, kernel-patches-bot
On Thu, 2024-09-26 at 15:16 +0800, Leon Hwang wrote:
[...]
> There's no protection against concurrent update.
>
> > Sequence of actions in bpf_tracing_prog_attach():
> > a. call bpf_trampoline_link_prog(&link->link, tr)
> > this returns success if `tr->extension_prog` is NULL,
> > meaning trampoline is "free";
> > b. update tgt_prog->aux->is_extended = true.
> >
> > Sequence of actions in bpf_tracing_link_release():
> > c. call bpf_trampoline_unlink_prog(&tr_link->link, tr_link->trampoline)
> > this sets `tr->extension_prog` to NULL;
> > d. update tr_link->tgt_prog->aux->is_extended = false.
> >
> > In a concurrent environment, is it possible to have actions ordered as:
> > - thread #1: does bpf_tracing_link_release(link pointing to tgt_prog)
> > - thread #2: does bpf_tracing_prog_attach(some_prog, tgt_prog)
> > - thread #1: (c) tr->extension_prog is set to NULL
> > - thread #2: (a) tr->extension_prog is set to some_prog
> > - thread #2: (b) tgt_prog->aux->is_extended = true
> > - thread #1: (d) tgt_prog->aux->is_extended = false
> >
> > Thus, loosing the is_extended mark?
>
> Yes, you are correct.
>
> >
> > (As far as I understand bpf_trampoline_compute_key() call in
> > bpf_tracing_prog_attach() it is possible for threads #1 and #2 to
> > operate on a same trampoline object).
> >
>
> In order to avoid the above case:
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 6988e432fc3d..1f19b754623c 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -3281,6 +3281,9 @@ static void bpf_tracing_link_release(struct
> bpf_link *link)
> struct bpf_tracing_link *tr_link =
> container_of(link, struct bpf_tracing_link, link.link);
>
> + if (link->prog->type == BPF_PROG_TYPE_EXT)
> + tr_link->tgt_prog->aux->is_extended = false;
> +
Isn't this too early to reset 'is_extended'?
E.g. consider scenario:
- thread #1 enters bpf_tracing_link_release() and sets is_extended == false
- thread #2 enters prog_fd_array_get_ptr(), is_extended is false,
and it proceeds putting tgt_prog to an array;
- execution of tgt_prog is triggered and freplace patch is still in effect,
because thread #1 had not finished bpf_tracing_link_release() yet.
Here same bug we are trying to protect against (circular tailcall)
is still potentially visible, isn't it?
> WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
> tr_link->trampoline));
>
> @@ -3518,6 +3521,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog
> *prog,
> if (prog->aux->dst_trampoline && tr != prog->aux->dst_trampoline)
> /* we allocated a new trampoline, so free the old one */
> bpf_trampoline_put(prog->aux->dst_trampoline);
> + if (prog->type == BPF_PROG_TYPE_EXT)
> + tgt_prog->aux->is_extended = true;
>
> prog->aux->dst_prog = NULL;
> prog->aux->dst_trampoline = NULL;
>
> In bpf_tracing_link_release():
> 1. update tr_link->tgt_prog->aux->is_extended = false.
> 2. bpf_trampoline_unlink_prog().
>
> In bpf_tracing_prog_attach():
> 1. bpf_trampoline_link_prog().
> 2. update tgt_prog->aux->is_extended = true.
>
> Then, it is able to avoid losing the is_extended mark.
>
> Thanks,
> Leon
>
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-09-27 12:23 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-23 13:40 [PATCH bpf-next v3 0/4] bpf: Fix tailcall infinite loop caused by freplace Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 1/4] bpf: Prevent updating extended prog to prog_array map Leon Hwang
2024-09-25 1:24 ` Eduard Zingerman
2024-09-26 7:16 ` Leon Hwang
2024-09-27 12:23 ` Eduard Zingerman
2024-09-23 13:40 ` [PATCH bpf-next v3 2/4] bpf: Prevent extending tail callee prog with freplace Leon Hwang
2024-09-25 5:32 ` Eduard Zingerman
2024-09-26 7:19 ` Leon Hwang
2024-09-27 10:58 ` Eduard Zingerman
2024-09-23 13:40 ` [PATCH bpf-next v3 3/4] selftests/bpf: Add a test case to confirm a tailcall infinite loop issue has been prevented Leon Hwang
2024-09-23 13:40 ` [PATCH bpf-next v3 4/4] selftests/bpf: Add cases to test tailcall in freplace Leon Hwang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox