From: Leon Hwang <leon.hwang@linux.dev>
To: bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
toke@redhat.com, martin.lau@kernel.org, yonghong.song@linux.dev,
puranjay@kernel.org, xukuohai@huaweicloud.com, eddyz87@gmail.com,
iii@linux.ibm.com, leon.hwang@linux.dev,
kernel-patches-bot@fb.com
Subject: [PATCH bpf-next v4 1/4] bpf: Prevent updating extended prog to prog_array map
Date: Sun, 29 Sep 2024 21:27:54 +0800 [thread overview]
Message-ID: <20240929132757.79826-2-leon.hwang@linux.dev> (raw)
In-Reply-To: <20240929132757.79826-1-leon.hwang@linux.dev>
This patch partially prevents a potential infinite loop issue caused by
combination of tailcal and freplace.
For example:
tc_bpf2bpf.c:
// SPDX-License-Identifier: GPL-2.0
\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
__noinline
int subprog_tc(struct __sk_buff *skb)
{
return skb->len * 2;
}
SEC("tc")
int entry_tc(struct __sk_buff *skb)
{
return subprog_tc(skb);
}
char __license[] SEC("license") = "GPL";
tailcall_freplace.c:
// SPDX-License-Identifier: GPL-2.0
\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");
int count = 0;
SEC("freplace")
int entry_freplace(struct __sk_buff *skb)
{
count++;
bpf_tail_call_static(skb, &jmp_table, 0);
return count;
}
char __license[] SEC("license") = "GPL";
The attach target of entry_freplace is subprog_tc, and the tail callee
in entry_freplace is entry_tc.
Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace
--tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count
from zero for every time of entry_freplace execution. Kernel will panic,
like:
[ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____)
(stack is (____ptrval____)..(____ptrval____))
[ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI
[ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE
6.10.0-rc6-g026dcdae8d3e-dirty #72
[ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX,
1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
[ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
[ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
0000000000008cb5
[ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
ffff9c98808b7e00
[ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
0000000000000000
[ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffb500c01af000
[ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
0000000000000000
[ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
knlGS:0000000000000000
[ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
00000000000006f0
[ 15.310490] Call Trace:
[ 15.310490] <#DF>
[ 15.310490] ? die+0x36/0x90
[ 15.310490] ? handle_stack_overflow+0x4d/0x60
[ 15.310490] ? exc_double_fault+0x117/0x1a0
[ 15.310490] ? asm_exc_double_fault+0x23/0x30
[ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] </#DF>
[ 15.310490] <TASK>
[ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64
[ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
[ 15.310490] ...
[ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64
[ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
[ 15.310490] bpf_test_run+0x210/0x370
[ 15.310490] ? bpf_test_run+0x128/0x370
[ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0
[ 15.310490] __sys_bpf+0xdbf/0x2c40
[ 15.310490] ? clockevents_program_event+0x52/0xf0
[ 15.310490] ? lock_release+0xbf/0x290
[ 15.310490] __x64_sys_bpf+0x1e/0x30
[ 15.310490] do_syscall_64+0x68/0x140
[ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 15.310490] RIP: 0033:0x7f133b52725d
[ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa
48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48
[ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX:
0000000000000141
[ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX:
00007f133b52725d
[ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI:
000000000000000a
[ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09:
00007ffddbc102a0
[ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12:
0000000000000004
[ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15:
00007f133b6ed000
[ 15.310490] </TASK>
[ 15.310490] Modules linked in: bpf_testmod(OE)
[ 15.310490] ---[ end trace 0000000000000000 ]---
[ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
[ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
[ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
0000000000008cb5
[ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
ffff9c98808b7e00
[ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
0000000000000000
[ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffb500c01af000
[ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
0000000000000000
[ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
knlGS:0000000000000000
[ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
00000000000006f0
[ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt
[ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
This patch partially prevents this panic by preventing updating extended
prog to prog_array map.
If a prog or its subprog has been extended by freplace prog, the prog
can not be updated to prog_array map.
Alongside next patch, the panic will be prevented completely.
BTW, fix a minor style issue by replacing 8-spaces with a tab.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/bpf.h | 2 ++
kernel/bpf/arraymap.c | 21 +++++++++++++++++----
kernel/bpf/core.c | 1 +
kernel/bpf/syscall.c | 42 ++++++++++++++++++++++++++++++++++++------
4 files changed, 56 insertions(+), 10 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 19d8ca8ac960f..aac6d2f42830c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1483,6 +1483,8 @@ struct bpf_prog_aux {
bool xdp_has_frags;
bool exception_cb;
bool exception_boundary;
+ bool is_extended; /* true if extended by freplace program */
+ struct mutex ext_mutex; /* mutex for is_extended */
struct bpf_arena *arena;
/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
const struct btf_type *attach_func_proto;
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 79660e3fca4c1..4a4de4f014be9 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -947,16 +947,29 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
struct file *map_file, int fd)
{
struct bpf_prog *prog = bpf_prog_get(fd);
+ bool is_extended;
if (IS_ERR(prog))
return prog;
- if (!bpf_prog_map_compatible(map, prog)) {
- bpf_prog_put(prog);
- return ERR_PTR(-EINVAL);
- }
+ if (!bpf_prog_map_compatible(map, prog))
+ goto out_put_prog;
+
+ mutex_lock(&prog->aux->ext_mutex);
+ is_extended = prog->aux->is_extended;
+ mutex_unlock(&prog->aux->ext_mutex);
+ if (is_extended)
+ /* Extended prog can not be tail callee. It's to prevent a
+ * potential infinite loop like:
+ * tail callee prog entry -> tail callee prog subprog ->
+ * freplace prog entry --tailcall-> tail callee prog entry.
+ */
+ goto out_put_prog;
return prog;
+out_put_prog:
+ bpf_prog_put(prog);
+ return ERR_PTR(-EINVAL);
}
static void prog_fd_array_put_ptr(struct bpf_map *map, void *ptr, bool need_defer)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 4e07cc057d6f2..ea7f59374b378 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -131,6 +131,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
INIT_LIST_HEAD_RCU(&fp->aux->ksym_prefix.lnode);
#endif
mutex_init(&fp->aux->used_maps_mutex);
+ mutex_init(&fp->aux->ext_mutex);
mutex_init(&fp->aux->dst_mutex);
return fp;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a8f1808a1ca54..db17c52fa35db 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3212,14 +3212,23 @@ static void bpf_tracing_link_release(struct bpf_link *link)
{
struct bpf_tracing_link *tr_link =
container_of(link, struct bpf_tracing_link, link.link);
-
- WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
- tr_link->trampoline));
+ struct bpf_prog *tgt_prog = tr_link->tgt_prog;
+
+ if (link->prog->type == BPF_PROG_TYPE_EXT) {
+ mutex_lock(&tgt_prog->aux->ext_mutex);
+ WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
+ tr_link->trampoline));
+ tgt_prog->aux->is_extended = false;
+ mutex_unlock(&tgt_prog->aux->ext_mutex);
+ } else {
+ WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
+ tr_link->trampoline));
+ }
bpf_trampoline_put(tr_link->trampoline);
/* tgt_prog is NULL if target is a kernel function */
- if (tr_link->tgt_prog)
+ if (tgt_prog)
bpf_prog_put(tr_link->tgt_prog);
}
@@ -3270,6 +3279,24 @@ static const struct bpf_link_ops bpf_tracing_link_lops = {
.fill_link_info = bpf_tracing_link_fill_link_info,
};
+static int bpf_extend_prog(struct bpf_tracing_link *link,
+ struct bpf_trampoline *tr,
+ struct bpf_prog *tgt_prog)
+{
+ struct bpf_prog_aux *aux = tgt_prog->aux;
+ int err = 0;
+
+ mutex_lock(&aux->ext_mutex);
+ err = bpf_trampoline_link_prog(&link->link, tr);
+ if (err)
+ goto out_unlock;
+
+ aux->is_extended = true;
+out_unlock:
+ mutex_unlock(&aux->ext_mutex);
+ return err;
+}
+
static int bpf_tracing_prog_attach(struct bpf_prog *prog,
int tgt_prog_fd,
u32 btf_id,
@@ -3354,7 +3381,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
* in prog->aux
*
* - if prog->aux->dst_trampoline is NULL, the program has already been
- * attached to a target and its initial target was cleared (below)
+ * attached to a target and its initial target was cleared (below)
*
* - if tgt_prog != NULL, the caller specified tgt_prog_fd +
* target_btf_id using the link_create API.
@@ -3429,7 +3456,10 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
if (err)
goto out_unlock;
- err = bpf_trampoline_link_prog(&link->link, tr);
+ if (prog->type == BPF_PROG_TYPE_EXT)
+ err = bpf_extend_prog(link, tr, tgt_prog);
+ else
+ err = bpf_trampoline_link_prog(&link->link, tr);
if (err) {
bpf_link_cleanup(&link_primer);
link = NULL;
--
2.44.0
next prev parent reply other threads:[~2024-09-29 13:28 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-29 13:27 [PATCH bpf-next v4 0/4] bpf: Fix tailcall infinite loop caused by freplace Leon Hwang
2024-09-29 13:27 ` Leon Hwang [this message]
2024-10-01 11:13 ` [PATCH bpf-next v4 1/4] bpf: Prevent updating extended prog to prog_array map Eduard Zingerman
2024-10-01 13:20 ` Leon Hwang
2024-10-01 16:54 ` Eduard Zingerman
2024-09-29 13:27 ` [PATCH bpf-next v4 2/4] bpf: Prevent extending tail callee prog with freplace prog Leon Hwang
2024-09-30 1:53 ` Leon Hwang
2024-10-04 19:33 ` Alexei Starovoitov
2024-10-04 20:37 ` Eduard Zingerman
2024-10-04 20:52 ` Eduard Zingerman
2024-10-04 23:30 ` Alexei Starovoitov
2024-09-29 13:27 ` [PATCH bpf-next v4 3/4] selftests/bpf: Add a test case to confirm a tailcall infinite loop issue has been prevented Leon Hwang
2024-09-29 13:27 ` [PATCH bpf-next v4 4/4] selftests/bpf: Add cases to test tailcall in freplace Leon Hwang
2024-10-01 11:14 ` [PATCH bpf-next v4 0/4] bpf: Fix tailcall infinite loop caused by freplace Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240929132757.79826-2-leon.hwang@linux.dev \
--to=leon.hwang@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=iii@linux.ibm.com \
--cc=kernel-patches-bot@fb.com \
--cc=martin.lau@kernel.org \
--cc=puranjay@kernel.org \
--cc=toke@redhat.com \
--cc=xukuohai@huaweicloud.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox