From: Leon Hwang <leon.hwang@linux.dev>
To: bpf@vger.kernel.org
Cc: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
toke@redhat.com, martin.lau@kernel.org, yonghong.song@linux.dev,
puranjay@kernel.org, xukuohai@huaweicloud.com, eddyz87@gmail.com,
iii@linux.ibm.com, leon.hwang@linux.dev,
kernel-patches-bot@fb.com
Subject: [PATCH bpf-next v4 1/4] bpf: Prevent updating extended prog to prog_array map
Date: Sun, 29 Sep 2024 21:27:54 +0800 [thread overview]
Message-ID: <20240929132757.79826-2-leon.hwang@linux.dev> (raw)
In-Reply-To: <20240929132757.79826-1-leon.hwang@linux.dev>
This patch partially prevents a potential infinite loop issue caused by
combination of tailcal and freplace.
For example:
tc_bpf2bpf.c:
// SPDX-License-Identifier: GPL-2.0
\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
__noinline
int subprog_tc(struct __sk_buff *skb)
{
return skb->len * 2;
}
SEC("tc")
int entry_tc(struct __sk_buff *skb)
{
return subprog_tc(skb);
}
char __license[] SEC("license") = "GPL";
tailcall_freplace.c:
// SPDX-License-Identifier: GPL-2.0
\#include <linux/bpf.h>
\#include <bpf/bpf_helpers.h>
struct {
__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
__uint(max_entries, 1);
__uint(key_size, sizeof(__u32));
__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");
int count = 0;
SEC("freplace")
int entry_freplace(struct __sk_buff *skb)
{
count++;
bpf_tail_call_static(skb, &jmp_table, 0);
return count;
}
char __license[] SEC("license") = "GPL";
The attach target of entry_freplace is subprog_tc, and the tail callee
in entry_freplace is entry_tc.
Then, the infinite loop will be entry_tc -> subprog_tc -> entry_freplace
--tailcall-> entry_tc, because tail_call_cnt in entry_freplace will count
from zero for every time of entry_freplace execution. Kernel will panic,
like:
[ 15.310490] BUG: TASK stack guard page was hit at (____ptrval____)
(stack is (____ptrval____)..(____ptrval____))
[ 15.310490] Oops: stack guard page: 0000 [#1] PREEMPT SMP NOPTI
[ 15.310490] CPU: 1 PID: 89 Comm: test_progs Tainted: G OE
6.10.0-rc6-g026dcdae8d3e-dirty #72
[ 15.310490] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX,
1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
[ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
[ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
0000000000008cb5
[ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
ffff9c98808b7e00
[ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
0000000000000000
[ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffb500c01af000
[ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
0000000000000000
[ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
knlGS:0000000000000000
[ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
00000000000006f0
[ 15.310490] Call Trace:
[ 15.310490] <#DF>
[ 15.310490] ? die+0x36/0x90
[ 15.310490] ? handle_stack_overflow+0x4d/0x60
[ 15.310490] ? exc_double_fault+0x117/0x1a0
[ 15.310490] ? asm_exc_double_fault+0x23/0x30
[ 15.310490] ? bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] </#DF>
[ 15.310490] <TASK>
[ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64
[ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
[ 15.310490] ...
[ 15.310490] bpf_prog_85781a698094722f_entry+0x4c/0x64
[ 15.310490] bpf_prog_1c515f389a9059b4_entry2+0x19/0x1b
[ 15.310490] bpf_test_run+0x210/0x370
[ 15.310490] ? bpf_test_run+0x128/0x370
[ 15.310490] bpf_prog_test_run_skb+0x388/0x7a0
[ 15.310490] __sys_bpf+0xdbf/0x2c40
[ 15.310490] ? clockevents_program_event+0x52/0xf0
[ 15.310490] ? lock_release+0xbf/0x290
[ 15.310490] __x64_sys_bpf+0x1e/0x30
[ 15.310490] do_syscall_64+0x68/0x140
[ 15.310490] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 15.310490] RIP: 0033:0x7f133b52725d
[ 15.310490] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa
48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f
05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8b bb 0d 00 f7 d8 64 89 01 48
[ 15.310490] RSP: 002b:00007ffddbc10258 EFLAGS: 00000206 ORIG_RAX:
0000000000000141
[ 15.310490] RAX: ffffffffffffffda RBX: 00007ffddbc10828 RCX:
00007f133b52725d
[ 15.310490] RDX: 0000000000000050 RSI: 00007ffddbc102a0 RDI:
000000000000000a
[ 15.310490] RBP: 00007ffddbc10270 R08: 0000000000000000 R09:
00007ffddbc102a0
[ 15.310490] R10: 0000000000000064 R11: 0000000000000206 R12:
0000000000000004
[ 15.310490] R13: 0000000000000000 R14: 0000558ec4c24890 R15:
00007f133b6ed000
[ 15.310490] </TASK>
[ 15.310490] Modules linked in: bpf_testmod(OE)
[ 15.310490] ---[ end trace 0000000000000000 ]---
[ 15.310490] RIP: 0010:bpf_prog_3a140cef239a4b4f_subprog_tail+0x14/0x53
[ 15.310490] Code: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00 00 0f 1f 00 55 48 89 e5 f3 0f 1e
fa <50> 50 53 41 55 48 89 fb 49 bd 00 2a 46 82 98 9c ff ff 48 89 df 4c
[ 15.310490] RSP: 0018:ffffb500c0aa0000 EFLAGS: 00000202
[ 15.310490] RAX: ffffb500c0aa0028 RBX: ffff9c98808b7e00 RCX:
0000000000008cb5
[ 15.310490] RDX: 0000000000000000 RSI: ffff9c9882462a00 RDI:
ffff9c98808b7e00
[ 15.310490] RBP: ffffb500c0aa0000 R08: 0000000000000000 R09:
0000000000000000
[ 15.310490] R10: 0000000000000001 R11: 0000000000000000 R12:
ffffb500c01af000
[ 15.310490] R13: ffffb500c01cd000 R14: 0000000000000000 R15:
0000000000000000
[ 15.310490] FS: 00007f133b665140(0000) GS:ffff9c98bbd00000(0000)
knlGS:0000000000000000
[ 15.310490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 15.310490] CR2: ffffb500c0a9fff8 CR3: 0000000102478000 CR4:
00000000000006f0
[ 15.310490] Kernel panic - not syncing: Fatal exception in interrupt
[ 15.310490] Kernel Offset: 0x30000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
This patch partially prevents this panic by preventing updating extended
prog to prog_array map.
If a prog or its subprog has been extended by freplace prog, the prog
can not be updated to prog_array map.
Alongside next patch, the panic will be prevented completely.
BTW, fix a minor style issue by replacing 8-spaces with a tab.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/bpf.h | 2 ++
kernel/bpf/arraymap.c | 21 +++++++++++++++++----
kernel/bpf/core.c | 1 +
kernel/bpf/syscall.c | 42 ++++++++++++++++++++++++++++++++++++------
4 files changed, 56 insertions(+), 10 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 19d8ca8ac960f..aac6d2f42830c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1483,6 +1483,8 @@ struct bpf_prog_aux {
bool xdp_has_frags;
bool exception_cb;
bool exception_boundary;
+ bool is_extended; /* true if extended by freplace program */
+ struct mutex ext_mutex; /* mutex for is_extended */
struct bpf_arena *arena;
/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
const struct btf_type *attach_func_proto;
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 79660e3fca4c1..4a4de4f014be9 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -947,16 +947,29 @@ static void *prog_fd_array_get_ptr(struct bpf_map *map,
struct file *map_file, int fd)
{
struct bpf_prog *prog = bpf_prog_get(fd);
+ bool is_extended;
if (IS_ERR(prog))
return prog;
- if (!bpf_prog_map_compatible(map, prog)) {
- bpf_prog_put(prog);
- return ERR_PTR(-EINVAL);
- }
+ if (!bpf_prog_map_compatible(map, prog))
+ goto out_put_prog;
+
+ mutex_lock(&prog->aux->ext_mutex);
+ is_extended = prog->aux->is_extended;
+ mutex_unlock(&prog->aux->ext_mutex);
+ if (is_extended)
+ /* Extended prog can not be tail callee. It's to prevent a
+ * potential infinite loop like:
+ * tail callee prog entry -> tail callee prog subprog ->
+ * freplace prog entry --tailcall-> tail callee prog entry.
+ */
+ goto out_put_prog;
return prog;
+out_put_prog:
+ bpf_prog_put(prog);
+ return ERR_PTR(-EINVAL);
}
static void prog_fd_array_put_ptr(struct bpf_map *map, void *ptr, bool need_defer)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 4e07cc057d6f2..ea7f59374b378 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -131,6 +131,7 @@ struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flag
INIT_LIST_HEAD_RCU(&fp->aux->ksym_prefix.lnode);
#endif
mutex_init(&fp->aux->used_maps_mutex);
+ mutex_init(&fp->aux->ext_mutex);
mutex_init(&fp->aux->dst_mutex);
return fp;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a8f1808a1ca54..db17c52fa35db 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3212,14 +3212,23 @@ static void bpf_tracing_link_release(struct bpf_link *link)
{
struct bpf_tracing_link *tr_link =
container_of(link, struct bpf_tracing_link, link.link);
-
- WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
- tr_link->trampoline));
+ struct bpf_prog *tgt_prog = tr_link->tgt_prog;
+
+ if (link->prog->type == BPF_PROG_TYPE_EXT) {
+ mutex_lock(&tgt_prog->aux->ext_mutex);
+ WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
+ tr_link->trampoline));
+ tgt_prog->aux->is_extended = false;
+ mutex_unlock(&tgt_prog->aux->ext_mutex);
+ } else {
+ WARN_ON_ONCE(bpf_trampoline_unlink_prog(&tr_link->link,
+ tr_link->trampoline));
+ }
bpf_trampoline_put(tr_link->trampoline);
/* tgt_prog is NULL if target is a kernel function */
- if (tr_link->tgt_prog)
+ if (tgt_prog)
bpf_prog_put(tr_link->tgt_prog);
}
@@ -3270,6 +3279,24 @@ static const struct bpf_link_ops bpf_tracing_link_lops = {
.fill_link_info = bpf_tracing_link_fill_link_info,
};
+static int bpf_extend_prog(struct bpf_tracing_link *link,
+ struct bpf_trampoline *tr,
+ struct bpf_prog *tgt_prog)
+{
+ struct bpf_prog_aux *aux = tgt_prog->aux;
+ int err = 0;
+
+ mutex_lock(&aux->ext_mutex);
+ err = bpf_trampoline_link_prog(&link->link, tr);
+ if (err)
+ goto out_unlock;
+
+ aux->is_extended = true;
+out_unlock:
+ mutex_unlock(&aux->ext_mutex);
+ return err;
+}
+
static int bpf_tracing_prog_attach(struct bpf_prog *prog,
int tgt_prog_fd,
u32 btf_id,
@@ -3354,7 +3381,7 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
* in prog->aux
*
* - if prog->aux->dst_trampoline is NULL, the program has already been
- * attached to a target and its initial target was cleared (below)
+ * attached to a target and its initial target was cleared (below)
*
* - if tgt_prog != NULL, the caller specified tgt_prog_fd +
* target_btf_id using the link_create API.
@@ -3429,7 +3456,10 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
if (err)
goto out_unlock;
- err = bpf_trampoline_link_prog(&link->link, tr);
+ if (prog->type == BPF_PROG_TYPE_EXT)
+ err = bpf_extend_prog(link, tr, tgt_prog);
+ else
+ err = bpf_trampoline_link_prog(&link->link, tr);
if (err) {
bpf_link_cleanup(&link_primer);
link = NULL;
--
2.44.0
next prev parent reply other threads:[~2024-09-29 13:28 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-09-29 13:27 [PATCH bpf-next v4 0/4] bpf: Fix tailcall infinite loop caused by freplace Leon Hwang
2024-09-29 13:27 ` Leon Hwang [this message]
2024-10-01 11:13 ` [PATCH bpf-next v4 1/4] bpf: Prevent updating extended prog to prog_array map Eduard Zingerman
2024-10-01 13:20 ` Leon Hwang
2024-10-01 16:54 ` Eduard Zingerman
2024-09-29 13:27 ` [PATCH bpf-next v4 2/4] bpf: Prevent extending tail callee prog with freplace prog Leon Hwang
2024-09-30 1:53 ` Leon Hwang
2024-10-04 19:33 ` Alexei Starovoitov
2024-10-04 20:37 ` Eduard Zingerman
2024-10-04 20:52 ` Eduard Zingerman
2024-10-04 23:30 ` Alexei Starovoitov
2024-09-29 13:27 ` [PATCH bpf-next v4 3/4] selftests/bpf: Add a test case to confirm a tailcall infinite loop issue has been prevented Leon Hwang
2024-09-29 13:27 ` [PATCH bpf-next v4 4/4] selftests/bpf: Add cases to test tailcall in freplace Leon Hwang
2024-10-01 11:14 ` [PATCH bpf-next v4 0/4] bpf: Fix tailcall infinite loop caused by freplace Eduard Zingerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240929132757.79826-2-leon.hwang@linux.dev \
--to=leon.hwang@linux.dev \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=eddyz87@gmail.com \
--cc=iii@linux.ibm.com \
--cc=kernel-patches-bot@fb.com \
--cc=martin.lau@kernel.org \
--cc=puranjay@kernel.org \
--cc=toke@redhat.com \
--cc=xukuohai@huaweicloud.com \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.