* [PATCHv4 bpf 0/2] bpf: Fix map poke update
@ 2023-12-06 8:30 Jiri Olsa
2023-12-06 8:30 ` [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run " Jiri Olsa
` (2 more replies)
0 siblings, 3 replies; 11+ messages in thread
From: Jiri Olsa @ 2023-12-06 8:30 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Cc: bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai, Will Deacon,
Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich, Lee Jones
hi,
this patchset fixes the issue reported in [0].
v4 changes:
- added missing bpf_arch_poke_desc_update prototype [lkp]
- added comments to the test
- moved the test under prog_tests/tailcalls.c
thanks,
jirka
[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
---
Jiri Olsa (2):
bpf: Fix prog_array_map_poke_run map poke update
selftests/bpf: Add test for early update in prog_array_map_poke_run
arch/x86/net/bpf_jit_comp.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
include/linux/bpf.h | 3 +++
kernel/bpf/arraymap.c | 58 ++++++++++------------------------------------------------
tools/testing/selftests/bpf/prog_tests/tailcalls.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
tools/testing/selftests/bpf/progs/tailcall_poke.c | 32 ++++++++++++++++++++++++++++++++
5 files changed, 175 insertions(+), 48 deletions(-)
create mode 100644 tools/testing/selftests/bpf/progs/tailcall_poke.c
* [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-06 8:30 [PATCHv4 bpf 0/2] bpf: Fix map poke update Jiri Olsa
@ 2023-12-06 8:30 ` Jiri Olsa
2023-12-21 9:07 ` Lee Jones
2023-12-06 8:30 ` [PATCHv4 bpf 2/2] selftests/bpf: Add test for early update in prog_array_map_poke_run Jiri Olsa
2023-12-06 21:50 ` [PATCHv4 bpf 0/2] bpf: Fix map poke update patchwork-bot+netdevbpf
2 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2023-12-06 8:30 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Cc: Lee Jones, Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810,
Yonghong Song, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
Lee pointed out an issue found by syzkaller [0]: a BUG_ON is hit
during the prog array map poke update in the prog_array_map_poke_run
function, due to an error value returned from the bpf_arch_text_poke
function.
There's a race window in which bpf_arch_text_poke can fail due to
missing bpf program kallsyms symbols; that failure is accounted for
by the check for -EINVAL in the BUG_ON call.
The problem is that in such a case we won't update the tail call
jump, which causes an imbalance for the next tail call update check,
which will then fail with -EBUSY in bpf_arch_text_poke.
I'm hitting the following race during the program load:
  CPU 0                             CPU 1

  bpf_prog_load
    bpf_check
      do_misc_fixups
        prog_array_map_poke_track

                                    map_update_elem
                                      bpf_fd_array_map_update_elem
                                        prog_array_map_poke_run

                                          bpf_arch_text_poke returns -EINVAL

    bpf_prog_kallsyms_add
After bpf_arch_text_poke (CPU 1) fails to update the tail call jump,
the next poke update fails on the expected-jump-instruction check in
bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in
prog_array_map_poke_run.
A similar race exists on the program unload path.
Fix this by moving the update into a new bpf_arch_poke_desc_update
function, which makes sure we call __bpf_arch_text_poke, skipping
the bpf address check.
Each architecture takes a slightly different approach to looking up
the bpf address in bpf_arch_text_poke, so instead of splitting the
function or adding a new 'checkip' argument as in the previous
version, it seems best to move the whole map_poke_run update into
arch-specific code.
[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
Cc: Lee Jones <lee@kernel.org>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
include/linux/bpf.h | 3 ++
kernel/bpf/arraymap.c | 58 +++++++------------------------------
3 files changed, 59 insertions(+), 48 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 8c10d9abc239..e89e415aa743 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -3025,3 +3025,49 @@ void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp
#endif
WARN(1, "verification of programs using bpf_throw should have failed\n");
}
+
+void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
+ struct bpf_prog *new, struct bpf_prog *old)
+{
+ u8 *old_addr, *new_addr, *old_bypass_addr;
+ int ret;
+
+ old_bypass_addr = old ? NULL : poke->bypass_addr;
+ old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
+ new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
+
+ /*
+ * On program loading or teardown, the program's kallsym entry
+ * might not be in place, so we use __bpf_arch_text_poke to skip
+ * the kallsyms check.
+ */
+ if (new) {
+ ret = __bpf_arch_text_poke(poke->tailcall_target,
+ BPF_MOD_JUMP,
+ old_addr, new_addr);
+ BUG_ON(ret < 0);
+ if (!old) {
+ ret = __bpf_arch_text_poke(poke->tailcall_bypass,
+ BPF_MOD_JUMP,
+ poke->bypass_addr,
+ NULL);
+ BUG_ON(ret < 0);
+ }
+ } else {
+ ret = __bpf_arch_text_poke(poke->tailcall_bypass,
+ BPF_MOD_JUMP,
+ old_bypass_addr,
+ poke->bypass_addr);
+ BUG_ON(ret < 0);
+ /* let other CPUs finish the execution of program
+ * so that it will not possible to expose them
+ * to invalid nop, stack unwind, nop state
+ */
+ if (!ret)
+ synchronize_rcu();
+ ret = __bpf_arch_text_poke(poke->tailcall_target,
+ BPF_MOD_JUMP,
+ old_addr, NULL);
+ BUG_ON(ret < 0);
+ }
+}
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6762dac3ef76..cff5bb08820e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -3175,6 +3175,9 @@ enum bpf_text_poke_type {
int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
void *addr1, void *addr2);
+void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
+ struct bpf_prog *new, struct bpf_prog *old);
+
void *bpf_arch_text_copy(void *dst, void *src, size_t len);
int bpf_arch_text_invalidate(void *dst, size_t len);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 2058e89b5ddd..c85ff9162a5c 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -1012,11 +1012,16 @@ static void prog_array_map_poke_untrack(struct bpf_map *map,
mutex_unlock(&aux->poke_mutex);
}
+void __weak bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
+ struct bpf_prog *new, struct bpf_prog *old)
+{
+ WARN_ON_ONCE(1);
+}
+
static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
struct bpf_prog *old,
struct bpf_prog *new)
{
- u8 *old_addr, *new_addr, *old_bypass_addr;
struct prog_poke_elem *elem;
struct bpf_array_aux *aux;
@@ -1025,7 +1030,7 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
list_for_each_entry(elem, &aux->poke_progs, list) {
struct bpf_jit_poke_descriptor *poke;
- int i, ret;
+ int i;
for (i = 0; i < elem->aux->size_poke_tab; i++) {
poke = &elem->aux->poke_tab[i];
@@ -1044,21 +1049,10 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
* activated, so tail call updates can arrive from here
* while JIT is still finishing its final fixup for
* non-activated poke entries.
- * 3) On program teardown, the program's kallsym entry gets
- * removed out of RCU callback, but we can only untrack
- * from sleepable context, therefore bpf_arch_text_poke()
- * might not see that this is in BPF text section and
- * bails out with -EINVAL. As these are unreachable since
- * RCU grace period already passed, we simply skip them.
- * 4) Also programs reaching refcount of zero while patching
+ * 3) Also programs reaching refcount of zero while patching
* is in progress is okay since we're protected under
* poke_mutex and untrack the programs before the JIT
- * buffer is freed. When we're still in the middle of
- * patching and suddenly kallsyms entry of the program
- * gets evicted, we just skip the rest which is fine due
- * to point 3).
- * 5) Any other error happening below from bpf_arch_text_poke()
- * is a unexpected bug.
+ * buffer is freed.
*/
if (!READ_ONCE(poke->tailcall_target_stable))
continue;
@@ -1068,39 +1062,7 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
poke->tail_call.key != key)
continue;
- old_bypass_addr = old ? NULL : poke->bypass_addr;
- old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
- new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
-
- if (new) {
- ret = bpf_arch_text_poke(poke->tailcall_target,
- BPF_MOD_JUMP,
- old_addr, new_addr);
- BUG_ON(ret < 0 && ret != -EINVAL);
- if (!old) {
- ret = bpf_arch_text_poke(poke->tailcall_bypass,
- BPF_MOD_JUMP,
- poke->bypass_addr,
- NULL);
- BUG_ON(ret < 0 && ret != -EINVAL);
- }
- } else {
- ret = bpf_arch_text_poke(poke->tailcall_bypass,
- BPF_MOD_JUMP,
- old_bypass_addr,
- poke->bypass_addr);
- BUG_ON(ret < 0 && ret != -EINVAL);
- /* let other CPUs finish the execution of program
- * so that it will not possible to expose them
- * to invalid nop, stack unwind, nop state
- */
- if (!ret)
- synchronize_rcu();
- ret = bpf_arch_text_poke(poke->tailcall_target,
- BPF_MOD_JUMP,
- old_addr, NULL);
- BUG_ON(ret < 0 && ret != -EINVAL);
- }
+ bpf_arch_poke_desc_update(poke, new, old);
}
}
}
--
2.43.0
* [PATCHv4 bpf 2/2] selftests/bpf: Add test for early update in prog_array_map_poke_run
2023-12-06 8:30 [PATCHv4 bpf 0/2] bpf: Fix map poke update Jiri Olsa
2023-12-06 8:30 ` [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run " Jiri Olsa
@ 2023-12-06 8:30 ` Jiri Olsa
2023-12-06 21:50 ` [PATCHv4 bpf 0/2] bpf: Fix map poke update patchwork-bot+netdevbpf
2 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2023-12-06 8:30 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
Cc: Ilya Leoshkevich, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Lee Jones
Add a test that tries to trigger the BUG_ON during an early map
update in the prog_array_map_poke_run function.
The idea is to share a prog array map between a thread that
constantly updates it and another thread loading a program that uses
that prog array.
Eventually we hit a point where the program is okay to be updated
(the poke->tailcall_target_stable check passes) but its address is
not yet registered in kallsyms, so bpf_arch_text_poke returns
-EINVAL and causes an imbalance for the next tail call update check,
which will then fail with -EBUSY in bpf_arch_text_poke, as described
in the previous fix.
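The stress shape of the test (one thread hammering updates while the main
thread repeatedly loads) can be sketched as a plain userspace analogue.
run_stress() below is hypothetical: it only mirrors the structure of
poke_update() plus the load loop, not the bpf machinery itself.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int stop_flag;
static atomic_int slot = 1;

/* Updater thread: analogous to poke_update() flipping the prog
 * array entry between prog1_fd and prog2_fd in a tight loop. */
static void *updater(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop_flag)) {
		atomic_store(&slot, 1);
		atomic_store(&slot, 2);
	}
	return NULL;
}

/* Main-thread loop: analogous to the repeated skeleton loads.
 * Returns the number of inconsistent states observed; it should be
 * zero here, since every published value is valid. */
static int run_stress(int iters)
{
	pthread_t thread;
	int bad = 0;

	if (pthread_create(&thread, NULL, updater, NULL))
		return -1;

	for (int i = 0; i < iters; i++) {
		int v = atomic_load(&slot);

		if (v != 1 && v != 2)
			bad++;
	}

	atomic_store(&stop_flag, 1);
	pthread_join(thread, NULL);
	return bad;
}
```

In the real test the "slot" is the pinned jmp_table entry and the "load" step
is tailcall_poke__load(); the kernel-side race only shows up when an update
lands in the window before bpf_prog_kallsyms_add.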
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
.../selftests/bpf/prog_tests/tailcalls.c | 84 +++++++++++++++++++
.../selftests/bpf/progs/tailcall_poke.c | 32 +++++++
2 files changed, 116 insertions(+)
create mode 100644 tools/testing/selftests/bpf/progs/tailcall_poke.c
diff --git a/tools/testing/selftests/bpf/prog_tests/tailcalls.c b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
index fc6b2954e8f5..59993fc9c0d7 100644
--- a/tools/testing/selftests/bpf/prog_tests/tailcalls.c
+++ b/tools/testing/selftests/bpf/prog_tests/tailcalls.c
@@ -1,6 +1,9 @@
// SPDX-License-Identifier: GPL-2.0
+#include <unistd.h>
#include <test_progs.h>
#include <network_helpers.h>
+#include "tailcall_poke.skel.h"
+
/* test_tailcall_1 checks basic functionality by patching multiple locations
* in a single program for a single tail call slot with nop->jmp, jmp->nop
@@ -1105,6 +1108,85 @@ static void test_tailcall_bpf2bpf_fentry_entry(void)
bpf_object__close(tgt_obj);
}
+#define JMP_TABLE "/sys/fs/bpf/jmp_table"
+
+static int poke_thread_exit;
+
+static void *poke_update(void *arg)
+{
+ __u32 zero = 0, prog1_fd, prog2_fd, map_fd;
+ struct tailcall_poke *call = arg;
+
+ map_fd = bpf_map__fd(call->maps.jmp_table);
+ prog1_fd = bpf_program__fd(call->progs.call1);
+ prog2_fd = bpf_program__fd(call->progs.call2);
+
+ while (!poke_thread_exit) {
+ bpf_map_update_elem(map_fd, &zero, &prog1_fd, BPF_ANY);
+ bpf_map_update_elem(map_fd, &zero, &prog2_fd, BPF_ANY);
+ }
+
+ return NULL;
+}
+
+/*
+ * We are trying to hit prog array update during another program load
+ * that shares the same prog array map.
+ *
+ * For that we share the jmp_table map between two skeleton instances
+ * by pinning the jmp_table to same path. Then first skeleton instance
+ * periodically updates jmp_table in 'poke update' thread while we load
+ * the second skeleton instance in the main thread.
+ */
+static void test_tailcall_poke(void)
+{
+ struct tailcall_poke *call, *test;
+ int err, cnt = 10;
+ pthread_t thread;
+
+ unlink(JMP_TABLE);
+
+ call = tailcall_poke__open_and_load();
+ if (!ASSERT_OK_PTR(call, "tailcall_poke__open"))
+ return;
+
+ err = bpf_map__pin(call->maps.jmp_table, JMP_TABLE);
+ if (!ASSERT_OK(err, "bpf_map__pin"))
+ goto out;
+
+ err = pthread_create(&thread, NULL, poke_update, call);
+ if (!ASSERT_OK(err, "new toggler"))
+ goto out;
+
+ while (cnt--) {
+ test = tailcall_poke__open();
+ if (!ASSERT_OK_PTR(test, "tailcall_poke__open"))
+ break;
+
+ err = bpf_map__set_pin_path(test->maps.jmp_table, JMP_TABLE);
+ if (!ASSERT_OK(err, "bpf_map__pin")) {
+ tailcall_poke__destroy(test);
+ break;
+ }
+
+ bpf_program__set_autoload(test->progs.test, true);
+ bpf_program__set_autoload(test->progs.call1, false);
+ bpf_program__set_autoload(test->progs.call2, false);
+
+ err = tailcall_poke__load(test);
+ tailcall_poke__destroy(test);
+ if (!ASSERT_OK(err, "tailcall_poke__load"))
+ break;
+ }
+
+ poke_thread_exit = 1;
+ ASSERT_OK(pthread_join(thread, NULL), "pthread_join");
+
+out:
+ bpf_map__unpin(call->maps.jmp_table, JMP_TABLE);
+ tailcall_poke__destroy(call);
+}
+
void test_tailcalls(void)
{
if (test__start_subtest("tailcall_1"))
@@ -1139,4 +1221,6 @@ void test_tailcalls(void)
test_tailcall_bpf2bpf_fentry_fexit();
if (test__start_subtest("tailcall_bpf2bpf_fentry_entry"))
test_tailcall_bpf2bpf_fentry_entry();
+ if (test__start_subtest("tailcall_poke"))
+ test_tailcall_poke();
}
diff --git a/tools/testing/selftests/bpf/progs/tailcall_poke.c b/tools/testing/selftests/bpf/progs/tailcall_poke.c
new file mode 100644
index 000000000000..c78b94b75e83
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tailcall_poke.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+ __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
+ __uint(max_entries, 1);
+ __uint(key_size, sizeof(__u32));
+ __uint(value_size, sizeof(__u32));
+} jmp_table SEC(".maps");
+
+SEC("?fentry/bpf_fentry_test1")
+int BPF_PROG(test, int a)
+{
+ bpf_tail_call_static(ctx, &jmp_table, 0);
+ return 0;
+}
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(call1, int a)
+{
+ return 0;
+}
+
+SEC("fentry/bpf_fentry_test1")
+int BPF_PROG(call2, int a)
+{
+ return 0;
+}
--
2.43.0
* Re: [PATCHv4 bpf 0/2] bpf: Fix map poke update
2023-12-06 8:30 [PATCHv4 bpf 0/2] bpf: Fix map poke update Jiri Olsa
2023-12-06 8:30 ` [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run " Jiri Olsa
2023-12-06 8:30 ` [PATCHv4 bpf 2/2] selftests/bpf: Add test for early update in prog_array_map_poke_run Jiri Olsa
@ 2023-12-06 21:50 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-12-06 21:50 UTC (permalink / raw)
To: Jiri Olsa
Cc: ast, daniel, andrii, bpf, kafai, songliubraving, yhs,
john.fastabend, kpsingh, sdf, haoluo, xukuohai, will, nathan,
pulehui, bjorn, iii, lee
Hello:
This series was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:
On Wed, 6 Dec 2023 09:30:39 +0100 you wrote:
> hi,
> this patchset fixes the issue reported in [0].
>
> v4 changes:
> - added missing bpf_arch_poke_desc_update prototype [lkp]
> - added comments to the test
> - moved the test under prog_tests/tailcalls.c
>
> [...]
Here is the summary with links:
- [PATCHv4,bpf,1/2] bpf: Fix prog_array_map_poke_run map poke update
https://git.kernel.org/bpf/bpf/c/4b7de801606e
- [PATCHv4,bpf,2/2] selftests/bpf: Add test for early update in prog_array_map_poke_run
https://git.kernel.org/bpf/bpf/c/ffed24eff9e0
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* Re: [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-06 8:30 ` [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run " Jiri Olsa
@ 2023-12-21 9:07 ` Lee Jones
2023-12-21 9:34 ` Greg KH
0 siblings, 1 reply; 11+ messages in thread
From: Lee Jones @ 2023-12-21 9:07 UTC (permalink / raw)
To: Jiri Olsa, stable
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810, Yonghong Song,
bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai, Will Deacon,
Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
Dear Stable,
> Lee pointed out issue found by syscaller [0] hitting BUG in prog array
> map poke update in prog_array_map_poke_run function due to error value
> returned from bpf_arch_text_poke function.
>
> There's race window where bpf_arch_text_poke can fail due to missing
> bpf program kallsym symbols, which is accounted for with check for
> -EINVAL in that BUG_ON call.
>
> The problem is that in such case we won't update the tail call jump
> and cause imbalance for the next tail call update check which will
> fail with -EBUSY in bpf_arch_text_poke.
>
> I'm hitting following race during the program load:
>
> CPU 0 CPU 1
>
> bpf_prog_load
> bpf_check
> do_misc_fixups
> prog_array_map_poke_track
>
> map_update_elem
> bpf_fd_array_map_update_elem
> prog_array_map_poke_run
>
> bpf_arch_text_poke returns -EINVAL
>
> bpf_prog_kallsyms_add
>
> After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
> poke update fails on expected jump instruction check in bpf_arch_text_poke
> with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
>
> Similar race exists on the program unload.
>
> Fixing this by moving the update to bpf_arch_poke_desc_update function which
> makes sure we call __bpf_arch_text_poke that skips the bpf address check.
>
> Each architecture has slightly different approach wrt looking up bpf address
> in bpf_arch_text_poke, so instead of splitting the function or adding new
> 'checkip' argument in previous version, it seems best to move the whole
> map_poke_run update as arch specific code.
>
> [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
>
> Cc: Lee Jones <lee@kernel.org>
> Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
> Acked-by: Yonghong Song <yonghong.song@linux.dev>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> include/linux/bpf.h | 3 ++
> kernel/bpf/arraymap.c | 58 +++++++------------------------------
> 3 files changed, 59 insertions(+), 48 deletions(-)
Please could we have this backported?
Guided by the Fixes: tag.
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 8c10d9abc239..e89e415aa743 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -3025,3 +3025,49 @@ void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp
> #endif
> WARN(1, "verification of programs using bpf_throw should have failed\n");
> }
> +
> +void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
> + struct bpf_prog *new, struct bpf_prog *old)
> +{
> + u8 *old_addr, *new_addr, *old_bypass_addr;
> + int ret;
> +
> + old_bypass_addr = old ? NULL : poke->bypass_addr;
> + old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
> + new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
> +
> + /*
> + * On program loading or teardown, the program's kallsym entry
> + * might not be in place, so we use __bpf_arch_text_poke to skip
> + * the kallsyms check.
> + */
> + if (new) {
> + ret = __bpf_arch_text_poke(poke->tailcall_target,
> + BPF_MOD_JUMP,
> + old_addr, new_addr);
> + BUG_ON(ret < 0);
> + if (!old) {
> + ret = __bpf_arch_text_poke(poke->tailcall_bypass,
> + BPF_MOD_JUMP,
> + poke->bypass_addr,
> + NULL);
> + BUG_ON(ret < 0);
> + }
> + } else {
> + ret = __bpf_arch_text_poke(poke->tailcall_bypass,
> + BPF_MOD_JUMP,
> + old_bypass_addr,
> + poke->bypass_addr);
> + BUG_ON(ret < 0);
> + /* let other CPUs finish the execution of program
> + * so that it will not possible to expose them
> + * to invalid nop, stack unwind, nop state
> + */
> + if (!ret)
> + synchronize_rcu();
> + ret = __bpf_arch_text_poke(poke->tailcall_target,
> + BPF_MOD_JUMP,
> + old_addr, NULL);
> + BUG_ON(ret < 0);
> + }
> +}
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 6762dac3ef76..cff5bb08820e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -3175,6 +3175,9 @@ enum bpf_text_poke_type {
> int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
> void *addr1, void *addr2);
>
> +void bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
> + struct bpf_prog *new, struct bpf_prog *old);
> +
> void *bpf_arch_text_copy(void *dst, void *src, size_t len);
> int bpf_arch_text_invalidate(void *dst, size_t len);
>
> diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
> index 2058e89b5ddd..c85ff9162a5c 100644
> --- a/kernel/bpf/arraymap.c
> +++ b/kernel/bpf/arraymap.c
> @@ -1012,11 +1012,16 @@ static void prog_array_map_poke_untrack(struct bpf_map *map,
> mutex_unlock(&aux->poke_mutex);
> }
>
> +void __weak bpf_arch_poke_desc_update(struct bpf_jit_poke_descriptor *poke,
> + struct bpf_prog *new, struct bpf_prog *old)
> +{
> + WARN_ON_ONCE(1);
> +}
> +
> static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
> struct bpf_prog *old,
> struct bpf_prog *new)
> {
> - u8 *old_addr, *new_addr, *old_bypass_addr;
> struct prog_poke_elem *elem;
> struct bpf_array_aux *aux;
>
> @@ -1025,7 +1030,7 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
>
> list_for_each_entry(elem, &aux->poke_progs, list) {
> struct bpf_jit_poke_descriptor *poke;
> - int i, ret;
> + int i;
>
> for (i = 0; i < elem->aux->size_poke_tab; i++) {
> poke = &elem->aux->poke_tab[i];
> @@ -1044,21 +1049,10 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
> * activated, so tail call updates can arrive from here
> * while JIT is still finishing its final fixup for
> * non-activated poke entries.
> - * 3) On program teardown, the program's kallsym entry gets
> - * removed out of RCU callback, but we can only untrack
> - * from sleepable context, therefore bpf_arch_text_poke()
> - * might not see that this is in BPF text section and
> - * bails out with -EINVAL. As these are unreachable since
> - * RCU grace period already passed, we simply skip them.
> - * 4) Also programs reaching refcount of zero while patching
> + * 3) Also programs reaching refcount of zero while patching
> * is in progress is okay since we're protected under
> * poke_mutex and untrack the programs before the JIT
> - * buffer is freed. When we're still in the middle of
> - * patching and suddenly kallsyms entry of the program
> - * gets evicted, we just skip the rest which is fine due
> - * to point 3).
> - * 5) Any other error happening below from bpf_arch_text_poke()
> - * is a unexpected bug.
> + * buffer is freed.
> */
> if (!READ_ONCE(poke->tailcall_target_stable))
> continue;
> @@ -1068,39 +1062,7 @@ static void prog_array_map_poke_run(struct bpf_map *map, u32 key,
> poke->tail_call.key != key)
> continue;
>
> - old_bypass_addr = old ? NULL : poke->bypass_addr;
> - old_addr = old ? (u8 *)old->bpf_func + poke->adj_off : NULL;
> - new_addr = new ? (u8 *)new->bpf_func + poke->adj_off : NULL;
> -
> - if (new) {
> - ret = bpf_arch_text_poke(poke->tailcall_target,
> - BPF_MOD_JUMP,
> - old_addr, new_addr);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - if (!old) {
> - ret = bpf_arch_text_poke(poke->tailcall_bypass,
> - BPF_MOD_JUMP,
> - poke->bypass_addr,
> - NULL);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - }
> - } else {
> - ret = bpf_arch_text_poke(poke->tailcall_bypass,
> - BPF_MOD_JUMP,
> - old_bypass_addr,
> - poke->bypass_addr);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - /* let other CPUs finish the execution of program
> - * so that it will not possible to expose them
> - * to invalid nop, stack unwind, nop state
> - */
> - if (!ret)
> - synchronize_rcu();
> - ret = bpf_arch_text_poke(poke->tailcall_target,
> - BPF_MOD_JUMP,
> - old_addr, NULL);
> - BUG_ON(ret < 0 && ret != -EINVAL);
> - }
> + bpf_arch_poke_desc_update(poke, new, old);
> }
> }
> }
> --
> 2.43.0
>
--
Lee Jones [李琼斯]
* Re: [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-21 9:07 ` Lee Jones
@ 2023-12-21 9:34 ` Greg KH
2023-12-21 9:55 ` Lee Jones
0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2023-12-21 9:34 UTC (permalink / raw)
To: Lee Jones
Cc: Jiri Olsa, stable, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810,
Yonghong Song, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
On Thu, Dec 21, 2023 at 09:07:45AM +0000, Lee Jones wrote:
> Dear Stable,
>
> > Lee pointed out issue found by syscaller [0] hitting BUG in prog array
> > map poke update in prog_array_map_poke_run function due to error value
> > returned from bpf_arch_text_poke function.
> >
> > There's race window where bpf_arch_text_poke can fail due to missing
> > bpf program kallsym symbols, which is accounted for with check for
> > -EINVAL in that BUG_ON call.
> >
> > The problem is that in such case we won't update the tail call jump
> > and cause imbalance for the next tail call update check which will
> > fail with -EBUSY in bpf_arch_text_poke.
> >
> > I'm hitting following race during the program load:
> >
> > CPU 0 CPU 1
> >
> > bpf_prog_load
> > bpf_check
> > do_misc_fixups
> > prog_array_map_poke_track
> >
> > map_update_elem
> > bpf_fd_array_map_update_elem
> > prog_array_map_poke_run
> >
> > bpf_arch_text_poke returns -EINVAL
> >
> > bpf_prog_kallsyms_add
> >
> > After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
> > poke update fails on expected jump instruction check in bpf_arch_text_poke
> > with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
> >
> > Similar race exists on the program unload.
> >
> > Fixing this by moving the update to bpf_arch_poke_desc_update function which
> > makes sure we call __bpf_arch_text_poke that skips the bpf address check.
> >
> > Each architecture has slightly different approach wrt looking up bpf address
> > in bpf_arch_text_poke, so instead of splitting the function or adding new
> > 'checkip' argument in previous version, it seems best to move the whole
> > map_poke_run update as arch specific code.
> >
> > [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
> >
> > Cc: Lee Jones <lee@kernel.org>
> > Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
> > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> > arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> > include/linux/bpf.h | 3 ++
> > kernel/bpf/arraymap.c | 58 +++++++------------------------------
> > 3 files changed, 59 insertions(+), 48 deletions(-)
>
> Please could we have this backported?
>
> Guided by the Fixes: tag.
<formletter>
This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.
</formletter>
* Re: [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-21 9:34 ` Greg KH
@ 2023-12-21 9:55 ` Lee Jones
2023-12-21 10:02 ` Greg KH
0 siblings, 1 reply; 11+ messages in thread
From: Lee Jones @ 2023-12-21 9:55 UTC (permalink / raw)
To: Greg KH
Cc: Jiri Olsa, stable, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810,
Yonghong Song, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
On Thu, 21 Dec 2023, Greg KH wrote:
> On Thu, Dec 21, 2023 at 09:07:45AM +0000, Lee Jones wrote:
> > Dear Stable,
> >
> > > Lee pointed out issue found by syscaller [0] hitting BUG in prog array
> > > map poke update in prog_array_map_poke_run function due to error value
> > > returned from bpf_arch_text_poke function.
> > >
> > > There's race window where bpf_arch_text_poke can fail due to missing
> > > bpf program kallsym symbols, which is accounted for with check for
> > > -EINVAL in that BUG_ON call.
> > >
> > > The problem is that in such case we won't update the tail call jump
> > > and cause imbalance for the next tail call update check which will
> > > fail with -EBUSY in bpf_arch_text_poke.
> > >
> > > I'm hitting following race during the program load:
> > >
> > > CPU 0 CPU 1
> > >
> > > bpf_prog_load
> > > bpf_check
> > > do_misc_fixups
> > > prog_array_map_poke_track
> > >
> > > map_update_elem
> > > bpf_fd_array_map_update_elem
> > > prog_array_map_poke_run
> > >
> > > bpf_arch_text_poke returns -EINVAL
> > >
> > > bpf_prog_kallsyms_add
> > >
> > > After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
> > > poke update fails on expected jump instruction check in bpf_arch_text_poke
> > > with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
> > >
> > > Similar race exists on the program unload.
> > >
> > > Fixing this by moving the update to bpf_arch_poke_desc_update function which
> > > makes sure we call __bpf_arch_text_poke that skips the bpf address check.
> > >
> > > Each architecture has slightly different approach wrt looking up bpf address
> > > in bpf_arch_text_poke, so instead of splitting the function or adding new
> > > 'checkip' argument in previous version, it seems best to move the whole
> > > map_poke_run update as arch specific code.
> > >
> > > [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
> > >
> > > Cc: Lee Jones <lee@kernel.org>
> > > Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > > Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
> > > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > > arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> > > include/linux/bpf.h | 3 ++
> > > kernel/bpf/arraymap.c | 58 +++++++------------------------------
> > > 3 files changed, 59 insertions(+), 48 deletions(-)
> >
> > Please could we have this backported?
> >
> > Guided by the Fixes: tag.
>
> <formletter>
>
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree. Please read:
> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
>
> </formletter>
Apologies.
Commit ID: 4b7de801606e504e69689df71475d27e35336fb3
Subject: bpf: Fix prog_array_map_poke_run map poke update
Reason: Fixes a race condition in BPF.
Versions: linux-5.10.y+, as specified by the Fixes: tag above
Thanks.
--
Lee Jones [李琼斯]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-21 9:55 ` Lee Jones
@ 2023-12-21 10:02 ` Greg KH
2023-12-21 10:17 ` Lee Jones
0 siblings, 1 reply; 11+ messages in thread
From: Greg KH @ 2023-12-21 10:02 UTC (permalink / raw)
To: Lee Jones
Cc: Jiri Olsa, stable, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810,
Yonghong Song, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
On Thu, Dec 21, 2023 at 09:55:22AM +0000, Lee Jones wrote:
> On Thu, 21 Dec 2023, Greg KH wrote:
>
> > On Thu, Dec 21, 2023 at 09:07:45AM +0000, Lee Jones wrote:
> > > Dear Stable,
> > >
> > > > Lee pointed out an issue found by syzkaller [0]: hitting the BUG in
> > > > the prog array map poke update (prog_array_map_poke_run) due to an
> > > > error value returned from bpf_arch_text_poke.
> > > >
> > > > There's a race window where bpf_arch_text_poke can fail due to
> > > > missing bpf program kallsym symbols, which is accounted for by the
> > > > check for -EINVAL in that BUG_ON call.
> > > >
> > > > The problem is that in such a case we won't update the tail call
> > > > jump, causing an imbalance: the next tail call update check will
> > > > then fail with -EBUSY in bpf_arch_text_poke.
> > > >
> > > > I'm hitting the following race during program load:
> > > >
> > > > CPU 0 CPU 1
> > > >
> > > > bpf_prog_load
> > > > bpf_check
> > > > do_misc_fixups
> > > > prog_array_map_poke_track
> > > >
> > > > map_update_elem
> > > > bpf_fd_array_map_update_elem
> > > > prog_array_map_poke_run
> > > >
> > > > bpf_arch_text_poke returns -EINVAL
> > > >
> > > > bpf_prog_kallsyms_add
> > > >
> > > > After bpf_arch_text_poke (CPU 1) fails to update the tail call jump,
> > > > the next poke update fails the expected jump instruction check in
> > > > bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in
> > > > prog_array_map_poke_run.
> > > >
> > > > A similar race exists on program unload.
> > > >
> > > > Fix this by moving the update into a new bpf_arch_poke_desc_update
> > > > function, which makes sure we call __bpf_arch_text_poke, skipping
> > > > the bpf address check.
> > > >
> > > > Each architecture takes a slightly different approach to looking up
> > > > the bpf address in bpf_arch_text_poke, so instead of splitting the
> > > > function or adding a new 'checkip' argument as in the previous
> > > > version, it seems best to move the whole map_poke_run update into
> > > > arch-specific code.
> > > >
> > > > [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
> > > >
> > > > Cc: Lee Jones <lee@kernel.org>
> > > > Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > > > Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
> > > > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > > arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> > > > include/linux/bpf.h | 3 ++
> > > > kernel/bpf/arraymap.c | 58 +++++++------------------------------
> > > > 3 files changed, 59 insertions(+), 48 deletions(-)
> > >
> > > Please could we have this backported?
> > >
> > > Guided by the Fixes: tag.
> >
> > <formletter>
> >
> > This is not the correct way to submit patches for inclusion in the
> > stable kernel tree. Please read:
> > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > for how to do this properly.
> >
> > </formletter>
>
> Apologies.
>
> Commit ID: 4b7de801606e504e69689df71475d27e35336fb3
> Subject: bpf: Fix prog_array_map_poke_run map poke update
> Reason: Fixes a race condition in BPF.
> Versions: linux-5.10.y+, as specified by the Fixes: tag above
Did not apply to 5.10.y or 5.15.y, so if you need/want it there, we will
need a working backport that has been tested. Other trees it's now
queued up for.
BPF developers, please remember, just adding a "Fixes:" tag does NOT
guarantee that any patch will be backported to any stable kernel, you
MUST add a "cc: stable@..." tag to the patch if you wish to have it
automatically backported.
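For reference, the tagging Greg describes looks like this in a commit footer (the Fixes: line is the one from this patch; the Cc: form and the optional version annotation are documented in stable-kernel-rules.html):

```
Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Cc: stable@vger.kernel.org # v5.10+
```

With both tags present, the stable maintainers pick the commit up automatically once it lands in Linus' tree, rather than relying on a later manual request like this one.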
thanks,
greg k-h
* Re: [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-21 10:02 ` Greg KH
@ 2023-12-21 10:17 ` Lee Jones
2023-12-21 14:00 ` Jiri Olsa
0 siblings, 1 reply; 11+ messages in thread
From: Lee Jones @ 2023-12-21 10:17 UTC (permalink / raw)
To: Greg KH
Cc: Jiri Olsa, stable, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810,
Yonghong Song, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
On Thu, 21 Dec 2023, Greg KH wrote:
> On Thu, Dec 21, 2023 at 09:55:22AM +0000, Lee Jones wrote:
> > On Thu, 21 Dec 2023, Greg KH wrote:
> >
> > > On Thu, Dec 21, 2023 at 09:07:45AM +0000, Lee Jones wrote:
> > > > Dear Stable,
> > > >
> > > > > Lee pointed out an issue found by syzkaller [0]: hitting the BUG in
> > > > > the prog array map poke update (prog_array_map_poke_run) due to an
> > > > > error value returned from bpf_arch_text_poke.
> > > > >
> > > > > There's a race window where bpf_arch_text_poke can fail due to
> > > > > missing bpf program kallsym symbols, which is accounted for by the
> > > > > check for -EINVAL in that BUG_ON call.
> > > > >
> > > > > The problem is that in such a case we won't update the tail call
> > > > > jump, causing an imbalance: the next tail call update check will
> > > > > then fail with -EBUSY in bpf_arch_text_poke.
> > > > >
> > > > > I'm hitting the following race during program load:
> > > > >
> > > > > CPU 0 CPU 1
> > > > >
> > > > > bpf_prog_load
> > > > > bpf_check
> > > > > do_misc_fixups
> > > > > prog_array_map_poke_track
> > > > >
> > > > > map_update_elem
> > > > > bpf_fd_array_map_update_elem
> > > > > prog_array_map_poke_run
> > > > >
> > > > > bpf_arch_text_poke returns -EINVAL
> > > > >
> > > > > bpf_prog_kallsyms_add
> > > > >
> > > > > After bpf_arch_text_poke (CPU 1) fails to update the tail call jump,
> > > > > the next poke update fails the expected jump instruction check in
> > > > > bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in
> > > > > prog_array_map_poke_run.
> > > > >
> > > > > A similar race exists on program unload.
> > > > >
> > > > > Fix this by moving the update into a new bpf_arch_poke_desc_update
> > > > > function, which makes sure we call __bpf_arch_text_poke, skipping
> > > > > the bpf address check.
> > > > >
> > > > > Each architecture takes a slightly different approach to looking up
> > > > > the bpf address in bpf_arch_text_poke, so instead of splitting the
> > > > > function or adding a new 'checkip' argument as in the previous
> > > > > version, it seems best to move the whole map_poke_run update into
> > > > > arch-specific code.
> > > > >
> > > > > [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
> > > > >
> > > > > Cc: Lee Jones <lee@kernel.org>
> > > > > Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > > > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > > > > Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
> > > > > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > ---
> > > > > arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> > > > > include/linux/bpf.h | 3 ++
> > > > > kernel/bpf/arraymap.c | 58 +++++++------------------------------
> > > > > 3 files changed, 59 insertions(+), 48 deletions(-)
> > > >
> > > > Please could we have this backported?
> > > >
> > > > Guided by the Fixes: tag.
> > >
> > > <formletter>
> > >
> > > This is not the correct way to submit patches for inclusion in the
> > > stable kernel tree. Please read:
> > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > > for how to do this properly.
> > >
> > > </formletter>
> >
> > Apologies.
> >
> > Commit ID: 4b7de801606e504e69689df71475d27e35336fb3
> > Subject: bpf: Fix prog_array_map_poke_run map poke update
> > Reason: Fixes a race condition in BPF.
> > Versions: linux-5.10.y+, as specified by the Fixes: tag above
>
> Did not apply to 5.10.y or 5.15.y, so if you need/want it there, we will
> need a working backport that has been tested. Other trees it's now
> queued up for.
Thank you.
> BPF developers, please remember, just adding a "Fixes:" tag does NOT
> guarantee that any patch will be backported to any stable kernel, you
> MUST add a "cc: stable@..." tag to the patch if you wish to have it
> automatically backported.
--
Lee Jones [李琼斯]
* Re: [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-21 10:17 ` Lee Jones
@ 2023-12-21 14:00 ` Jiri Olsa
2023-12-21 14:34 ` Lee Jones
0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2023-12-21 14:00 UTC (permalink / raw)
To: Lee Jones
Cc: Greg KH, stable, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810,
Yonghong Song, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
On Thu, Dec 21, 2023 at 10:17:44AM +0000, Lee Jones wrote:
> On Thu, 21 Dec 2023, Greg KH wrote:
>
> > On Thu, Dec 21, 2023 at 09:55:22AM +0000, Lee Jones wrote:
> > > On Thu, 21 Dec 2023, Greg KH wrote:
> > >
> > > > On Thu, Dec 21, 2023 at 09:07:45AM +0000, Lee Jones wrote:
> > > > > Dear Stable,
> > > > >
> > > > > > Lee pointed out an issue found by syzkaller [0]: hitting the BUG in
> > > > > > the prog array map poke update (prog_array_map_poke_run) due to an
> > > > > > error value returned from bpf_arch_text_poke.
> > > > > >
> > > > > > There's a race window where bpf_arch_text_poke can fail due to
> > > > > > missing bpf program kallsym symbols, which is accounted for by the
> > > > > > check for -EINVAL in that BUG_ON call.
> > > > > >
> > > > > > The problem is that in such a case we won't update the tail call
> > > > > > jump, causing an imbalance: the next tail call update check will
> > > > > > then fail with -EBUSY in bpf_arch_text_poke.
> > > > > >
> > > > > > I'm hitting the following race during program load:
> > > > > >
> > > > > > CPU 0 CPU 1
> > > > > >
> > > > > > bpf_prog_load
> > > > > > bpf_check
> > > > > > do_misc_fixups
> > > > > > prog_array_map_poke_track
> > > > > >
> > > > > > map_update_elem
> > > > > > bpf_fd_array_map_update_elem
> > > > > > prog_array_map_poke_run
> > > > > >
> > > > > > bpf_arch_text_poke returns -EINVAL
> > > > > >
> > > > > > bpf_prog_kallsyms_add
> > > > > >
> > > > > > After bpf_arch_text_poke (CPU 1) fails to update the tail call jump,
> > > > > > the next poke update fails the expected jump instruction check in
> > > > > > bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in
> > > > > > prog_array_map_poke_run.
> > > > > >
> > > > > > A similar race exists on program unload.
> > > > > >
> > > > > > Fix this by moving the update into a new bpf_arch_poke_desc_update
> > > > > > function, which makes sure we call __bpf_arch_text_poke, skipping
> > > > > > the bpf address check.
> > > > > >
> > > > > > Each architecture takes a slightly different approach to looking up
> > > > > > the bpf address in bpf_arch_text_poke, so instead of splitting the
> > > > > > function or adding a new 'checkip' argument as in the previous
> > > > > > version, it seems best to move the whole map_poke_run update into
> > > > > > arch-specific code.
> > > > > >
> > > > > > [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
> > > > > >
> > > > > > Cc: Lee Jones <lee@kernel.org>
> > > > > > Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > > > > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > > > > > Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
> > > > > > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > ---
> > > > > > arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> > > > > > include/linux/bpf.h | 3 ++
> > > > > > kernel/bpf/arraymap.c | 58 +++++++------------------------------
> > > > > > 3 files changed, 59 insertions(+), 48 deletions(-)
> > > > >
> > > > > Please could we have this backported?
> > > > >
> > > > > Guided by the Fixes: tag.
> > > >
> > > > <formletter>
> > > >
> > > > This is not the correct way to submit patches for inclusion in the
> > > > stable kernel tree. Please read:
> > > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > > > for how to do this properly.
> > > >
> > > > </formletter>
> > >
> > > Apologies.
> > >
> > > Commit ID: 4b7de801606e504e69689df71475d27e35336fb3
> > > Subject: bpf: Fix prog_array_map_poke_run map poke update
> > > Reason: Fixes a race condition in BPF.
> > > Versions: linux-5.10.y+, as specified by the Fixes: tag above
> >
> > Did not apply to 5.10.y or 5.15.y, so if you need/want it there, we will
> > need a working backport that has been tested. Other trees it's now
> > queued up for.
>
> Thank you.
please let me know if you need any help with that; I can look into it
jirka
>
> > BPF developers, please remember, just adding a "Fixes:" tag does NOT
> > guarantee that any patch will be backported to any stable kernel, you
> > MUST add a "cc: stable@..." tag to the patch if you wish to have it
> > automatically backported.
>
> --
> Lee Jones [李琼斯]
* Re: [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run map poke update
2023-12-21 14:00 ` Jiri Olsa
@ 2023-12-21 14:34 ` Lee Jones
0 siblings, 0 replies; 11+ messages in thread
From: Lee Jones @ 2023-12-21 14:34 UTC (permalink / raw)
To: Jiri Olsa
Cc: Greg KH, stable, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Maciej Fijalkowski, syzbot+97a4fe20470e9bc30810,
Yonghong Song, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Xu Kuohai,
Will Deacon, Nathan Chancellor, Pu Lehui, Björn Töpel,
Ilya Leoshkevich
On Thu, 21 Dec 2023, Jiri Olsa wrote:
> On Thu, Dec 21, 2023 at 10:17:44AM +0000, Lee Jones wrote:
> > On Thu, 21 Dec 2023, Greg KH wrote:
> >
> > > On Thu, Dec 21, 2023 at 09:55:22AM +0000, Lee Jones wrote:
> > > > On Thu, 21 Dec 2023, Greg KH wrote:
> > > >
> > > > > On Thu, Dec 21, 2023 at 09:07:45AM +0000, Lee Jones wrote:
> > > > > > Dear Stable,
> > > > > >
> > > > > > > Lee pointed out an issue found by syzkaller [0]: hitting the BUG in
> > > > > > > the prog array map poke update (prog_array_map_poke_run) due to an
> > > > > > > error value returned from bpf_arch_text_poke.
> > > > > > >
> > > > > > > There's a race window where bpf_arch_text_poke can fail due to
> > > > > > > missing bpf program kallsym symbols, which is accounted for by the
> > > > > > > check for -EINVAL in that BUG_ON call.
> > > > > > >
> > > > > > > The problem is that in such a case we won't update the tail call
> > > > > > > jump, causing an imbalance: the next tail call update check will
> > > > > > > then fail with -EBUSY in bpf_arch_text_poke.
> > > > > > >
> > > > > > > I'm hitting the following race during program load:
> > > > > > >
> > > > > > > CPU 0 CPU 1
> > > > > > >
> > > > > > > bpf_prog_load
> > > > > > > bpf_check
> > > > > > > do_misc_fixups
> > > > > > > prog_array_map_poke_track
> > > > > > >
> > > > > > > map_update_elem
> > > > > > > bpf_fd_array_map_update_elem
> > > > > > > prog_array_map_poke_run
> > > > > > >
> > > > > > > bpf_arch_text_poke returns -EINVAL
> > > > > > >
> > > > > > > bpf_prog_kallsyms_add
> > > > > > >
> > > > > > > After bpf_arch_text_poke (CPU 1) fails to update the tail call jump,
> > > > > > > the next poke update fails the expected jump instruction check in
> > > > > > > bpf_arch_text_poke with -EBUSY and triggers the BUG_ON in
> > > > > > > prog_array_map_poke_run.
> > > > > > >
> > > > > > > A similar race exists on program unload.
> > > > > > >
> > > > > > > Fix this by moving the update into a new bpf_arch_poke_desc_update
> > > > > > > function, which makes sure we call __bpf_arch_text_poke, skipping
> > > > > > > the bpf address check.
> > > > > > >
> > > > > > > Each architecture takes a slightly different approach to looking up
> > > > > > > the bpf address in bpf_arch_text_poke, so instead of splitting the
> > > > > > > function or adding a new 'checkip' argument as in the previous
> > > > > > > version, it seems best to move the whole map_poke_run update into
> > > > > > > arch-specific code.
> > > > > > >
> > > > > > > [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
> > > > > > >
> > > > > > > Cc: Lee Jones <lee@kernel.org>
> > > > > > > Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > > > > > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > > > > > > Reported-by: syzbot+97a4fe20470e9bc30810@syzkaller.appspotmail.com
> > > > > > > Acked-by: Yonghong Song <yonghong.song@linux.dev>
> > > > > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > > > > ---
> > > > > > > arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> > > > > > > include/linux/bpf.h | 3 ++
> > > > > > > kernel/bpf/arraymap.c | 58 +++++++------------------------------
> > > > > > > 3 files changed, 59 insertions(+), 48 deletions(-)
> > > > > >
> > > > > > Please could we have this backported?
> > > > > >
> > > > > > Guided by the Fixes: tag.
> > > > >
> > > > > <formletter>
> > > > >
> > > > > This is not the correct way to submit patches for inclusion in the
> > > > > stable kernel tree. Please read:
> > > > > https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > > > > for how to do this properly.
> > > > >
> > > > > </formletter>
> > > >
> > > > Apologies.
> > > >
> > > > Commit ID: 4b7de801606e504e69689df71475d27e35336fb3
> > > > Subject: bpf: Fix prog_array_map_poke_run map poke update
> > > > Reason: Fixes a race condition in BPF.
> > > > Versions: linux-5.10.y+, as specified by the Fixes: tag above
> > >
> > > Did not apply to 5.10.y or 5.15.y, so if you need/want it there, we will
> > > need a working backport that has been tested. Other trees it's now
> > > queued up for.
> >
> > Thank you.
>
> please let me know if you need any help with that; I can look into it
I absolutely do. I have no way to test BPF.
--
Lee Jones [李琼斯]
end of thread, other threads:[~2023-12-21 14:34 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-06 8:30 [PATCHv4 bpf 0/2] bpf: Fix map poke update Jiri Olsa
2023-12-06 8:30 ` [PATCHv4 bpf 1/2] bpf: Fix prog_array_map_poke_run " Jiri Olsa
2023-12-21 9:07 ` Lee Jones
2023-12-21 9:34 ` Greg KH
2023-12-21 9:55 ` Lee Jones
2023-12-21 10:02 ` Greg KH
2023-12-21 10:17 ` Lee Jones
2023-12-21 14:00 ` Jiri Olsa
2023-12-21 14:34 ` Lee Jones
2023-12-06 8:30 ` [PATCHv4 bpf 2/2] selftests/bpf: Add test for early update in prog_array_map_poke_run Jiri Olsa
2023-12-06 21:50 ` [PATCHv4 bpf 0/2] bpf: Fix map poke update patchwork-bot+netdevbpf