* [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher
@ 2022-09-26 18:47 Song Liu
2022-09-26 18:47 ` [PATCH v2 bpf-next 1/2] bpf: use bpf_prog_pack for bpf_dispatcher Song Liu
` (4 more replies)
0 siblings, 5 replies; 7+ messages in thread
From: Song Liu @ 2022-09-26 18:47 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, john.fastabend, kpsingh, kernel-team, haoluo,
jlayton, bjorn, Song Liu
Changes v1 => v2:
1. Update arch_prepare_bpf_dispatcher to use a RO image and a RW buffer.
(Alexei) Note: I haven't found an existing test to cover this part, so
this part was tested manually (comparing the generated dispatcher is
the same).
Jeff Layton reported CPA W^X warning linux-next [1]. It turns out to be
W^X issue with bpf trampoline and bpf dispatcher. Fix these by:
1. Use bpf_prog_pack for bpf_dispatcher;
2. Set memory permission properly with bpf trampoline.
[1] https://lore.kernel.org/lkml/c84cc27c1a5031a003039748c3c099732a718aec.camel@kernel.org/
Song Liu (2):
bpf: use bpf_prog_pack for bpf_dispatcher
bpf: Enforce W^X for bpf trampoline
arch/x86/net/bpf_jit_comp.c | 16 ++++++++--------
include/linux/bpf.h | 4 ++--
include/linux/filter.h | 5 +++++
kernel/bpf/core.c | 9 +++++++--
kernel/bpf/dispatcher.c | 27 +++++++++++++++++++++------
kernel/bpf/trampoline.c | 22 +++++-----------------
6 files changed, 48 insertions(+), 35 deletions(-)
--
2.30.2
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 bpf-next 1/2] bpf: use bpf_prog_pack for bpf_dispatcher
2022-09-26 18:47 [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher Song Liu
@ 2022-09-26 18:47 ` Song Liu
2022-09-26 18:47 ` [PATCH v2 bpf-next 2/2] bpf: Enforce W^X for bpf trampoline Song Liu
` (3 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Song Liu @ 2022-09-26 18:47 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, john.fastabend, kpsingh, kernel-team, haoluo,
jlayton, bjorn, Song Liu
Allocate bpf_dispatcher with bpf_prog_pack_alloc so that bpf_dispatcher
can share pages with bpf programs.
arch_prepare_bpf_dispatcher() is updated to provide a RW buffer as working
area for arch code to write to.
This also fixes CPA W^X warnning like:
CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: ...
Signed-off-by: Song Liu <song@kernel.org>
---
arch/x86/net/bpf_jit_comp.c | 16 ++++++++--------
include/linux/bpf.h | 3 ++-
include/linux/filter.h | 5 +++++
kernel/bpf/core.c | 9 +++++++--
kernel/bpf/dispatcher.c | 27 +++++++++++++++++++++------
5 files changed, 43 insertions(+), 17 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index ae89f4143eb4..f62c05385fa7 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2243,7 +2243,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
return ret;
}
-static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs)
+static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs, u8 *image, u8 *buf)
{
u8 *jg_reloc, *prog = *pprog;
int pivot, err, jg_bytes = 1;
@@ -2259,12 +2259,12 @@ static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs)
EMIT2_off32(0x81, add_1reg(0xF8, BPF_REG_3),
progs[a]);
err = emit_cond_near_jump(&prog, /* je func */
- (void *)progs[a], prog,
+ (void *)progs[a], image + (prog - buf),
X86_JE);
if (err)
return err;
- emit_indirect_jump(&prog, 2 /* rdx */, prog);
+ emit_indirect_jump(&prog, 2 /* rdx */, image + (prog - buf));
*pprog = prog;
return 0;
@@ -2289,7 +2289,7 @@ static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs)
jg_reloc = prog;
err = emit_bpf_dispatcher(&prog, a, a + pivot, /* emit lower_part */
- progs);
+ progs, image, buf);
if (err)
return err;
@@ -2303,7 +2303,7 @@ static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs)
emit_code(jg_reloc - jg_bytes, jg_offset, jg_bytes);
err = emit_bpf_dispatcher(&prog, a + pivot + 1, /* emit upper_part */
- b, progs);
+ b, progs, image, buf);
if (err)
return err;
@@ -2323,12 +2323,12 @@ static int cmp_ips(const void *a, const void *b)
return 0;
}
-int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs)
+int arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_funcs)
{
- u8 *prog = image;
+ u8 *prog = buf;
sort(funcs, num_funcs, sizeof(funcs[0]), cmp_ips, NULL);
- return emit_bpf_dispatcher(&prog, 0, num_funcs - 1, funcs);
+ return emit_bpf_dispatcher(&prog, 0, num_funcs - 1, funcs, image, buf);
}
struct x64_jit_data {
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index edd43edb27d6..9ae155c75014 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -946,6 +946,7 @@ struct bpf_dispatcher {
struct bpf_dispatcher_prog progs[BPF_DISPATCHER_MAX];
int num_progs;
void *image;
+ void *rw_image;
u32 image_off;
struct bpf_ksym ksym;
};
@@ -964,7 +965,7 @@ int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampolin
struct bpf_trampoline *bpf_trampoline_get(u64 key,
struct bpf_attach_target_info *tgt_info);
void bpf_trampoline_put(struct bpf_trampoline *tr);
-int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
+int arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_funcs);
#define BPF_DISPATCHER_INIT(_name) { \
.mutex = __MUTEX_INITIALIZER(_name.mutex), \
.func = &_name##_func, \
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 98e28126c24b..efc42a6e3aed 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1023,6 +1023,8 @@ extern long bpf_jit_limit_max;
typedef void (*bpf_jit_fill_hole_t)(void *area, unsigned int size);
+void bpf_jit_fill_hole_with_zero(void *area, unsigned int size);
+
struct bpf_binary_header *
bpf_jit_binary_alloc(unsigned int proglen, u8 **image_ptr,
unsigned int alignment,
@@ -1035,6 +1037,9 @@ void bpf_jit_free(struct bpf_prog *fp);
struct bpf_binary_header *
bpf_jit_binary_pack_hdr(const struct bpf_prog *fp);
+void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns);
+void bpf_prog_pack_free(struct bpf_binary_header *hdr);
+
static inline bool bpf_prog_kallsyms_verify_off(const struct bpf_prog *fp)
{
return list_empty(&fp->aux->ksym.lnode) ||
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index d1be78c28619..711fd293b6de 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -825,6 +825,11 @@ struct bpf_prog_pack {
unsigned long bitmap[];
};
+void bpf_jit_fill_hole_with_zero(void *area, unsigned int size)
+{
+ memset(area, 0, size);
+}
+
#define BPF_PROG_SIZE_TO_NBITS(size) (round_up(size, BPF_PROG_CHUNK_SIZE) / BPF_PROG_CHUNK_SIZE)
static DEFINE_MUTEX(pack_mutex);
@@ -864,7 +869,7 @@ static struct bpf_prog_pack *alloc_new_pack(bpf_jit_fill_hole_t bpf_fill_ill_ins
return pack;
}
-static void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns)
+void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insns)
{
unsigned int nbits = BPF_PROG_SIZE_TO_NBITS(size);
struct bpf_prog_pack *pack;
@@ -905,7 +910,7 @@ static void *bpf_prog_pack_alloc(u32 size, bpf_jit_fill_hole_t bpf_fill_ill_insn
return ptr;
}
-static void bpf_prog_pack_free(struct bpf_binary_header *hdr)
+void bpf_prog_pack_free(struct bpf_binary_header *hdr)
{
struct bpf_prog_pack *pack = NULL, *tmp;
unsigned int nbits;
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index 2444bd15cc2d..fa64b80b8bca 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -85,12 +85,12 @@ static bool bpf_dispatcher_remove_prog(struct bpf_dispatcher *d,
return false;
}
-int __weak arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs)
+int __weak arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_funcs)
{
return -ENOTSUPP;
}
-static int bpf_dispatcher_prepare(struct bpf_dispatcher *d, void *image)
+static int bpf_dispatcher_prepare(struct bpf_dispatcher *d, void *image, void *buf)
{
s64 ips[BPF_DISPATCHER_MAX] = {}, *ipsp = &ips[0];
int i;
@@ -99,12 +99,12 @@ static int bpf_dispatcher_prepare(struct bpf_dispatcher *d, void *image)
if (d->progs[i].prog)
*ipsp++ = (s64)(uintptr_t)d->progs[i].prog->bpf_func;
}
- return arch_prepare_bpf_dispatcher(image, &ips[0], d->num_progs);
+ return arch_prepare_bpf_dispatcher(image, buf, &ips[0], d->num_progs);
}
static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
{
- void *old, *new;
+ void *old, *new, *tmp;
u32 noff;
int err;
@@ -117,8 +117,14 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
}
new = d->num_progs ? d->image + noff : NULL;
+ tmp = d->num_progs ? d->rw_image + noff : NULL;
if (new) {
- if (bpf_dispatcher_prepare(d, new))
+ /* Prepare the dispatcher in d->rw_image. Then use
+ * bpf_arch_text_copy to update d->image, which is RO+X.
+ */
+ if (bpf_dispatcher_prepare(d, new, tmp))
+ return;
+ if (IS_ERR(bpf_arch_text_copy(new, tmp, PAGE_SIZE / 2)))
return;
}
@@ -140,9 +146,18 @@ void bpf_dispatcher_change_prog(struct bpf_dispatcher *d, struct bpf_prog *from,
mutex_lock(&d->mutex);
if (!d->image) {
- d->image = bpf_jit_alloc_exec_page();
+ d->image = bpf_prog_pack_alloc(PAGE_SIZE, bpf_jit_fill_hole_with_zero);
if (!d->image)
goto out;
+ d->rw_image = bpf_jit_alloc_exec(PAGE_SIZE);
+ if (!d->rw_image) {
+ u32 size = PAGE_SIZE;
+
+ bpf_arch_text_copy(d->image, &size, sizeof(size));
+ bpf_prog_pack_free((struct bpf_binary_header *)d->image);
+ d->image = NULL;
+ goto out;
+ }
bpf_image_ksym_add(d->image, &d->ksym);
}
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 bpf-next 2/2] bpf: Enforce W^X for bpf trampoline
2022-09-26 18:47 [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher Song Liu
2022-09-26 18:47 ` [PATCH v2 bpf-next 1/2] bpf: use bpf_prog_pack for bpf_dispatcher Song Liu
@ 2022-09-26 18:47 ` Song Liu
2022-09-27 3:50 ` [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher patchwork-bot+netdevbpf
` (2 subsequent siblings)
4 siblings, 0 replies; 7+ messages in thread
From: Song Liu @ 2022-09-26 18:47 UTC (permalink / raw)
To: bpf
Cc: ast, daniel, john.fastabend, kpsingh, kernel-team, haoluo,
jlayton, bjorn, Song Liu
Mark the trampoline as RO+X after arch_prepare_bpf_trampoline, so that
the trampoine follows W^X rule strictly. This will turn off warnings like
CPA refuse W^X violation: 8000000000000163 -> 0000000000000163 range: ...
Also remove bpf_jit_alloc_exec_page(), since it is not used any more.
Signed-off-by: Song Liu <song@kernel.org>
---
include/linux/bpf.h | 1 -
kernel/bpf/trampoline.c | 22 +++++-----------------
2 files changed, 5 insertions(+), 18 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9ae155c75014..5161fac0513f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1008,7 +1008,6 @@ int arch_prepare_bpf_dispatcher(void *image, void *buf, s64 *funcs, int num_func
void bpf_dispatcher_change_prog(struct bpf_dispatcher *d, struct bpf_prog *from,
struct bpf_prog *to);
/* Called only from JIT-enabled code, so there's no need for stubs. */
-void *bpf_jit_alloc_exec_page(void);
void bpf_image_ksym_add(void *data, struct bpf_ksym *ksym);
void bpf_image_ksym_del(struct bpf_ksym *ksym);
void bpf_ksym_add(struct bpf_ksym *ksym);
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 41b67eb83ab3..6f7b939321d6 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -116,22 +116,6 @@ bool bpf_prog_has_trampoline(const struct bpf_prog *prog)
(ptype == BPF_PROG_TYPE_LSM && eatype == BPF_LSM_MAC);
}
-void *bpf_jit_alloc_exec_page(void)
-{
- void *image;
-
- image = bpf_jit_alloc_exec(PAGE_SIZE);
- if (!image)
- return NULL;
-
- set_vm_flush_reset_perms(image);
- /* Keep image as writeable. The alternative is to keep flipping ro/rw
- * every time new program is attached or detached.
- */
- set_memory_x((long)image, 1);
- return image;
-}
-
void bpf_image_ksym_add(void *data, struct bpf_ksym *ksym)
{
ksym->start = (unsigned long) data;
@@ -404,9 +388,10 @@ static struct bpf_tramp_image *bpf_tramp_image_alloc(u64 key, u32 idx)
goto out_free_im;
err = -ENOMEM;
- im->image = image = bpf_jit_alloc_exec_page();
+ im->image = image = bpf_jit_alloc_exec(PAGE_SIZE);
if (!image)
goto out_uncharge;
+ set_vm_flush_reset_perms(image);
err = percpu_ref_init(&im->pcref, __bpf_tramp_image_release, 0, GFP_KERNEL);
if (err)
@@ -483,6 +468,9 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr, bool lock_direct_mut
if (err < 0)
goto out;
+ set_memory_ro((long)im->image, 1);
+ set_memory_x((long)im->image, 1);
+
WARN_ON(tr->cur_image && tr->selector == 0);
WARN_ON(!tr->cur_image && tr->selector);
if (tr->cur_image)
--
2.30.2
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher
2022-09-26 18:47 [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher Song Liu
2022-09-26 18:47 ` [PATCH v2 bpf-next 1/2] bpf: use bpf_prog_pack for bpf_dispatcher Song Liu
2022-09-26 18:47 ` [PATCH v2 bpf-next 2/2] bpf: Enforce W^X for bpf trampoline Song Liu
@ 2022-09-27 3:50 ` patchwork-bot+netdevbpf
2022-09-27 12:16 ` Jeff Layton
2022-09-28 9:42 ` Jesper Dangaard Brouer
4 siblings, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-09-27 3:50 UTC (permalink / raw)
To: Song Liu
Cc: bpf, ast, daniel, john.fastabend, kpsingh, kernel-team, haoluo,
jlayton, bjorn
Hello:
This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:
On Mon, 26 Sep 2022 11:47:37 -0700 you wrote:
> Changes v1 => v2:
> 1. Update arch_prepare_bpf_dispatcher to use a RO image and a RW buffer.
> (Alexei) Note: I haven't found an existing test to cover this part, so
> this part was tested manually (comparing the generated dispatcher is
> the same).
>
> Jeff Layton reported CPA W^X warning linux-next [1]. It turns out to be
> W^X issue with bpf trampoline and bpf dispatcher. Fix these by:
>
> [...]
Here is the summary with links:
- [v2,bpf-next,1/2] bpf: use bpf_prog_pack for bpf_dispatcher
https://git.kernel.org/bpf/bpf-next/c/19c02415da23
- [v2,bpf-next,2/2] bpf: Enforce W^X for bpf trampoline
https://git.kernel.org/bpf/bpf-next/c/5b0d1c7bd572
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher
2022-09-26 18:47 [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher Song Liu
` (2 preceding siblings ...)
2022-09-27 3:50 ` [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher patchwork-bot+netdevbpf
@ 2022-09-27 12:16 ` Jeff Layton
2022-09-28 9:42 ` Jesper Dangaard Brouer
4 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2022-09-27 12:16 UTC (permalink / raw)
To: Song Liu, bpf
Cc: ast, daniel, john.fastabend, kpsingh, kernel-team, haoluo, bjorn
On Mon, 2022-09-26 at 11:47 -0700, Song Liu wrote:
> Changes v1 => v2:
> 1. Update arch_prepare_bpf_dispatcher to use a RO image and a RW buffer.
> (Alexei) Note: I haven't found an existing test to cover this part, so
> this part was tested manually (comparing the generated dispatcher is
> the same).
>
> Jeff Layton reported CPA W^X warning linux-next [1]. It turns out to be
> W^X issue with bpf trampoline and bpf dispatcher. Fix these by:
>
> 1. Use bpf_prog_pack for bpf_dispatcher;
> 2. Set memory permission properly with bpf trampoline.
>
> [1] https://lore.kernel.org/lkml/c84cc27c1a5031a003039748c3c099732a718aec.camel@kernel.org/
>
> Song Liu (2):
> bpf: use bpf_prog_pack for bpf_dispatcher
> bpf: Enforce W^X for bpf trampoline
>
> arch/x86/net/bpf_jit_comp.c | 16 ++++++++--------
> include/linux/bpf.h | 4 ++--
> include/linux/filter.h | 5 +++++
> kernel/bpf/core.c | 9 +++++++--
> kernel/bpf/dispatcher.c | 27 +++++++++++++++++++++------
> kernel/bpf/trampoline.c | 22 +++++-----------------
> 6 files changed, 48 insertions(+), 35 deletions(-)
>
> --
> 2.30.2
Your patch seems to have fixed the issue. You can add:
Tested-by: Jeff Layton <jlayton@kernel.org>
Thanks!
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher
2022-09-26 18:47 [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher Song Liu
` (3 preceding siblings ...)
2022-09-27 12:16 ` Jeff Layton
@ 2022-09-28 9:42 ` Jesper Dangaard Brouer
2022-09-28 16:23 ` Song Liu
4 siblings, 1 reply; 7+ messages in thread
From: Jesper Dangaard Brouer @ 2022-09-28 9:42 UTC (permalink / raw)
To: Song Liu, bpf
Cc: brouer, ast, daniel, john.fastabend, kpsingh, kernel-team, haoluo,
jlayton, bjorn, Toke Hoiland Jorgensen, Clark Williams,
Steven Rostedt, Thomas Gleixner, Linux-MM
On 26/09/2022 20.47, Song Liu wrote:
> Changes v1 => v2:
> 1. Update arch_prepare_bpf_dispatcher to use a RO image and a RW buffer.
> (Alexei) Note: I haven't found an existing test to cover this part, so
> this part was tested manually (comparing the generated dispatcher is
> the same).
>
> Jeff Layton reported CPA W^X warning linux-next [1]. It turns out to be
> W^X issue with bpf trampoline and bpf dispatcher. Fix these by:
>
> 1. Use bpf_prog_pack for bpf_dispatcher;
> 2. Set memory permission properly with bpf trampoline.
Indirectly related to your patchset[0].
- TL;DR calling set_memory_x() have side-effects
We are getting reports that loading BPF-progs (jit stage) cause issues
for RT in the form of triggering work on isolated CPUs. It looks like
BTF JIT stage cause a TLB flush on all CPUs, including isolated CPUs.
The triggering function is set_memory_x() (see call-stack[2]).
We have noticed (and appreciate) you have previously improved the
situation in this patchset[3]:
[3]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=80123f0ac4a6
Is this patchset also part of improving the situation, or does it
introduce more calls to set_memory_x() ?
> [1] https://lore.kernel.org/lkml/c84cc27c1a5031a003039748c3c099732a718aec.camel@kernel.org/
[2] Call stack triggering issue:
smp_call_function_many_cond+0x1
smp_call_function+0x39
on_each_cpu+0x2a
cpa_flush+0x11a
change_page_attr_set_clr+0x129
set_memory_x+0x37
bpf_int_jit_compile+0x36f
bpf_prog_select_runtime+0xc6
bpf_prepare_filter+0x523
sk_attach_filter+0x13
sock_setsockopt+0x920
__sys_setsockopt+0x16a
__x64_sys_setsockopt+0x20
do_syscall_64+0x87
entry_SYSCALL_64_after_hwframe+0x65
[0] https://lore.kernel.org/all/20220926184739.3512547-1-song@kernel.org/#r
--Jesper
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher
2022-09-28 9:42 ` Jesper Dangaard Brouer
@ 2022-09-28 16:23 ` Song Liu
0 siblings, 0 replies; 7+ messages in thread
From: Song Liu @ 2022-09-28 16:23 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Song Liu, bpf, Jesper Dangaard Brouer, Alexei Starovoitov,
Daniel Borkmann, John Fastabend, KP Singh, Kernel Team, Hao Luo,
jlayton@kernel.org, Björn Töpel, Toke Hoiland Jorgensen,
Clark Williams, Steven Rostedt, Thomas Gleixner, Linux-MM
> On Sep 28, 2022, at 2:42 AM, Jesper Dangaard Brouer <jbrouer@redhat.com> wrote:
>
>
> On 26/09/2022 20.47, Song Liu wrote:
>> Changes v1 => v2:
>> 1. Update arch_prepare_bpf_dispatcher to use a RO image and a RW buffer.
>> (Alexei) Note: I haven't found an existing test to cover this part, so
>> this part was tested manually (comparing the generated dispatcher is
>> the same).
>> Jeff Layton reported CPA W^X warning linux-next [1]. It turns out to be
>> W^X issue with bpf trampoline and bpf dispatcher. Fix these by:
>> 1. Use bpf_prog_pack for bpf_dispatcher;
>> 2. Set memory permission properly with bpf trampoline.
>
> Indirectly related to your patchset[0].
> - TL;DR calling set_memory_x() have side-effects
>
> We are getting reports that loading BPF-progs (jit stage) cause issues for RT in the form of triggering work on isolated CPUs. It looks like BTF JIT stage cause a TLB flush on all CPUs, including isolated CPUs.
>
> The triggering function is set_memory_x() (see call-stack[2]).
>
> We have noticed (and appreciate) you have previously improved the situation in this patchset[3]:
> [3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=80123f0ac4a6
>
> Is this patchset also part of improving the situation, or does it introduce more calls to set_memory_x() ?
This set doesn't change numbers of set_memory_x() calls for trampolines.
We plan to move trampolines to use bpf_prog_pack (or the new vmalloc_exec,
if I am lucky) in 6.2. We will see fewer set_memory_x() calls after that.
Thanks,
Song
>
>
>> [1] https://lore.kernel.org/lkml/c84cc27c1a5031a003039748c3c099732a718aec.camel@kernel.org/
>
>
> [2] Call stack triggering issue:
>
> smp_call_function_many_cond+0x1
> smp_call_function+0x39
> on_each_cpu+0x2a
> cpa_flush+0x11a
> change_page_attr_set_clr+0x129
> set_memory_x+0x37
> bpf_int_jit_compile+0x36f
> bpf_prog_select_runtime+0xc6
> bpf_prepare_filter+0x523
> sk_attach_filter+0x13
> sock_setsockopt+0x920
> __sys_setsockopt+0x16a
> __x64_sys_setsockopt+0x20
> do_syscall_64+0x87
> entry_SYSCALL_64_after_hwframe+0x65
>
>
> [0] https://lore.kernel.org/all/20220926184739.3512547-1-song@kernel.org/#r
>
> --Jesper
>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2022-09-28 16:24 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-09-26 18:47 [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher Song Liu
2022-09-26 18:47 ` [PATCH v2 bpf-next 1/2] bpf: use bpf_prog_pack for bpf_dispatcher Song Liu
2022-09-26 18:47 ` [PATCH v2 bpf-next 2/2] bpf: Enforce W^X for bpf trampoline Song Liu
2022-09-27 3:50 ` [PATCH v2 bpf-next 0/2] enforce W^X for trampoline and dispatcher patchwork-bot+netdevbpf
2022-09-27 12:16 ` Jeff Layton
2022-09-28 9:42 ` Jesper Dangaard Brouer
2022-09-28 16:23 ` Song Liu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox