* [PATCH bpf-next 0/2] bpf,arm64: Add support for BPF Arena
@ 2024-03-14 15:00 Puranjay Mohan
2024-03-14 15:00 ` [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions Puranjay Mohan
2024-03-14 15:00 ` [PATCH bpf-next 2/2] bpf: Add arm64 JIT support for bpf_addr_space_cast instruction Puranjay Mohan
0 siblings, 2 replies; 7+ messages in thread
From: Puranjay Mohan @ 2024-03-14 15:00 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf, linux-kernel, Catalin Marinas, Will Deacon, Zi Shen Lim,
Xu Kuohai
Cc: puranjay12
This series adds support for the PROBE_MEM32 and bpf_addr_space_cast
instructions to the arm64 BPF JIT. Together, these two instructions
enable BPF Arena.
All arena-related selftests pass.
[root@ip-172-31-6-62 bpf]# ./test_progs -a "*arena*"
#3/1 arena_htab/arena_htab_llvm:OK
#3/2 arena_htab/arena_htab_asm:OK
#3 arena_htab:OK
#4/1 arena_list/arena_list_1:OK
#4/2 arena_list/arena_list_1000:OK
#4 arena_list:OK
#434/1 verifier_arena/basic_alloc1:OK
#434/2 verifier_arena/basic_alloc2:OK
#434/3 verifier_arena/basic_alloc3:OK
#434/4 verifier_arena/iter_maps1:OK
#434/5 verifier_arena/iter_maps2:OK
#434/6 verifier_arena/iter_maps3:OK
#434 verifier_arena:OK
Summary: 3/10 PASSED, 0 SKIPPED, 0 FAILED
The implementation of bpf_addr_space_cast can be optimised by using the ROR
(immediate) and CSEL instructions. Currently, arch/arm64/lib/insn.c doesn't
have APIs to generate these instructions. I will send subsequent patches to
implement the APIs and then use these instructions in the JIT.
Puranjay Mohan (2):
bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.
bpf: Add arm64 JIT support for bpf_addr_space_cast instruction.
arch/arm64/net/bpf_jit.h | 1 +
arch/arm64/net/bpf_jit_comp.c | 105 +++++++++++++++++--
tools/testing/selftests/bpf/DENYLIST.aarch64 | 2 -
3 files changed, 96 insertions(+), 12 deletions(-)
--
2.40.1
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-14 15:00 [PATCH bpf-next 0/2] bpf,arm64: Add support for BPF Arena Puranjay Mohan
@ 2024-03-14 15:00 ` Puranjay Mohan
2024-03-14 17:07 ` Kumar Kartikeya Dwivedi
2024-03-14 15:00 ` [PATCH bpf-next 2/2] bpf: Add arm64 JIT support for bpf_addr_space_cast instruction Puranjay Mohan
1 sibling, 1 reply; 7+ messages in thread
From: Puranjay Mohan @ 2024-03-14 15:00 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf, linux-kernel, Catalin Marinas, Will Deacon, Zi Shen Lim,
Xu Kuohai
Cc: puranjay12
Add support for the [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
instructions. They are similar to PROBE_MEM instructions, with the
following differences:
- PROBE_MEM32 supports stores.
- PROBE_MEM32 relies on the verifier to clear the upper 32 bits of the
src/dst register.
- PROBE_MEM32 adds the 64-bit kern_vm_start address (which is stored in
R28 in the prologue). Due to the way bpf_arena is constructed, such an
R28 + reg + off16 access is guaranteed to be within the arena's virtual
address range, so no address check is needed at run time.
- PROBE_MEM32 allows STX and ST. If they fault, the store is a nop. When
LDX faults, the destination register is zeroed.
To support these on arm64, we do tmp2 = R28 + src/dst reg and then use
tmp2 as the new src/dst register. This allows us to reuse most of the
code for normal [LDX | STX | ST].
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
---
arch/arm64/net/bpf_jit_comp.c | 70 ++++++++++++++++++++++++++++++-----
1 file changed, 60 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index c5b461dda438..ce66c17b73a0 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -29,6 +29,7 @@
#define TCALL_CNT (MAX_BPF_JIT_REG + 2)
#define TMP_REG_3 (MAX_BPF_JIT_REG + 3)
#define FP_BOTTOM (MAX_BPF_JIT_REG + 4)
+#define PROBE_MEM32_BASE (MAX_BPF_JIT_REG + 5)
#define check_imm(bits, imm) do { \
if ((((imm) > 0) && ((imm) >> (bits))) || \
@@ -67,6 +68,8 @@ static const int bpf2a64[] = {
/* temporary register for blinding constants */
[BPF_REG_AX] = A64_R(9),
[FP_BOTTOM] = A64_R(27),
+ /* callee saved register for kern_vm_start address */
+ [PROBE_MEM32_BASE] = A64_R(28),
};
struct jit_ctx {
@@ -295,7 +298,7 @@ static bool is_lsi_offset(int offset, int scale)
#define PROLOGUE_OFFSET (BTI_INSNS + 2 + PAC_INSNS + 8)
static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf,
- bool is_exception_cb)
+ bool is_exception_cb, u64 arena_vm_start)
{
const struct bpf_prog *prog = ctx->prog;
const bool is_main_prog = !bpf_is_subprog(prog);
@@ -306,6 +309,7 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf,
const u8 fp = bpf2a64[BPF_REG_FP];
const u8 tcc = bpf2a64[TCALL_CNT];
const u8 fpb = bpf2a64[FP_BOTTOM];
+ const u8 pb = bpf2a64[PROBE_MEM32_BASE];
const int idx0 = ctx->idx;
int cur_offset;
@@ -411,6 +415,10 @@ static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf,
/* Set up function call stack */
emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
+
+ if (arena_vm_start)
+ emit_a64_mov_i64(pb, arena_vm_start, ctx);
+
return 0;
}
@@ -738,6 +746,7 @@ static void build_epilogue(struct jit_ctx *ctx, bool is_exception_cb)
#define BPF_FIXUP_OFFSET_MASK GENMASK(26, 0)
#define BPF_FIXUP_REG_MASK GENMASK(31, 27)
+#define DONT_CLEAR 32
bool ex_handler_bpf(const struct exception_table_entry *ex,
struct pt_regs *regs)
@@ -745,7 +754,8 @@ bool ex_handler_bpf(const struct exception_table_entry *ex,
off_t offset = FIELD_GET(BPF_FIXUP_OFFSET_MASK, ex->fixup);
int dst_reg = FIELD_GET(BPF_FIXUP_REG_MASK, ex->fixup);
- regs->regs[dst_reg] = 0;
+ if (dst_reg != DONT_CLEAR)
+ regs->regs[dst_reg] = 0;
regs->pc = (unsigned long)&ex->fixup - offset;
return true;
}
@@ -765,7 +775,8 @@ static int add_exception_handler(const struct bpf_insn *insn,
return 0;
if (BPF_MODE(insn->code) != BPF_PROBE_MEM &&
- BPF_MODE(insn->code) != BPF_PROBE_MEMSX)
+ BPF_MODE(insn->code) != BPF_PROBE_MEMSX &&
+ BPF_MODE(insn->code) != BPF_PROBE_MEM32)
return 0;
if (!ctx->prog->aux->extable ||
@@ -810,6 +821,9 @@ static int add_exception_handler(const struct bpf_insn *insn,
ex->insn = ins_offset;
+ if (BPF_CLASS(insn->code) != BPF_LDX)
+ dst_reg = DONT_CLEAR;
+
ex->fixup = FIELD_PREP(BPF_FIXUP_OFFSET_MASK, fixup_offset) |
FIELD_PREP(BPF_FIXUP_REG_MASK, dst_reg);
@@ -829,12 +843,13 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
bool extra_pass)
{
const u8 code = insn->code;
- const u8 dst = bpf2a64[insn->dst_reg];
- const u8 src = bpf2a64[insn->src_reg];
+ u8 dst = bpf2a64[insn->dst_reg];
+ u8 src = bpf2a64[insn->src_reg];
const u8 tmp = bpf2a64[TMP_REG_1];
const u8 tmp2 = bpf2a64[TMP_REG_2];
const u8 fp = bpf2a64[BPF_REG_FP];
const u8 fpb = bpf2a64[FP_BOTTOM];
+ const u8 pb = bpf2a64[PROBE_MEM32_BASE];
const s16 off = insn->off;
const s32 imm = insn->imm;
const int i = insn - ctx->prog->insnsi;
@@ -1237,7 +1252,15 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
case BPF_LDX | BPF_PROBE_MEMSX | BPF_B:
case BPF_LDX | BPF_PROBE_MEMSX | BPF_H:
case BPF_LDX | BPF_PROBE_MEMSX | BPF_W:
- if (ctx->fpb_offset > 0 && src == fp) {
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_LDX | BPF_PROBE_MEM32 | BPF_DW:
+ if (BPF_MODE(insn->code) == BPF_PROBE_MEM32) {
+ emit(A64_ADD(1, tmp2, src, pb), ctx);
+ src = tmp2;
+ }
+ if (ctx->fpb_offset > 0 && src == fp && BPF_MODE(insn->code) != BPF_PROBE_MEM32) {
src_adj = fpb;
off_adj = off + ctx->fpb_offset;
} else {
@@ -1322,7 +1345,15 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
case BPF_ST | BPF_MEM | BPF_H:
case BPF_ST | BPF_MEM | BPF_B:
case BPF_ST | BPF_MEM | BPF_DW:
- if (ctx->fpb_offset > 0 && dst == fp) {
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_ST | BPF_PROBE_MEM32 | BPF_DW:
+ if (BPF_MODE(insn->code) == BPF_PROBE_MEM32) {
+ emit(A64_ADD(1, tmp2, dst, pb), ctx);
+ dst = tmp2;
+ }
+ if (ctx->fpb_offset > 0 && dst == fp && BPF_MODE(insn->code) != BPF_PROBE_MEM32) {
dst_adj = fpb;
off_adj = off + ctx->fpb_offset;
} else {
@@ -1365,6 +1396,10 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
}
break;
}
+
+ ret = add_exception_handler(insn, ctx, dst);
+ if (ret)
+ return ret;
break;
/* STX: *(size *)(dst + off) = src */
@@ -1372,7 +1407,15 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
case BPF_STX | BPF_MEM | BPF_H:
case BPF_STX | BPF_MEM | BPF_B:
case BPF_STX | BPF_MEM | BPF_DW:
- if (ctx->fpb_offset > 0 && dst == fp) {
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_B:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_H:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_W:
+ case BPF_STX | BPF_PROBE_MEM32 | BPF_DW:
+ if (BPF_MODE(insn->code) == BPF_PROBE_MEM32) {
+ emit(A64_ADD(1, tmp2, dst, pb), ctx);
+ dst = tmp2;
+ }
+ if (ctx->fpb_offset > 0 && dst == fp && BPF_MODE(insn->code) != BPF_PROBE_MEM32) {
dst_adj = fpb;
off_adj = off + ctx->fpb_offset;
} else {
@@ -1413,6 +1456,10 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
}
break;
}
+
+ ret = add_exception_handler(insn, ctx, dst);
+ if (ret)
+ return ret;
break;
case BPF_STX | BPF_ATOMIC | BPF_W:
@@ -1594,6 +1641,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
bool tmp_blinded = false;
bool extra_pass = false;
struct jit_ctx ctx;
+ u64 arena_vm_start;
u8 *image_ptr;
u8 *ro_image_ptr;
@@ -1611,6 +1659,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
prog = tmp;
}
+ arena_vm_start = bpf_arena_get_kern_vm_start(prog->aux->arena);
jit_data = prog->aux->jit_data;
if (!jit_data) {
jit_data = kzalloc(sizeof(*jit_data), GFP_KERNEL);
@@ -1648,7 +1697,8 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
* BPF line info needs ctx->offset[i] to be the offset of
* instruction[i] in jited image, so build prologue first.
*/
- if (build_prologue(&ctx, was_classic, prog->aux->exception_cb)) {
+ if (build_prologue(&ctx, was_classic, prog->aux->exception_cb,
+ arena_vm_start)) {
prog = orig_prog;
goto out_off;
}
@@ -1696,7 +1746,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
ctx.idx = 0;
ctx.exentry_idx = 0;
- build_prologue(&ctx, was_classic, prog->aux->exception_cb);
+ build_prologue(&ctx, was_classic, prog->aux->exception_cb, arena_vm_start);
if (build_body(&ctx, extra_pass)) {
prog = orig_prog;
--
2.40.1
* [PATCH bpf-next 2/2] bpf: Add arm64 JIT support for bpf_addr_space_cast instruction.
2024-03-14 15:00 [PATCH bpf-next 0/2] bpf,arm64: Add support for BPF Arena Puranjay Mohan
2024-03-14 15:00 ` [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions Puranjay Mohan
@ 2024-03-14 15:00 ` Puranjay Mohan
1 sibling, 0 replies; 7+ messages in thread
From: Puranjay Mohan @ 2024-03-14 15:00 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
bpf, linux-kernel, Catalin Marinas, Will Deacon, Zi Shen Lim,
Xu Kuohai
Cc: puranjay12
LLVM generates the bpf_addr_space_cast instruction when translating
pointers between the native (zero) address space and
__attribute__((address_space(N))). Address space 1 is reserved as the
bpf_arena address space.
rY = addr_space_cast(rX, 0, 1) is processed by the verifier and
converted to a normal 32-bit move: wY = wX.
rY = addr_space_cast(rX, 1, 0) has to be converted by the JIT.
In symbolic terms, the JIT is supposed to do the following:
We have:
src = [src_upper32][src_lower32] // 64-bit src kernel pointer
uvm = [uvm_upper32][uvm_lower32] // 64-bit user_vm_start
The JIT has to construct the dst register as follows:
dst = [uvm_upper32][src_lower32] // if src_lower32 != 0
dst = [00000000000][00000000000] // if src_lower32 == 0
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
---
arch/arm64/net/bpf_jit.h | 1 +
arch/arm64/net/bpf_jit_comp.c | 35 ++++++++++++++++++++
tools/testing/selftests/bpf/DENYLIST.aarch64 | 2 --
3 files changed, 36 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
index 23b1b34db088..813c3c428fde 100644
--- a/arch/arm64/net/bpf_jit.h
+++ b/arch/arm64/net/bpf_jit.h
@@ -238,6 +238,7 @@
#define A64_LSLV(sf, Rd, Rn, Rm) A64_DATA2(sf, Rd, Rn, Rm, LSLV)
#define A64_LSRV(sf, Rd, Rn, Rm) A64_DATA2(sf, Rd, Rn, Rm, LSRV)
#define A64_ASRV(sf, Rd, Rn, Rm) A64_DATA2(sf, Rd, Rn, Rm, ASRV)
+#define A64_RORV(sf, Rd, Rn, Rm) A64_DATA2(sf, Rd, Rn, Rm, RORV)
/* Data-processing (3 source) */
/* Rd = Ra + Rn * Rm */
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index ce66c17b73a0..e12e0df3ad1a 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -82,6 +82,7 @@ struct jit_ctx {
__le32 *ro_image;
u32 stack_size;
int fpb_offset;
+ u64 user_vm_start;
};
struct bpf_plt {
@@ -868,6 +869,34 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx,
/* dst = src */
case BPF_ALU | BPF_MOV | BPF_X:
case BPF_ALU64 | BPF_MOV | BPF_X:
+ if (insn->off == BPF_ADDR_SPACE_CAST &&
+ insn->imm == 1U << 16) {
+ /* Zero out tmp2 */
+ emit(A64_EOR(1, tmp2, tmp2, tmp2), ctx);
+
+ /* Move lo_32_bits(src) to dst */
+ if (dst != src)
+ emit(A64_MOV(0, dst, src), ctx);
+
+ /* Logical shift left by 32 bits */
+ emit(A64_LSL(1, dst, dst, 32), ctx);
+
+ /* Get upper 32 bits of user_vm_start in tmp */
+ emit_a64_mov_i(0, tmp, ctx->user_vm_start >> 32, ctx);
+
+ /* dst |= up_32_bits(user_vm_start) */
+ emit(A64_ORR(1, dst, dst, tmp), ctx);
+
+ /* Rotate by 32 bits to get final result */
+ emit_a64_mov_i(0, tmp, 32, ctx);
+ emit(A64_RORV(1, dst, dst, tmp), ctx);
+
+ /* If lo_32_bits(dst) == 0, set dst = tmp2(0) */
+ emit(A64_CBZ(0, dst, 2), ctx);
+ emit(A64_MOV(1, tmp2, dst), ctx);
+ emit(A64_MOV(1, dst, tmp2), ctx);
+ break;
+ }
switch (insn->off) {
case 0:
emit(A64_MOV(is64, dst, src), ctx);
@@ -1690,6 +1719,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
}
ctx.fpb_offset = find_fpb_offset(prog);
+ ctx.user_vm_start = bpf_arena_get_user_vm_start(prog->aux->arena);
/*
* 1. Initial fake pass to compute ctx->idx and ctx->offset.
@@ -2514,6 +2544,11 @@ bool bpf_jit_supports_exceptions(void)
return true;
}
+bool bpf_jit_supports_arena(void)
+{
+ return true;
+}
+
void bpf_jit_free(struct bpf_prog *prog)
{
if (prog->jited) {
diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index d8ade15e2789..0445ac38bc07 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -10,5 +10,3 @@ fill_link_info/kprobe_multi_link_info # bpf_program__attach_kprobe_mu
fill_link_info/kretprobe_multi_link_info # bpf_program__attach_kprobe_multi_opts unexpected error: -95
fill_link_info/kprobe_multi_invalid_ubuff # bpf_program__attach_kprobe_multi_opts unexpected error: -95
missed/kprobe_recursion # missed_kprobe_recursion__attach unexpected error: -95 (errno 95)
-verifier_arena # JIT does not support arena
-arena_htab # JIT does not support arena
--
2.40.1
* Re: [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-14 15:00 ` [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions Puranjay Mohan
@ 2024-03-14 17:07 ` Kumar Kartikeya Dwivedi
2024-03-14 17:13 ` Puranjay Mohan
0 siblings, 1 reply; 7+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2024-03-14 17:07 UTC (permalink / raw)
To: Puranjay Mohan, Alexei Starovoitov, Eduard Zingerman
Cc: Daniel Borkmann, John Fastabend, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf, linux-kernel,
Catalin Marinas, Will Deacon, Zi Shen Lim, Xu Kuohai
On Thu, 14 Mar 2024 at 16:00, Puranjay Mohan <puranjay12@gmail.com> wrote:
>
> Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
> instructions. They are similar to PROBE_MEM instructions with the
> following differences:
> - PROBE_MEM32 supports store.
> - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
> src/dst register
> - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in R28
> in the prologue). Due to bpf_arena constructions such R28 + reg +
> off16 access is guaranteed to be within arena virtual range, so no
> address check at run-time.
> - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
> LDX faults the destination register is zeroed.
>
> To support these on arm64, we do tmp2 = R28 + src/dst reg and then use
> tmp2 as the new src/dst register. This allows us to reuse most of the
> code for normal [LDX | STX | ST].
>
> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> ---
Hi Alexei,
Puranjay and I were discussing this stuff off list and noticed that
atomic instructions are not handled.
It turns out that will cause a kernel crash right now because the
32-bit offset into arena will be dereferenced directly.
e.g. something like this:
@@ -55,6 +56,7 @@ int arena_list_add(void *ctx)
test_val++;
n->value = i;
arena_sum += i;
+ __sync_fetch_and_add(&arena_sum, 0);
list_add_head(&n->node, list_head);
}
#else
I will try to prepare a fix for the x86 JIT. Puranjay will do the same
for his set.
* Re: [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-14 17:07 ` Kumar Kartikeya Dwivedi
@ 2024-03-14 17:13 ` Puranjay Mohan
2024-03-14 17:21 ` Alexei Starovoitov
0 siblings, 1 reply; 7+ messages in thread
From: Puranjay Mohan @ 2024-03-14 17:13 UTC (permalink / raw)
To: Kumar Kartikeya Dwivedi, Alexei Starovoitov, Eduard Zingerman
Cc: Daniel Borkmann, John Fastabend, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf, linux-kernel,
Catalin Marinas, Will Deacon, Zi Shen Lim, Xu Kuohai
Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> On Thu, 14 Mar 2024 at 16:00, Puranjay Mohan <puranjay12@gmail.com> wrote:
>>
>> Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
>> instructions. They are similar to PROBE_MEM instructions with the
>> following differences:
>> - PROBE_MEM32 supports store.
>> - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
>> src/dst register
>> - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in R28
>> in the prologue). Due to bpf_arena constructions such R28 + reg +
>> off16 access is guaranteed to be within arena virtual range, so no
>> address check at run-time.
>> - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
>> LDX faults the destination register is zeroed.
>>
>> To support these on arm64, we do tmp2 = R28 + src/dst reg and then use
>> tmp2 as the new src/dst register. This allows us to reuse most of the
>> code for normal [LDX | STX | ST].
>>
>> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
>> ---
>
> Hi Alexei,
> Puranjay and I were discussing this stuff off list and noticed that
> atomic instructions are not handled.
> It turns out that will cause a kernel crash right now because the
> 32-bit offset into arena will be dereferenced directly.
>
> e.g. something like this:
>
> @@ -55,6 +56,7 @@ int arena_list_add(void *ctx)
> test_val++;
> n->value = i;
> arena_sum += i;
> + __sync_fetch_and_add(&arena_sum, 0);
> list_add_head(&n->node, list_head);
> }
> #else
>
> I will try to prepare a fix for the x86 JIT. Puranjay will do the same
> for his set.
Yes, testing the change mentioned by Kumar on ARM64 causes a crash as well:
bpf_testmod: loading out-of-tree module taints kernel.
bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
Mem abort info:
ESR = 0x0000000096000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x06: level 2 translation fault
Data abort info:
ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=00000004043cc000
[0000000000000010] pgd=0800000410d8f003, p4d=0800000410d8f003, pud=0800000405972003, pmd=0000000000000000
Internal error: Oops: 0000000096000006 [#1] SMP
Modules linked in: bpf_testmod(OE) nls_ascii nls_cp437 sunrpc vfat fat aes_ce_blk aes_ce_cipher ghash_ce sha1_ce button sch_fq_codel dm_mod dax configfs dmi_sysfs sha2_ce sha256_arm64 efivarfs
CPU: 8 PID: 5631 Comm: test_progs Tainted: G OE 6.8.0+ #2
Hardware name: Amazon EC2 c6g.16xlarge/, BIOS 1.0 11/1/2018
pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : bpf_prog_8771c336cb6a18eb_arena_list_add+0x204/0x2b8
lr : bpf_prog_8771c336cb6a18eb_arena_list_add+0x144/0x2b8
sp : ffff80008b84bc30
x29: ffff80008b84bca0 x28: ffff8000a5008000 x27: ffff80008b84bc38
x26: 0000000000000000 x25: ffff80008b84bc60 x24: 0000000000000000
x23: 0000000000000000 x22: 0000000000000058 x21: 0000000000000838
x20: 0000000000000000 x19: 0000000100001fe0 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffcc66d2c8
x14: 0000000000000000 x13: 0000000000000000 x12: 000000000004058c
x11: ffff8000a5008010 x10: 00000000ffffffff x9 : 00000000000002cf
x8 : ffff800082ff4ab8 x7 : 0000000100001000 x6 : 0000000000000001
x5 : 0000000010e5e3fd x4 : 000000003619b978 x3 : 0000000000000010
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000001fe0
Call trace:
bpf_prog_8771c336cb6a18eb_arena_list_add+0x204/0x2b8
bpf_prog_test_run_syscall+0x100/0x340
__sys_bpf+0x8e8/0xa20
__arm64_sys_bpf+0x2c/0x48
invoke_syscall+0x50/0x128
el0_svc_common.constprop.0+0x48/0xf8
do_el0_svc+0x28/0x40
el0_svc+0x58/0x190
el0t_64_sync_handler+0x13c/0x158
el0t_64_sync+0x1a8/0x1b0
Code: 8b010042 8b1c006b f9000162 d2800001 (f821307f)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x0,00000120,7002014a,21407a0b
Memory Limit: none
Rebooting in 5 seconds..
I will send v2 with the arm64 JIT fix, but I guess the verifier has to be
modified as well to add BPF_PROBE_MEM32 support to atomic instructions.
Thanks,
Puranjay
* Re: [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-14 17:13 ` Puranjay Mohan
@ 2024-03-14 17:21 ` Alexei Starovoitov
2024-03-15 10:31 ` Puranjay Mohan
0 siblings, 1 reply; 7+ messages in thread
From: Alexei Starovoitov @ 2024-03-14 17:21 UTC (permalink / raw)
To: Puranjay Mohan
Cc: Kumar Kartikeya Dwivedi, Alexei Starovoitov, Eduard Zingerman,
Daniel Borkmann, John Fastabend, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf, LKML,
Catalin Marinas, Will Deacon, Zi Shen Lim, Xu Kuohai
On Thu, Mar 14, 2024 at 10:13 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
>
> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
>
> > On Thu, 14 Mar 2024 at 16:00, Puranjay Mohan <puranjay12@gmail.com> wrote:
> >>
> >> Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
> >> instructions. They are similar to PROBE_MEM instructions with the
> >> following differences:
> >> - PROBE_MEM32 supports store.
> >> - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
> >> src/dst register
> >> - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in R28
> >> in the prologue). Due to bpf_arena constructions such R28 + reg +
> >> off16 access is guaranteed to be within arena virtual range, so no
> >> address check at run-time.
> >> - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
> >> LDX faults the destination register is zeroed.
> >>
> >> To support these on arm64, we do tmp2 = R28 + src/dst reg and then use
> >> tmp2 as the new src/dst register. This allows us to reuse most of the
> >> code for normal [LDX | STX | ST].
> >>
> >> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> >> ---
> >
> > Hi Alexei,
> > Puranjay and I were discussing this stuff off list and noticed that
> > atomic instructions are not handled.
> > It turns out that will cause a kernel crash right now because the
> > 32-bit offset into arena will be dereferenced directly.
> >
> > e.g. something like this:
> >
> > @@ -55,6 +56,7 @@ int arena_list_add(void *ctx)
> > test_val++;
> > n->value = i;
> > arena_sum += i;
> > + __sync_fetch_and_add(&arena_sum, 0);
> > list_add_head(&n->node, list_head);
> > }
> > #else
> >
> > I will try to prepare a fix for the x86 JIT. Puranjay will do the same
> > for his set.
>
> Yes, testing the change mentioned by Kumar on ARM64 causes a crashes as well:
>
> bpf_testmod: loading out-of-tree module taints kernel.
> bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> Mem abort info:
> ESR = 0x0000000096000006
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x06: level 2 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=00000004043cc000
> [0000000000000010] pgd=0800000410d8f003, p4d=0800000410d8f003, pud=0800000405972003, pmd=0000000000000000
> Internal error: Oops: 0000000096000006 [#1] SMP
> Modules linked in: bpf_testmod(OE) nls_ascii nls_cp437 sunrpc vfat fat aes_ce_blk aes_ce_cipher ghash_ce sha1_ce button sch_fq_codel dm_mod dax configfs dmi_sysfs sha2_ce sha256_arm64 efivarfs
> CPU: 8 PID: 5631 Comm: test_progs Tainted: G OE 6.8.0+ #2
> Hardware name: Amazon EC2 c6g.16xlarge/, BIOS 1.0 11/1/2018
> pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : bpf_prog_8771c336cb6a18eb_arena_list_add+0x204/0x2b8
> lr : bpf_prog_8771c336cb6a18eb_arena_list_add+0x144/0x2b8
> sp : ffff80008b84bc30
> x29: ffff80008b84bca0 x28: ffff8000a5008000 x27: ffff80008b84bc38
> x26: 0000000000000000 x25: ffff80008b84bc60 x24: 0000000000000000
> x23: 0000000000000000 x22: 0000000000000058 x21: 0000000000000838
> x20: 0000000000000000 x19: 0000000100001fe0 x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffcc66d2c8
> x14: 0000000000000000 x13: 0000000000000000 x12: 000000000004058c
> x11: ffff8000a5008010 x10: 00000000ffffffff x9 : 00000000000002cf
> x8 : ffff800082ff4ab8 x7 : 0000000100001000 x6 : 0000000000000001
> x5 : 0000000010e5e3fd x4 : 000000003619b978 x3 : 0000000000000010
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000001fe0
> Call trace:
> bpf_prog_8771c336cb6a18eb_arena_list_add+0x204/0x2b8
> bpf_prog_test_run_syscall+0x100/0x340
> __sys_bpf+0x8e8/0xa20
> __arm64_sys_bpf+0x2c/0x48
> invoke_syscall+0x50/0x128
> el0_svc_common.constprop.0+0x48/0xf8
> do_el0_svc+0x28/0x40
> el0_svc+0x58/0x190
> el0t_64_sync_handler+0x13c/0x158
> el0t_64_sync+0x1a8/0x1b0
> Code: 8b010042 8b1c006b f9000162 d2800001 (f821307f)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops: Fatal exception
> SMP: stopping secondary CPUs
> Kernel Offset: disabled
> CPU features: 0x0,00000120,7002014a,21407a0b
> Memory Limit: none
> Rebooting in 5 seconds..
>
> I will send v2 with the arm64 JIT fix, but I guess verifier has to be modified
> as well to add BPF_PROBE_MEM32 to atomic instructions.
The JIT and the verifier changes for atomics might be too big.
Let's disable atomics in arena in the verifier for now.
Pls send a patch.
* Re: [PATCH bpf-next 1/2] bpf: Add arm64 JIT support for PROBE_MEM32 pseudo instructions.
2024-03-14 17:21 ` Alexei Starovoitov
@ 2024-03-15 10:31 ` Puranjay Mohan
0 siblings, 0 replies; 7+ messages in thread
From: Puranjay Mohan @ 2024-03-15 10:31 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Kumar Kartikeya Dwivedi, Alexei Starovoitov, Eduard Zingerman,
Daniel Borkmann, John Fastabend, Andrii Nakryiko,
Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf, LKML,
Catalin Marinas, Will Deacon, Zi Shen Lim, Xu Kuohai
On Thu, Mar 14, 2024 at 6:21 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Mar 14, 2024 at 10:13 AM Puranjay Mohan <puranjay12@gmail.com> wrote:
> >
> > Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> >
> > > On Thu, 14 Mar 2024 at 16:00, Puranjay Mohan <puranjay12@gmail.com> wrote:
> > >>
> > >> Add support for [LDX | STX | ST], PROBE_MEM32, [B | H | W | DW]
> > >> instructions. They are similar to PROBE_MEM instructions with the
> > >> following differences:
> > >> - PROBE_MEM32 supports store.
> > >> - PROBE_MEM32 relies on the verifier to clear upper 32-bit of the
> > >> src/dst register
> > >> - PROBE_MEM32 adds 64-bit kern_vm_start address (which is stored in R28
> > >> in the prologue). Due to bpf_arena constructions such R28 + reg +
> > >> off16 access is guaranteed to be within arena virtual range, so no
> > >> address check at run-time.
> > >> - PROBE_MEM32 allows STX and ST. If they fault the store is a nop. When
> > >> LDX faults the destination register is zeroed.
> > >>
> > >> To support these on arm64, we do tmp2 = R28 + src/dst reg and then use
> > >> tmp2 as the new src/dst register. This allows us to reuse most of the
> > >> code for normal [LDX | STX | ST].
> > >>
> > >> Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
> > >> ---
> > >
> > > Hi Alexei,
> > > Puranjay and I were discussing this stuff off list and noticed that
> > > atomic instructions are not handled.
> > > It turns out that will cause a kernel crash right now because the
> > > 32-bit offset into arena will be dereferenced directly.
> > >
> > > e.g. something like this:
> > >
> > > @@ -55,6 +56,7 @@ int arena_list_add(void *ctx)
> > > test_val++;
> > > n->value = i;
> > > arena_sum += i;
> > > + __sync_fetch_and_add(&arena_sum, 0);
> > > list_add_head(&n->node, list_head);
> > > }
> > > #else
> > >
> > > I will try to prepare a fix for the x86 JIT. Puranjay will do the same
> > > for his set.
> >
> > Yes, testing the change mentioned by Kumar on ARM64 causes a crashes as well:
> >
> > bpf_testmod: loading out-of-tree module taints kernel.
> > bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
> > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000010
> > Mem abort info:
> > ESR = 0x0000000096000006
> > EC = 0x25: DABT (current EL), IL = 32 bits
> > SET = 0, FnV = 0
> > EA = 0, S1PTW = 0
> > FSC = 0x06: level 2 translation fault
> > Data abort info:
> > ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
> > CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > user pgtable: 4k pages, 48-bit VAs, pgdp=00000004043cc000
> > [0000000000000010] pgd=0800000410d8f003, p4d=0800000410d8f003, pud=0800000405972003, pmd=0000000000000000
> > Internal error: Oops: 0000000096000006 [#1] SMP
> > Modules linked in: bpf_testmod(OE) nls_ascii nls_cp437 sunrpc vfat fat aes_ce_blk aes_ce_cipher ghash_ce sha1_ce button sch_fq_codel dm_mod dax configfs dmi_sysfs sha2_ce sha256_arm64 efivarfs
> > CPU: 8 PID: 5631 Comm: test_progs Tainted: G OE 6.8.0+ #2
> > Hardware name: Amazon EC2 c6g.16xlarge/, BIOS 1.0 11/1/2018
> > pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > pc : bpf_prog_8771c336cb6a18eb_arena_list_add+0x204/0x2b8
> > lr : bpf_prog_8771c336cb6a18eb_arena_list_add+0x144/0x2b8
> > sp : ffff80008b84bc30
> > x29: ffff80008b84bca0 x28: ffff8000a5008000 x27: ffff80008b84bc38
> > x26: 0000000000000000 x25: ffff80008b84bc60 x24: 0000000000000000
> > x23: 0000000000000000 x22: 0000000000000058 x21: 0000000000000838
> > x20: 0000000000000000 x19: 0000000100001fe0 x18: 0000000000000000
> > x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffcc66d2c8
> > x14: 0000000000000000 x13: 0000000000000000 x12: 000000000004058c
> > x11: ffff8000a5008010 x10: 00000000ffffffff x9 : 00000000000002cf
> > x8 : ffff800082ff4ab8 x7 : 0000000100001000 x6 : 0000000000000001
> > x5 : 0000000010e5e3fd x4 : 000000003619b978 x3 : 0000000000000010
> > x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000001fe0
> > Call trace:
> > bpf_prog_8771c336cb6a18eb_arena_list_add+0x204/0x2b8
> > bpf_prog_test_run_syscall+0x100/0x340
> > __sys_bpf+0x8e8/0xa20
> > __arm64_sys_bpf+0x2c/0x48
> > invoke_syscall+0x50/0x128
> > el0_svc_common.constprop.0+0x48/0xf8
> > do_el0_svc+0x28/0x40
> > el0_svc+0x58/0x190
> > el0t_64_sync_handler+0x13c/0x158
> > el0t_64_sync+0x1a8/0x1b0
> > Code: 8b010042 8b1c006b f9000162 d2800001 (f821307f)
> > ---[ end trace 0000000000000000 ]---
> > Kernel panic - not syncing: Oops: Fatal exception
> > SMP: stopping secondary CPUs
> > Kernel Offset: disabled
> > CPU features: 0x0,00000120,7002014a,21407a0b
> > Memory Limit: none
> > Rebooting in 5 seconds..
> >
> > I will send v2 with the arm64 JIT fix, but I guess verifier has to be modified
> > as well to add BPF_PROBE_MEM32 to atomic instructions.
>
> The JIT and the verifier changes for atomics might be too big.
> Let's disable atomics in arena in the verifier for now.
> Pls send a patch.
As atomics are now disabled in the arena, this series does not require
any changes.
Looking forward to the reviews.
Thanks,
Puranjay