* [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue
@ 2024-09-12 3:59 Yonghong Song
2024-09-12 3:59 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add tests for sdiv/smod overflow cases Yonghong Song
2024-09-12 18:17 ` [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue Andrii Nakryiko
0 siblings, 2 replies; 6+ messages in thread
From: Yonghong Song @ 2024-09-12 3:59 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
Martin KaFai Lau, Zac Ecob
Zac Ecob reported a problem [1] where a bpf program can cause a kernel crash
with the following error:
Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI
The failure is caused by the signed divide LLONG_MIN/-1, where LLONG_MIN
equals -9,223,372,036,854,775,808. Mathematically, LLONG_MIN/-1 is the
positive number 9,223,372,036,854,775,808, which cannot be represented:
the largest positive 64-bit signed number is 9,223,372,036,854,775,807.
On x86_64, LLONG_MIN/-1 raises a kernel exception. On arm64, the result
of LLONG_MIN/-1 is LLONG_MIN.
Further investigation found that all the following sdiv/smod cases may trigger
an exception when a bpf program runs on the x86_64 platform:
- LLONG_MIN/-1 for 64bit operation
- INT_MIN/-1 for 32bit operation
- LLONG_MIN%-1 for 64bit operation
- INT_MIN%-1 for 32bit operation
where -1 can be an immediate or in a register.
On arm64, there are no exceptions:
- LLONG_MIN/-1 = LLONG_MIN
- INT_MIN/-1 = INT_MIN
- LLONG_MIN%-1 = 0
- INT_MIN%-1 = 0
where -1 can be an immediate or in a register.
Insn patching is needed to handle the above cases, and the patched code
produces results aligned with the arm64 results above.
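For illustration only (this sketch is mine, not part of the patch, and the
helper names are hypothetical), the arm64-aligned semantics above can be
modeled in plain C, using unsigned arithmetic so the negation wraps without
undefined behavior:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical model of the intended BPF sdiv/smod semantics:
 * divisors 0 and -1 are special-cased so no CPU exception can occur. */
static long long safe_sdiv64(long long dividend, long long divisor)
{
	if (divisor == 0)
		return 0;                  /* BPF: x s/ 0 -> 0 */
	if (divisor == -1)                 /* wrapping negate: LLONG_MIN stays LLONG_MIN */
		return (long long)(0 - (unsigned long long)dividend);
	return dividend / divisor;         /* no other divisor can overflow */
}

static long long safe_smod64(long long dividend, long long divisor)
{
	if (divisor == 0)
		return dividend;           /* BPF: x s% 0 -> x */
	if (divisor == -1)
		return 0;                  /* LLONG_MIN s% -1 -> 0 */
	return dividend % divisor;
}
```

Any divisor other than 0 and -1 cannot overflow a signed divide, so the
fallthrough native division is safe.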
[1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/
Reported-by: Zac Ecob <zacecob@protonmail.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
kernel/bpf/verifier.c | 84 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 80 insertions(+), 4 deletions(-)
Changelogs:
v1 -> v2:
- Handle more crash cases like 32bit operations and modulo operations.
- Add more tests to cover the new cases.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f35b80c16cda..ad7f51302c70 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -20499,13 +20499,46 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
/* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */
insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code);
- /* Make divide-by-zero exceptions impossible. */
+ /* Make sdiv/smod divide-by-minus-one exceptions impossible. */
+ if ((insn->code == (BPF_ALU64 | BPF_MOD | BPF_K) ||
+ insn->code == (BPF_ALU64 | BPF_DIV | BPF_K) ||
+ insn->code == (BPF_ALU | BPF_MOD | BPF_K) ||
+ insn->code == (BPF_ALU | BPF_DIV | BPF_K)) &&
+ insn->off == 1 && insn->imm == -1) {
+ bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
+ bool isdiv = BPF_OP(insn->code) == BPF_DIV;
+ struct bpf_insn *patchlet;
+ struct bpf_insn chk_and_div[] = {
+ BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) |
+ BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg,
+ 0, 0, 0),
+ };
+ struct bpf_insn chk_and_mod[] = {
+ BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
+ };
+
+ patchlet = isdiv ? chk_and_div : chk_and_mod;
+ cnt = isdiv ? ARRAY_SIZE(chk_and_div) : ARRAY_SIZE(chk_and_mod);
+
+ new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt);
+ if (!new_prog)
+ return -ENOMEM;
+
+ delta += cnt - 1;
+ env->prog = prog = new_prog;
+ insn = new_prog->insnsi + i + delta;
+ goto next_insn;
+ }
+
+ /* Make divide-by-zero and divide-by-minus-one exceptions impossible. */
if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) ||
insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) ||
insn->code == (BPF_ALU | BPF_MOD | BPF_X) ||
insn->code == (BPF_ALU | BPF_DIV | BPF_X)) {
bool is64 = BPF_CLASS(insn->code) == BPF_ALU64;
bool isdiv = BPF_OP(insn->code) == BPF_DIV;
+ bool is_sdiv = isdiv && insn->off == 1;
+ bool is_smod = !isdiv && insn->off == 1;
struct bpf_insn *patchlet;
struct bpf_insn chk_and_div[] = {
/* [R,W]x div 0 -> 0 */
@@ -20525,10 +20558,53 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
BPF_JMP_IMM(BPF_JA, 0, 0, 1),
BPF_MOV32_REG(insn->dst_reg, insn->dst_reg),
};
+ struct bpf_insn chk_and_sdiv[] = {
+ /* [R,W]x sdiv 0 -> 0 */
+ BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
+ BPF_JNE | BPF_K, insn->src_reg,
+ 0, 2, 0),
+ BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
+ BPF_JMP_IMM(BPF_JA, 0, 0, 4),
+ /* LLONG_MIN sdiv -1 -> LLONG_MIN
+ * INT_MIN sdiv -1 -> INT_MIN
+ */
+ BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
+ BPF_JNE | BPF_K, insn->src_reg,
+ 0, 2, -1),
+ /* BPF_NEG(LLONG_MIN) == -LLONG_MIN == LLONG_MIN */
+ BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) |
+ BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg,
+ 0, 0, 0),
+ BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+ *insn,
+ };
+ struct bpf_insn chk_and_smod[] = {
+ /* [R,W]x mod 0 -> [R,W]x */
+ BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
+ BPF_JNE | BPF_K, insn->src_reg,
+ 0, 2, 0),
+ BPF_MOV32_REG(insn->dst_reg, insn->dst_reg),
+ BPF_JMP_IMM(BPF_JA, 0, 0, 4),
+ /* [R,W]x mod -1 -> 0 */
+ BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
+ BPF_JNE | BPF_K, insn->src_reg,
+ 0, 2, -1),
+ BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
+ BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+ *insn,
+ };
- patchlet = isdiv ? chk_and_div : chk_and_mod;
- cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
- ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
+ if (is_sdiv) {
+ patchlet = chk_and_sdiv;
+ cnt = ARRAY_SIZE(chk_and_sdiv);
+ } else if (is_smod) {
+ patchlet = chk_and_smod;
+ cnt = ARRAY_SIZE(chk_and_smod);
+ } else {
+ patchlet = isdiv ? chk_and_div : chk_and_mod;
+ cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
+ ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
+ }
new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt);
if (!new_prog)
--
2.43.5
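To make the first hunk concrete (a hedged sketch of mine, not kernel code;
function names are made up): when the -1 divisor is an immediate, the
verifier drops the division entirely, leaving a single wrapping NEG for sdiv
or a zeroing 32-bit XOR for smod:

```c
#include <assert.h>
#include <limits.h>

/* Model of the constant-divisor rewrite: `rX s/= -1` becomes BPF_NEG,
 * and `rX s%= -1` becomes `wX ^= wX`, which zeroes all 64 bits. */
static long long sdiv_by_neg1(long long rx)
{
	/* wrapping negation: -LLONG_MIN wraps back to LLONG_MIN */
	return (long long)(0 - (unsigned long long)rx);
}

static long long smod_by_neg1(long long rx)
{
	(void)rx;      /* the result is 0 for every dividend */
	return 0;
}
```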
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH bpf-next v2 2/2] selftests/bpf: Add tests for sdiv/smod overflow cases 2024-09-12 3:59 [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue Yonghong Song @ 2024-09-12 3:59 ` Yonghong Song 2024-09-12 18:17 ` [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue Andrii Nakryiko 1 sibling, 0 replies; 6+ messages in thread From: Yonghong Song @ 2024-09-12 3:59 UTC (permalink / raw) To: bpf Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau Subtests are added to exercise the patched code which handles - LLONG_MIN/-1 - INT_MIN/-1 - LLONG_MIN%-1 - INT_MIN%-1 where -1 could be an immediate or in a register. Without the previous patch, all these cases will crash the kernel on x86_64 platform. Additional tests are added to use small values (e.g. -5/-1, 5%-1, etc.) in order to exercise the additional logic with patched insns. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> --- .../selftests/bpf/progs/verifier_sdiv.c | 431 ++++++++++++++++++ 1 file changed, 431 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/verifier_sdiv.c b/tools/testing/selftests/bpf/progs/verifier_sdiv.c index 2a2271cf0294..eebe34fecc4d 100644 --- a/tools/testing/selftests/bpf/progs/verifier_sdiv.c +++ b/tools/testing/selftests/bpf/progs/verifier_sdiv.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include <linux/bpf.h> +#include <limits.h> #include <bpf/bpf_helpers.h> #include "bpf_misc.h" @@ -770,6 +771,436 @@ __naked void smod64_zero_divisor(void) " ::: __clobber_all); } +SEC("socket") +__description("SDIV64, overflow r/r, LLONG_MIN/-1") +__success __retval(1) +__arch_x86_64 +__xlated("0: r2 = 0x8000000000000000") +__xlated("2: r3 = -1") +__xlated("3: r4 = r2") +__xlated("4: if r3 != 0x0 goto pc+2") +__xlated("5: w2 ^= w2") +__xlated("6: goto pc+4") +__xlated("7: if r3 != 0xffffffff goto pc+2") +__xlated("8: r2 = -r2") +__xlated("9: goto pc+1") +__xlated("10: r2 
s/= r3") +__xlated("11: r0 = 0") +__xlated("12: if r2 != r4 goto pc+1") +__xlated("13: r0 = 1") +__xlated("14: exit") +__naked void sdiv64_overflow_rr(void) +{ + asm volatile (" \ + r2 = %[llong_min] ll; \ + r3 = -1; \ + r4 = r2; \ + r2 s/= r3; \ + r0 = 0; \ + if r2 != r4 goto +1; \ + r0 = 1; \ + exit; \ +" : + : __imm_const(llong_min, LLONG_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SDIV64, r/r, small_val/-1") +__success __retval(-5) +__arch_x86_64 +__xlated("0: r2 = 5") +__xlated("1: r3 = -1") +__xlated("2: if r3 != 0x0 goto pc+2") +__xlated("3: w2 ^= w2") +__xlated("4: goto pc+4") +__xlated("5: if r3 != 0xffffffff goto pc+2") +__xlated("6: r2 = -r2") +__xlated("7: goto pc+1") +__xlated("8: r2 s/= r3") +__xlated("9: r0 = r2") +__xlated("10: exit") +__naked void sdiv64_rr_divisor_neg_1(void) +{ + asm volatile (" \ + r2 = 5; \ + r3 = -1; \ + r2 s/= r3; \ + r0 = r2; \ + exit; \ +" : + : + : __clobber_all); +} + +SEC("socket") +__description("SDIV64, overflow r/i, LLONG_MIN/-1") +__success __retval(1) +__arch_x86_64 +__xlated("0: r2 = 0x8000000000000000") +__xlated("2: r4 = r2") +__xlated("3: r2 = -r2") +__xlated("4: r0 = 0") +__xlated("5: if r2 != r4 goto pc+1") +__xlated("6: r0 = 1") +__xlated("7: exit") +__naked void sdiv64_overflow_ri(void) +{ + asm volatile (" \ + r2 = %[llong_min] ll; \ + r4 = r2; \ + r2 s/= -1; \ + r0 = 0; \ + if r2 != r4 goto +1; \ + r0 = 1; \ + exit; \ +" : + : __imm_const(llong_min, LLONG_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SDIV64, r/i, small_val/-1") +__success __retval(-5) +__arch_x86_64 +__xlated("0: r2 = 5") +__xlated("1: r4 = r2") +__xlated("2: r2 = -r2") +__xlated("3: r0 = r2") +__xlated("4: exit") +__naked void sdiv64_ri_divisor_neg_1(void) +{ + asm volatile (" \ + r2 = 5; \ + r4 = r2; \ + r2 s/= -1; \ + r0 = r2; \ + exit; \ +" : + : + : __clobber_all); +} + +SEC("socket") +__description("SDIV32, overflow r/r, INT_MIN/-1") +__success __retval(1) +__arch_x86_64 +__xlated("0: w2 = 
-2147483648") +__xlated("1: w3 = -1") +__xlated("2: w4 = w2") +__xlated("3: if w3 != 0x0 goto pc+2") +__xlated("4: w2 ^= w2") +__xlated("5: goto pc+4") +__xlated("6: if w3 != 0xffffffff goto pc+2") +__xlated("7: w2 = -w2") +__xlated("8: goto pc+1") +__xlated("9: w2 s/= w3") +__xlated("10: r0 = 0") +__xlated("11: if w2 != w4 goto pc+1") +__xlated("12: r0 = 1") +__xlated("13: exit") +__naked void sdiv32_overflow_rr(void) +{ + asm volatile (" \ + w2 = %[int_min]; \ + w3 = -1; \ + w4 = w2; \ + w2 s/= w3; \ + r0 = 0; \ + if w2 != w4 goto +1; \ + r0 = 1; \ + exit; \ +" : + : __imm_const(int_min, INT_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SDIV32, r/r, small_val/-1") +__success __retval(5) +__arch_x86_64 +__xlated("0: w2 = -5") +__xlated("1: w3 = -1") +__xlated("2: w4 = w2") +__xlated("3: if w3 != 0x0 goto pc+2") +__xlated("4: w2 ^= w2") +__xlated("5: goto pc+4") +__xlated("6: if w3 != 0xffffffff goto pc+2") +__xlated("7: w2 = -w2") +__xlated("8: goto pc+1") +__xlated("9: w2 s/= w3") +__xlated("10: w0 = w2") +__xlated("11: exit") +__naked void sdiv32_rr_divisor_neg_1(void) +{ + asm volatile (" \ + w2 = -5; \ + w3 = -1; \ + w4 = w2; \ + w2 s/= w3; \ + w0 = w2; \ + exit; \ +" : + : + : __clobber_all); +} + +SEC("socket") +__description("SDIV32, overflow r/i, INT_MIN/-1") +__success __retval(1) +__arch_x86_64 +__xlated("0: w2 = -2147483648") +__xlated("1: w4 = w2") +__xlated("2: w2 = -w2") +__xlated("3: r0 = 0") +__xlated("4: if w2 != w4 goto pc+1") +__xlated("5: r0 = 1") +__xlated("6: exit") +__naked void sdiv32_overflow_ri(void) +{ + asm volatile (" \ + w2 = %[int_min]; \ + w4 = w2; \ + w2 s/= -1; \ + r0 = 0; \ + if w2 != w4 goto +1; \ + r0 = 1; \ + exit; \ +" : + : __imm_const(int_min, INT_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SDIV32, r/i, small_val/-1") +__success __retval(-5) +__arch_x86_64 +__xlated("0: w2 = 5") +__xlated("1: w4 = w2") +__xlated("2: w2 = -w2") +__xlated("3: w0 = w2") +__xlated("4: exit") +__naked void 
sdiv32_ri_divisor_neg_1(void) +{ + asm volatile (" \ + w2 = 5; \ + w4 = w2; \ + w2 s/= -1; \ + w0 = w2; \ + exit; \ +" : + : + : __clobber_all); +} + +SEC("socket") +__description("SMOD64, overflow r/r, LLONG_MIN/-1") +__success __retval(0) +__arch_x86_64 +__xlated("0: r2 = 0x8000000000000000") +__xlated("2: r3 = -1") +__xlated("3: r4 = r2") +__xlated("4: if r3 != 0x0 goto pc+2") +__xlated("5: w2 = w2") +__xlated("6: goto pc+4") +__xlated("7: if r3 != 0xffffffff goto pc+2") +__xlated("8: w2 ^= w2") +__xlated("9: goto pc+1") +__xlated("10: r2 s%= r3") +__xlated("11: r0 = r2") +__xlated("12: exit") +__naked void smod64_overflow_rr(void) +{ + asm volatile (" \ + r2 = %[llong_min] ll; \ + r3 = -1; \ + r4 = r2; \ + r2 s%%= r3; \ + r0 = r2; \ + exit; \ +" : + : __imm_const(llong_min, LLONG_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SMOD64, r/r, small_val/-1") +__success __retval(0) +__arch_x86_64 +__xlated("0: r2 = 5") +__xlated("1: r3 = -1") +__xlated("2: r4 = r2") +__xlated("3: if r3 != 0x0 goto pc+2") +__xlated("4: w2 = w2") +__xlated("5: goto pc+4") +__xlated("6: if r3 != 0xffffffff goto pc+2") +__xlated("7: w2 ^= w2") +__xlated("8: goto pc+1") +__xlated("9: r2 s%= r3") +__xlated("10: r0 = r2") +__xlated("11: exit") +__naked void smod64_rr_divisor_neg_1(void) +{ + asm volatile (" \ + r2 = 5; \ + r3 = -1; \ + r4 = r2; \ + r2 s%%= r3; \ + r0 = r2; \ + exit; \ +" : + : + : __clobber_all); +} + +SEC("socket") +__description("SMOD64, overflow r/i, LLONG_MIN/-1") +__success __retval(0) +__arch_x86_64 +__xlated("0: r2 = 0x8000000000000000") +__xlated("2: r4 = r2") +__xlated("3: w2 ^= w2") +__xlated("4: r0 = r2") +__xlated("5: exit") +__naked void smod64_overflow_ri(void) +{ + asm volatile (" \ + r2 = %[llong_min] ll; \ + r4 = r2; \ + r2 s%%= -1; \ + r0 = r2; \ + exit; \ +" : + : __imm_const(llong_min, LLONG_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SMOD64, r/i, small_val/-1") +__success __retval(0) +__arch_x86_64 +__xlated("0: r2 = 5") 
+__xlated("1: r4 = r2") +__xlated("2: w2 ^= w2") +__xlated("3: r0 = r2") +__xlated("4: exit") +__naked void smod64_ri_divisor_neg_1(void) +{ + asm volatile (" \ + r2 = 5; \ + r4 = r2; \ + r2 s%%= -1; \ + r0 = r2; \ + exit; \ +" : + : + : __clobber_all); +} + +SEC("socket") +__description("SMOD32, overflow r/r, INT_MIN/-1") +__success __retval(0) +__arch_x86_64 +__xlated("0: w2 = -2147483648") +__xlated("1: w3 = -1") +__xlated("2: w4 = w2") +__xlated("3: if w3 != 0x0 goto pc+2") +__xlated("4: w2 = w2") +__xlated("5: goto pc+4") +__xlated("6: if w3 != 0xffffffff goto pc+2") +__xlated("7: w2 ^= w2") +__xlated("8: goto pc+1") +__xlated("9: w2 s%= w3") +__xlated("10: r0 = r2") +__xlated("11: exit") +__naked void smod32_overflow_rr(void) +{ + asm volatile (" \ + w2 = %[int_min]; \ + w3 = -1; \ + w4 = w2; \ + w2 s%%= w3; \ + r0 = r2; \ + exit; \ +" : + : __imm_const(int_min, INT_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SMOD32, r/r, small_val/-1") +__success __retval(0) +__arch_x86_64 +__xlated("0: w2 = -5") +__xlated("1: w3 = -1") +__xlated("2: w4 = w2") +__xlated("3: if w3 != 0x0 goto pc+2") +__xlated("4: w2 = w2") +__xlated("5: goto pc+4") +__xlated("6: if w3 != 0xffffffff goto pc+2") +__xlated("7: w2 ^= w2") +__xlated("8: goto pc+1") +__xlated("9: w2 s%= w3") +__xlated("10: r0 = r2") +__xlated("11: exit") +__naked void smod32_rr_divisor_neg_1(void) +{ + asm volatile (" \ + w2 = -5; \ + w3 = -1; \ + w4 = w2; \ + w2 s%%= w3; \ + r0 = r2; \ + exit; \ +" : + : + : __clobber_all); +} + +SEC("socket") +__description("SMOD32, overflow r/i, INT_MIN/-1") +__success __retval(0) +__arch_x86_64 +__xlated("0: w2 = -2147483648") +__xlated("1: w4 = w2") +__xlated("2: w2 ^= w2") +__xlated("3: r0 = r2") +__xlated("4: exit") +__naked void smod32_overflow_ri(void) +{ + asm volatile (" \ + w2 = %[int_min]; \ + w4 = w2; \ + w2 s%%= -1; \ + r0 = r2; \ + exit; \ +" : + : __imm_const(int_min, INT_MIN) + : __clobber_all); +} + +SEC("socket") +__description("SMOD32, r/i, 
small_val/-1")
+__success __retval(0)
+__arch_x86_64
+__xlated("0: w2 = 5")
+__xlated("1: w4 = w2")
+__xlated("2: w2 ^= w2")
+__xlated("3: w0 = w2")
+__xlated("4: exit")
+__naked void smod32_ri_divisor_neg_1(void)
+{
+	asm volatile ("					\
+	w2 = 5;						\
+	w4 = w2;					\
+	w2 s%%= -1;					\
+	w0 = w2;					\
+	exit;						\
+"	:
+	:
+	: __clobber_all);
+}
+
 #else
 SEC("socket")
--
2.43.5

^ permalink raw reply related	[flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue 2024-09-12 3:59 [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue Yonghong Song 2024-09-12 3:59 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add tests for sdiv/smod overflow cases Yonghong Song @ 2024-09-12 18:17 ` Andrii Nakryiko 2024-09-12 18:19 ` Andrii Nakryiko 2024-09-12 22:53 ` Yonghong Song 1 sibling, 2 replies; 6+ messages in thread From: Andrii Nakryiko @ 2024-09-12 18:17 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Zac Ecob On Wed, Sep 11, 2024 at 9:00 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > Zac Ecob reported a problem where a bpf program may cause kernel crash due > to the following error: > Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI > > The failure is due to the below signed divide: > LLONG_MIN/-1 where LLONG_MIN equals to -9,223,372,036,854,775,808. > LLONG_MIN/-1 is supposed to give a positive number 9,223,372,036,854,775,808, > but it is impossible since for 64-bit system, the maximum positive > number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will > cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is > LLONG_MIN. > > Further investigation found all the following sdiv/smod cases may trigger > an exception when bpf program is running on x86_64 platform: > - LLONG_MIN/-1 for 64bit operation > - INT_MIN/-1 for 32bit operation > - LLONG_MIN%-1 for 64bit operation > - INT_MIN%-1 for 32bit operation > where -1 can be an immediate or in a register. > > On arm64, there are no exceptions: > - LLONG_MIN/-1 = LLONG_MIN > - INT_MIN/-1 = INT_MIN > - LLONG_MIN%-1 = 0 > - INT_MIN%-1 = 0 > where -1 can be an immediate or in a register. > > Insn patching is needed to handle the above cases and the patched codes > produced results aligned with above arm64 result. 
> > [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/ > > Reported-by: Zac Ecob <zacecob@protonmail.com> > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > --- > kernel/bpf/verifier.c | 84 ++++++++++++++++++++++++++++++++++++++++--- > 1 file changed, 80 insertions(+), 4 deletions(-) > > Changelogs: > v1 -> v2: > - Handle more crash cases like 32bit operation and modules. > - Add more tests to test new cases. > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index f35b80c16cda..ad7f51302c70 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -20499,13 +20499,46 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */ > insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code); > > - /* Make divide-by-zero exceptions impossible. */ > + /* Make sdiv/smod divide-by-minus-one exceptions impossible. */ > + if ((insn->code == (BPF_ALU64 | BPF_MOD | BPF_K) || > + insn->code == (BPF_ALU64 | BPF_DIV | BPF_K) || > + insn->code == (BPF_ALU | BPF_MOD | BPF_K) || > + insn->code == (BPF_ALU | BPF_DIV | BPF_K)) && > + insn->off == 1 && insn->imm == -1) { > + bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; > + bool isdiv = BPF_OP(insn->code) == BPF_DIV; > + struct bpf_insn *patchlet; > + struct bpf_insn chk_and_div[] = { > + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | > + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, > + 0, 0, 0), > + }; > + struct bpf_insn chk_and_mod[] = { > + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), > + }; > + > + patchlet = isdiv ? chk_and_div : chk_and_mod; nit: "chk_and_" part in the name is misleading, it's more like "safe_div" and "safe_mod". Oh, and it's "sdiv" and "smod" specific, so probably not a bad idea to have that in the name as well. > + cnt = isdiv ? 
ARRAY_SIZE(chk_and_div) : ARRAY_SIZE(chk_and_mod); > + > + new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt); > + if (!new_prog) > + return -ENOMEM; > + > + delta += cnt - 1; > + env->prog = prog = new_prog; > + insn = new_prog->insnsi + i + delta; > + goto next_insn; > + } > + > + /* Make divide-by-zero and divide-by-minus-one exceptions impossible. */ > if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) || > insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) || > insn->code == (BPF_ALU | BPF_MOD | BPF_X) || > insn->code == (BPF_ALU | BPF_DIV | BPF_X)) { > bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; > bool isdiv = BPF_OP(insn->code) == BPF_DIV; > + bool is_sdiv = isdiv && insn->off == 1; > + bool is_smod = !isdiv && insn->off == 1; > struct bpf_insn *patchlet; > struct bpf_insn chk_and_div[] = { > /* [R,W]x div 0 -> 0 */ > @@ -20525,10 +20558,53 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > BPF_JMP_IMM(BPF_JA, 0, 0, 1), > BPF_MOV32_REG(insn->dst_reg, insn->dst_reg), > }; > + struct bpf_insn chk_and_sdiv[] = { > + /* [R,W]x sdiv 0 -> 0 */ > + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | > + BPF_JNE | BPF_K, insn->src_reg, > + 0, 2, 0), > + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), > + BPF_JMP_IMM(BPF_JA, 0, 0, 4), > + /* LLONG_MIN sdiv -1 -> LLONG_MIN > + * INT_MIN sdiv -1 -> INT_MIN > + */ > + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | > + BPF_JNE | BPF_K, insn->src_reg, > + 0, 2, -1), > + /* BPF_NEG(LLONG_MIN) == -LLONG_MIN == LLONG_MIN */ > + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | > + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, > + 0, 0, 0), > + BPF_JMP_IMM(BPF_JA, 0, 0, 1), I don't know how much it actually matters, but it feels like common safe case should be as straight-line-executed as possible, no? 
So maybe it's better to rearrange to roughly this (where rX is the
divisor register):

if rX == 0 goto L1
if rX == -1 goto L2
rY /= rX
goto L3
L1: /* zero case */
rY = 0 /* fallthrough, negation doesn't hurt, but less jumping */
L2: /* negative one case (or zero) */
rY = -rY
L3:
... the rest of the program code ...

Those two branches for common case are still annoyingly inefficient, I
wonder if we should do

rX += 1 /* [-1, 0] -> [0, 1] */
if rX <=(unsigned) 1 goto L1
rX -= 1 /* restore original divisor */
rY /= rX /* common case */
goto L3
L1:
if rX == 0 goto L2 /* jump if originally -1 */
rY = 0 /* division by zero case */
L2: /* fallthrough */
rY = -rY
rX -= 1 /* restore original divisor */
L3:
... continue with the rest ...

It's a bit trickier to follow, but should be faster in a common case.

WDYT? Too much too far?

> +			*insn,
> +		};
> +		struct bpf_insn chk_and_smod[] = {
> +			/* [R,W]x mod 0 -> [R,W]x */
> +			BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
> +				     BPF_JNE | BPF_K, insn->src_reg,
> +				     0, 2, 0),
> +			BPF_MOV32_REG(insn->dst_reg, insn->dst_reg),
> +			BPF_JMP_IMM(BPF_JA, 0, 0, 4),
> +			/* [R,W]x mod -1 -> 0 */
> +			BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
> +				     BPF_JNE | BPF_K, insn->src_reg,
> +				     0, 2, -1),
> +			BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
> +			BPF_JMP_IMM(BPF_JA, 0, 0, 1),
> +			*insn,
> +		};
>

Same idea here, keep the common case as straight as possible.

> -		patchlet = isdiv ? chk_and_div : chk_and_mod;
> -		cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
> -			      ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
> +		if (is_sdiv) {
> +			patchlet = chk_and_sdiv;
> +			cnt = ARRAY_SIZE(chk_and_sdiv);
> +		} else if (is_smod) {
> +			patchlet = chk_and_smod;
> +			cnt = ARRAY_SIZE(chk_and_smod);
> +		} else {
> +			patchlet = isdiv ? chk_and_div : chk_and_mod;
> +			cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
> +				      ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
> +		}
>
>  		new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt);
>  		if (!new_prog)
> --
> 2.43.5
>

^ permalink raw reply	[flat|nested] 6+ messages in thread
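A plain-C model (my own sketch; names hypothetical) of the branch-reduction
idea proposed above: adding 1 to the divisor maps the special values {-1, 0}
onto {0, 1}, so a single unsigned comparison steers the common case through
straight-line code:

```c
#include <assert.h>
#include <limits.h>

/* ry is the dividend, rx the divisor; mirrors the proposed patchlet. */
static long long sdiv_one_branch(long long ry, long long rx)
{
	/* divisor in {-1, 0}? one unsigned compare after the +1 shift */
	if ((unsigned long long)rx + 1 <= 1) {
		if (rx == 0)
			ry = 0;    /* sdiv by zero -> 0; -0 is still 0 */
		/* wrapping negate; LLONG_MIN stays LLONG_MIN as on arm64 */
		return (long long)(0 - (unsigned long long)ry);
	}
	return ry / rx;            /* common case: straight-line divide */
}
```

The fallthrough negation for the zero case is harmless (-0 == 0), which is
exactly why the proposal lets L1 fall through into L2.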
* Re: [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue 2024-09-12 18:17 ` [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue Andrii Nakryiko @ 2024-09-12 18:19 ` Andrii Nakryiko 2024-09-12 22:53 ` Yonghong Song 1 sibling, 0 replies; 6+ messages in thread From: Andrii Nakryiko @ 2024-09-12 18:19 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Zac Ecob On Thu, Sep 12, 2024 at 11:17 AM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Wed, Sep 11, 2024 at 9:00 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > > Zac Ecob reported a problem where a bpf program may cause kernel crash due > > to the following error: > > Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI > > > > The failure is due to the below signed divide: > > LLONG_MIN/-1 where LLONG_MIN equals to -9,223,372,036,854,775,808. > > LLONG_MIN/-1 is supposed to give a positive number 9,223,372,036,854,775,808, > > but it is impossible since for 64-bit system, the maximum positive > > number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will > > cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is > > LLONG_MIN. > > > > Further investigation found all the following sdiv/smod cases may trigger > > an exception when bpf program is running on x86_64 platform: > > - LLONG_MIN/-1 for 64bit operation > > - INT_MIN/-1 for 32bit operation > > - LLONG_MIN%-1 for 64bit operation > > - INT_MIN%-1 for 32bit operation > > where -1 can be an immediate or in a register. > > > > On arm64, there are no exceptions: > > - LLONG_MIN/-1 = LLONG_MIN > > - INT_MIN/-1 = INT_MIN > > - LLONG_MIN%-1 = 0 > > - INT_MIN%-1 = 0 > > where -1 can be an immediate or in a register. > > > > Insn patching is needed to handle the above cases and the patched codes > > produced results aligned with above arm64 result. 
> > > > [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/ > > > > Reported-by: Zac Ecob <zacecob@protonmail.com> > > Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > > --- > > kernel/bpf/verifier.c | 84 ++++++++++++++++++++++++++++++++++++++++--- > > 1 file changed, 80 insertions(+), 4 deletions(-) > > > > Changelogs: > > v1 -> v2: > > - Handle more crash cases like 32bit operation and modules. > > - Add more tests to test new cases. > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > > index f35b80c16cda..ad7f51302c70 100644 > > --- a/kernel/bpf/verifier.c > > +++ b/kernel/bpf/verifier.c > > @@ -20499,13 +20499,46 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > > /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */ > > insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code); > > > > - /* Make divide-by-zero exceptions impossible. */ > > + /* Make sdiv/smod divide-by-minus-one exceptions impossible. */ > > + if ((insn->code == (BPF_ALU64 | BPF_MOD | BPF_K) || > > + insn->code == (BPF_ALU64 | BPF_DIV | BPF_K) || > > + insn->code == (BPF_ALU | BPF_MOD | BPF_K) || > > + insn->code == (BPF_ALU | BPF_DIV | BPF_K)) && > > + insn->off == 1 && insn->imm == -1) { > > + bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; > > + bool isdiv = BPF_OP(insn->code) == BPF_DIV; > > + struct bpf_insn *patchlet; > > + struct bpf_insn chk_and_div[] = { > > + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | > > + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, > > + 0, 0, 0), > > + }; > > + struct bpf_insn chk_and_mod[] = { > > + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), > > + }; > > + > > + patchlet = isdiv ? chk_and_div : chk_and_mod; > > nit: "chk_and_" part in the name is misleading, it's more like > "safe_div" and "safe_mod". 
Oh, and it's "sdiv" and "smod" specific, so > probably not a bad idea to have that in the name as well. > > > + cnt = isdiv ? ARRAY_SIZE(chk_and_div) : ARRAY_SIZE(chk_and_mod); > > + > > + new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt); > > + if (!new_prog) > > + return -ENOMEM; > > + > > + delta += cnt - 1; > > + env->prog = prog = new_prog; > > + insn = new_prog->insnsi + i + delta; > > + goto next_insn; > > + } > > + > > + /* Make divide-by-zero and divide-by-minus-one exceptions impossible. */ > > if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) || > > insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) || > > insn->code == (BPF_ALU | BPF_MOD | BPF_X) || > > insn->code == (BPF_ALU | BPF_DIV | BPF_X)) { > > bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; > > bool isdiv = BPF_OP(insn->code) == BPF_DIV; > > + bool is_sdiv = isdiv && insn->off == 1; > > + bool is_smod = !isdiv && insn->off == 1; > > struct bpf_insn *patchlet; > > struct bpf_insn chk_and_div[] = { > > /* [R,W]x div 0 -> 0 */ > > @@ -20525,10 +20558,53 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > > BPF_JMP_IMM(BPF_JA, 0, 0, 1), > > BPF_MOV32_REG(insn->dst_reg, insn->dst_reg), > > }; > > + struct bpf_insn chk_and_sdiv[] = { > > + /* [R,W]x sdiv 0 -> 0 */ > > + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | > > + BPF_JNE | BPF_K, insn->src_reg, > > + 0, 2, 0), > > + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), > > + BPF_JMP_IMM(BPF_JA, 0, 0, 4), > > + /* LLONG_MIN sdiv -1 -> LLONG_MIN > > + * INT_MIN sdiv -1 -> INT_MIN > > + */ > > + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | > > + BPF_JNE | BPF_K, insn->src_reg, > > + 0, 2, -1), > > + /* BPF_NEG(LLONG_MIN) == -LLONG_MIN == LLONG_MIN */ > > + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | > > + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, > > + 0, 0, 0), > > + BPF_JMP_IMM(BPF_JA, 0, 0, 1), > > I don't know how much it actually matters, but it feels like common > safe case should be as straight-line-executed as possible, no? 
>
> So maybe it's better to rearrange to roughly this (where rX is the
> divisor register):
>
> if rX == 0 goto L1
> if rX == -1 goto L2
> rY /= rX
> goto L3
> L1: /* zero case */
> rY = 0 /* fallthrough, negation doesn't hurt, but less jumping */
> L2: /* negative one case (or zero) */
> rY = -rY
> L3:
> ... the rest of the program code ...
>
>
> Those two branches for common case are still annoyingly inefficient, I
> wonder if we should do
>
> rX += 1 /* [-1, 0] -> [0, 1] */
> if rX <=(unsigned) 1 goto L1
> rX -= 1 /* restore original divisor */
> rY /= rX /* common case */
> goto L3
> L1:
> if rX == 0 goto L2 /* jump if originally -1 */
> rY = 0 /* division by zero case */
> L2: /* fallthrough */
> rY = -rY
> rX -= 1 /* restore original divisor */
> L3:
> ... continue with the rest ...

hmm.. just in case rX is the same register as rY, probably best to
restore rX early right at L1: label (and adjust `if rX == 0 goto L2`
into `if rX != 0 goto L2`).

>
>
> It's a bit trickier to follow, but should be faster in a common case.
>
> WDYT? Too much too far?
>
>
> > +			*insn,
> > +		};
> > +		struct bpf_insn chk_and_smod[] = {
> > +			/* [R,W]x mod 0 -> [R,W]x */
> > +			BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
> > +				     BPF_JNE | BPF_K, insn->src_reg,
> > +				     0, 2, 0),
> > +			BPF_MOV32_REG(insn->dst_reg, insn->dst_reg),
> > +			BPF_JMP_IMM(BPF_JA, 0, 0, 4),
> > +			/* [R,W]x mod -1 -> 0 */
> > +			BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
> > +				     BPF_JNE | BPF_K, insn->src_reg,
> > +				     0, 2, -1),
> > +			BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
> > +			BPF_JMP_IMM(BPF_JA, 0, 0, 1),
> > +			*insn,
> > +		};
> >
>
> Same idea here, keep the common case as straight as possible.
>
> > -		patchlet = isdiv ? chk_and_div : chk_and_mod;
> > -		cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
> > -			      ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
> > +		if (is_sdiv) {
> > +			patchlet = chk_and_sdiv;
> > +			cnt = ARRAY_SIZE(chk_and_sdiv);
> > +		} else if (is_smod) {
> > +			patchlet = chk_and_smod;
> > +			cnt = ARRAY_SIZE(chk_and_smod);
> > +		} else {
> > +			patchlet = isdiv ? chk_and_div : chk_and_mod;
> > +			cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
> > +				      ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
> > +		}
> >
> >  		new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt);
> >  		if (!new_prog)
> > --
> > 2.43.5
> >

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue 2024-09-12 18:17 ` [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue Andrii Nakryiko 2024-09-12 18:19 ` Andrii Nakryiko @ 2024-09-12 22:53 ` Yonghong Song 2024-09-13 2:00 ` Andrii Nakryiko 1 sibling, 1 reply; 6+ messages in thread From: Yonghong Song @ 2024-09-12 22:53 UTC (permalink / raw) To: Andrii Nakryiko Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Zac Ecob On 9/12/24 11:17 AM, Andrii Nakryiko wrote: > On Wed, Sep 11, 2024 at 9:00 PM Yonghong Song <yonghong.song@linux.dev> wrote: >> Zac Ecob reported a problem where a bpf program may cause kernel crash due >> to the following error: >> Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI >> >> The failure is due to the below signed divide: >> LLONG_MIN/-1 where LLONG_MIN equals to -9,223,372,036,854,775,808. >> LLONG_MIN/-1 is supposed to give a positive number 9,223,372,036,854,775,808, >> but it is impossible since for 64-bit system, the maximum positive >> number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will >> cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is >> LLONG_MIN. >> >> Further investigation found all the following sdiv/smod cases may trigger >> an exception when bpf program is running on x86_64 platform: >> - LLONG_MIN/-1 for 64bit operation >> - INT_MIN/-1 for 32bit operation >> - LLONG_MIN%-1 for 64bit operation >> - INT_MIN%-1 for 32bit operation >> where -1 can be an immediate or in a register. >> >> On arm64, there are no exceptions: >> - LLONG_MIN/-1 = LLONG_MIN >> - INT_MIN/-1 = INT_MIN >> - LLONG_MIN%-1 = 0 >> - INT_MIN%-1 = 0 >> where -1 can be an immediate or in a register. >> >> Insn patching is needed to handle the above cases and the patched codes >> produced results aligned with above arm64 result. 
>> >> [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/ >> >> Reported-by: Zac Ecob <zacecob@protonmail.com> >> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> >> --- >> kernel/bpf/verifier.c | 84 ++++++++++++++++++++++++++++++++++++++++--- >> 1 file changed, 80 insertions(+), 4 deletions(-) >> >> Changelogs: >> v1 -> v2: >> - Handle more crash cases like 32bit operation and modules. >> - Add more tests to test new cases. >> >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> index f35b80c16cda..ad7f51302c70 100644 >> --- a/kernel/bpf/verifier.c >> +++ b/kernel/bpf/verifier.c >> @@ -20499,13 +20499,46 @@ static int do_misc_fixups(struct bpf_verifier_env *env) >> /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */ >> insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code); >> >> - /* Make divide-by-zero exceptions impossible. */ >> + /* Make sdiv/smod divide-by-minus-one exceptions impossible. */ >> + if ((insn->code == (BPF_ALU64 | BPF_MOD | BPF_K) || >> + insn->code == (BPF_ALU64 | BPF_DIV | BPF_K) || >> + insn->code == (BPF_ALU | BPF_MOD | BPF_K) || >> + insn->code == (BPF_ALU | BPF_DIV | BPF_K)) && >> + insn->off == 1 && insn->imm == -1) { >> + bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; >> + bool isdiv = BPF_OP(insn->code) == BPF_DIV; >> + struct bpf_insn *patchlet; >> + struct bpf_insn chk_and_div[] = { >> + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | >> + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, >> + 0, 0, 0), >> + }; >> + struct bpf_insn chk_and_mod[] = { >> + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), >> + }; >> + >> + patchlet = isdiv ? chk_and_div : chk_and_mod; > nit: "chk_and_" part in the name is misleading, it's more like > "safe_div" and "safe_mod". Oh, and it's "sdiv" and "smod" specific, so > probably not a bad idea to have that in the name as well. good idea. 
Will use chk_and_sdiv and chk_and_smod. > >> + cnt = isdiv ? ARRAY_SIZE(chk_and_div) : ARRAY_SIZE(chk_and_mod); >> + >> + new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt); >> + if (!new_prog) >> + return -ENOMEM; >> + >> + delta += cnt - 1; >> + env->prog = prog = new_prog; >> + insn = new_prog->insnsi + i + delta; >> + goto next_insn; >> + } >> + >> + /* Make divide-by-zero and divide-by-minus-one exceptions impossible. */ >> if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) || >> insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) || >> insn->code == (BPF_ALU | BPF_MOD | BPF_X) || >> insn->code == (BPF_ALU | BPF_DIV | BPF_X)) { >> bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; >> bool isdiv = BPF_OP(insn->code) == BPF_DIV; >> + bool is_sdiv = isdiv && insn->off == 1; >> + bool is_smod = !isdiv && insn->off == 1; >> struct bpf_insn *patchlet; >> struct bpf_insn chk_and_div[] = { >> /* [R,W]x div 0 -> 0 */ >> @@ -20525,10 +20558,53 @@ static int do_misc_fixups(struct bpf_verifier_env *env) >> BPF_JMP_IMM(BPF_JA, 0, 0, 1), >> BPF_MOV32_REG(insn->dst_reg, insn->dst_reg), >> }; >> + struct bpf_insn chk_and_sdiv[] = { >> + /* [R,W]x sdiv 0 -> 0 */ >> + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | >> + BPF_JNE | BPF_K, insn->src_reg, >> + 0, 2, 0), >> + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), >> + BPF_JMP_IMM(BPF_JA, 0, 0, 4), >> + /* LLONG_MIN sdiv -1 -> LLONG_MIN >> + * INT_MIN sdiv -1 -> INT_MIN >> + */ >> + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | >> + BPF_JNE | BPF_K, insn->src_reg, >> + 0, 2, -1), >> + /* BPF_NEG(LLONG_MIN) == -LLONG_MIN == LLONG_MIN */ >> + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | >> + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, >> + 0, 0, 0), >> + BPF_JMP_IMM(BPF_JA, 0, 0, 1), > I don't know how much it actually matters, but it feels like common > safe case should be as straight-line-executed as possible, no? 
>
> So maybe it's better to rearrange to roughly this (where rX is the
> divisor register):
>
> if rX == 0 goto L1
> if rX == -1 goto L2
> rY /= rX
> goto L3
> L1: /* zero case */
> rY = 0 /* fallthrough, negation doesn't hurt, but less jumping */
> L2: /* negative one case (or zero) */
> rY = -rY
> L3:
> ... the rest of the program code ...

My previous patched insns tried to clearly separate the rX == 0 and
rX == -1 cases. It has 7 insns: 2 cond jmps, 2 uncond jmps and
3 alu operations. The above one removed one uncond jmp, which is
indeed better.

>
> Those two branches for common case are still annoyingly inefficient, I
> wonder if we should do
>
> rX += 1 /* [-1, 0] -> [0, 1] */
> if rX <=(unsigned) 1 goto L1
> rX -= 1 /* restore original divisor */
> rY /= rX /* common case */
> goto L3
> L1:
> if rX == 0 goto L2 /* jump if originally -1 */
> rY = 0 /* division by zero case */
> L2: /* fallthrough */
> rY = -rY
> rX -= 1 /* restore original divisor */
> L3:
> ... continue with the rest ...
>
> It's a bit trickier to follow, but should be faster in a common case.
>
> WDYT? Too much too far?

This is even better. The above rX -= 1 can be removed if we use
BPF_REG_AX as the temporary register. For example,

tmp = rX
tmp += 1 /* [-1, 0] -> [0, 1] */
if tmp <=(unsigned) 1 goto L1
rY /= rX /* common case */
goto L3
L1:
if tmp == 0 goto L2 /* jump if originally -1 */
rY = 0 /* division by zero case */
L2: /* fallthrough */
rY = -rY
L3:
... continue with the rest ...

Maybe we can do even better:

tmp = rX
tmp += 1 /* [-1, 0] -> [0, 1] */
if tmp >(unsigned) 1 goto L2
if tmp == 0 goto L1
rY = 0
L1:
rY = -rY;
goto L3
L2:
rY /= rX
L3:

Could this be even better by reducing one uncond jmp in the fast path?

>
>
>> + *insn,
>> + };
>> + struct bpf_insn chk_and_smod[] = {
>> + /* [R,W]x mod 0 -> [R,W]x */
>> + BPF_RAW_INSN((is64 ?
>> BPF_JMP : BPF_JMP32) |
>> + BPF_JNE | BPF_K, insn->src_reg,
>> + 0, 2, 0),
>> + BPF_MOV32_REG(insn->dst_reg, insn->dst_reg),
>> + BPF_JMP_IMM(BPF_JA, 0, 0, 4),
>> + /* [R,W]x mod -1 -> 0 */
>> + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
>> + BPF_JNE | BPF_K, insn->src_reg,
>> + 0, 2, -1),
>> + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
>> + BPF_JMP_IMM(BPF_JA, 0, 0, 1),
>> + *insn,
>> + };
>>
> Same idea here, keep the common case as straight as possible.

Sure. Will do.

>
>> - patchlet = isdiv ? chk_and_div : chk_and_mod;
>> - cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
>> - ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
>> + if (is_sdiv) {
>> + patchlet = chk_and_sdiv;
>> + cnt = ARRAY_SIZE(chk_and_sdiv);
>> + } else if (is_smod) {
>> + patchlet = chk_and_smod;
>> + cnt = ARRAY_SIZE(chk_and_smod);
>> + } else {
>> + patchlet = isdiv ? chk_and_div : chk_and_mod;
>> + cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
>> + ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
>> + }
>>
>> new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt);
>> if (!new_prog)
>> --
>> 2.43.5
* Re: [PATCH bpf-next v2 1/2] bpf: Fix a sdiv overflow issue 2024-09-12 22:53 ` Yonghong Song @ 2024-09-13 2:00 ` Andrii Nakryiko 0 siblings, 0 replies; 6+ messages in thread From: Andrii Nakryiko @ 2024-09-13 2:00 UTC (permalink / raw) To: Yonghong Song Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team, Martin KaFai Lau, Zac Ecob On Thu, Sep 12, 2024 at 3:53 PM Yonghong Song <yonghong.song@linux.dev> wrote: > > > On 9/12/24 11:17 AM, Andrii Nakryiko wrote: > > On Wed, Sep 11, 2024 at 9:00 PM Yonghong Song <yonghong.song@linux.dev> wrote: > >> Zac Ecob reported a problem where a bpf program may cause kernel crash due > >> to the following error: > >> Oops: divide error: 0000 [#1] PREEMPT SMP KASAN PTI > >> > >> The failure is due to the below signed divide: > >> LLONG_MIN/-1 where LLONG_MIN equals to -9,223,372,036,854,775,808. > >> LLONG_MIN/-1 is supposed to give a positive number 9,223,372,036,854,775,808, > >> but it is impossible since for 64-bit system, the maximum positive > >> number is 9,223,372,036,854,775,807. On x86_64, LLONG_MIN/-1 will > >> cause a kernel exception. On arm64, the result for LLONG_MIN/-1 is > >> LLONG_MIN. > >> > >> Further investigation found all the following sdiv/smod cases may trigger > >> an exception when bpf program is running on x86_64 platform: > >> - LLONG_MIN/-1 for 64bit operation > >> - INT_MIN/-1 for 32bit operation > >> - LLONG_MIN%-1 for 64bit operation > >> - INT_MIN%-1 for 32bit operation > >> where -1 can be an immediate or in a register. > >> > >> On arm64, there are no exceptions: > >> - LLONG_MIN/-1 = LLONG_MIN > >> - INT_MIN/-1 = INT_MIN > >> - LLONG_MIN%-1 = 0 > >> - INT_MIN%-1 = 0 > >> where -1 can be an immediate or in a register. > >> > >> Insn patching is needed to handle the above cases and the patched codes > >> produced results aligned with above arm64 result. 
> >> > >> [1] https://lore.kernel.org/bpf/tPJLTEh7S_DxFEqAI2Ji5MBSoZVg7_G-Py2iaZpAaWtM961fFTWtsnlzwvTbzBzaUzwQAoNATXKUlt0LZOFgnDcIyKCswAnAGdUF3LBrhGQ=@protonmail.com/ > >> > >> Reported-by: Zac Ecob <zacecob@protonmail.com> > >> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> > >> --- > >> kernel/bpf/verifier.c | 84 ++++++++++++++++++++++++++++++++++++++++--- > >> 1 file changed, 80 insertions(+), 4 deletions(-) > >> > >> Changelogs: > >> v1 -> v2: > >> - Handle more crash cases like 32bit operation and modules. > >> - Add more tests to test new cases. > >> > >> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > >> index f35b80c16cda..ad7f51302c70 100644 > >> --- a/kernel/bpf/verifier.c > >> +++ b/kernel/bpf/verifier.c > >> @@ -20499,13 +20499,46 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > >> /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */ > >> insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code); > >> > >> - /* Make divide-by-zero exceptions impossible. */ > >> + /* Make sdiv/smod divide-by-minus-one exceptions impossible. */ > >> + if ((insn->code == (BPF_ALU64 | BPF_MOD | BPF_K) || > >> + insn->code == (BPF_ALU64 | BPF_DIV | BPF_K) || > >> + insn->code == (BPF_ALU | BPF_MOD | BPF_K) || > >> + insn->code == (BPF_ALU | BPF_DIV | BPF_K)) && > >> + insn->off == 1 && insn->imm == -1) { > >> + bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; > >> + bool isdiv = BPF_OP(insn->code) == BPF_DIV; > >> + struct bpf_insn *patchlet; > >> + struct bpf_insn chk_and_div[] = { > >> + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | > >> + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, > >> + 0, 0, 0), > >> + }; > >> + struct bpf_insn chk_and_mod[] = { > >> + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), > >> + }; > >> + > >> + patchlet = isdiv ? chk_and_div : chk_and_mod; > > nit: "chk_and_" part in the name is misleading, it's more like > > "safe_div" and "safe_mod". 
Oh, and it's "sdiv" and "smod" specific, so > > probably not a bad idea to have that in the name as well. > > good idea. Will use chk_and_sdiv and chk_and_smod. > > > > >> + cnt = isdiv ? ARRAY_SIZE(chk_and_div) : ARRAY_SIZE(chk_and_mod); > >> + > >> + new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt); > >> + if (!new_prog) > >> + return -ENOMEM; > >> + > >> + delta += cnt - 1; > >> + env->prog = prog = new_prog; > >> + insn = new_prog->insnsi + i + delta; > >> + goto next_insn; > >> + } > >> + > >> + /* Make divide-by-zero and divide-by-minus-one exceptions impossible. */ > >> if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) || > >> insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) || > >> insn->code == (BPF_ALU | BPF_MOD | BPF_X) || > >> insn->code == (BPF_ALU | BPF_DIV | BPF_X)) { > >> bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; > >> bool isdiv = BPF_OP(insn->code) == BPF_DIV; > >> + bool is_sdiv = isdiv && insn->off == 1; > >> + bool is_smod = !isdiv && insn->off == 1; > >> struct bpf_insn *patchlet; > >> struct bpf_insn chk_and_div[] = { > >> /* [R,W]x div 0 -> 0 */ > >> @@ -20525,10 +20558,53 @@ static int do_misc_fixups(struct bpf_verifier_env *env) > >> BPF_JMP_IMM(BPF_JA, 0, 0, 1), > >> BPF_MOV32_REG(insn->dst_reg, insn->dst_reg), > >> }; > >> + struct bpf_insn chk_and_sdiv[] = { > >> + /* [R,W]x sdiv 0 -> 0 */ > >> + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | > >> + BPF_JNE | BPF_K, insn->src_reg, > >> + 0, 2, 0), > >> + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg), > >> + BPF_JMP_IMM(BPF_JA, 0, 0, 4), > >> + /* LLONG_MIN sdiv -1 -> LLONG_MIN > >> + * INT_MIN sdiv -1 -> INT_MIN > >> + */ > >> + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | > >> + BPF_JNE | BPF_K, insn->src_reg, > >> + 0, 2, -1), > >> + /* BPF_NEG(LLONG_MIN) == -LLONG_MIN == LLONG_MIN */ > >> + BPF_RAW_INSN((is64 ? 
BPF_ALU64 : BPF_ALU) | > >> + BPF_OP(BPF_NEG) | BPF_K, insn->dst_reg, > >> + 0, 0, 0), > >> + BPF_JMP_IMM(BPF_JA, 0, 0, 1), > > I don't know how much it actually matters, but it feels like common > > safe case should be as straight-line-executed as possible, no? > > > > So maybe it's better to rearrange to roughly this (where rX is the > > divisor register): > > > > if rX == 0 goto L1 > > if rX == -1 goto L2 > > rY /= rX > > goto L3 > > L1: /* zero case */ > > rY = 0 /* fallthrough, negation doesn't hurt, but less jumping */ > > L2: /* negative one case (or zero) */ > > rY = -rY > > L3: > > ... the rest of the program code ... > > My previous patched insn try to clearly separate rX == 0 and > rX == -1 case. It has 2 insns including 2 cond jmps, 2 uncond jmps and > one 3 alu operations. The above one removed one uncond jmp, which > is indeed better. > > > > > > > Those two branches for common case are still annoyingly inefficient, I > > wonder if we should do > > > > rX += 1 /* [-1, 0] -> [0, 1] > > if rX <=(unsigned) 1 goto L1 > > rX -= 1 /* restore original divisor */ > > rY /= rX /* common case */ > > goto L3 > > L1: > > if rX == 0 goto L2 /* jump if originally -1 */ > > rY = 0 /* division by zero case */ > > L2: /* fallthrough */ > > rY = -rY > > rX -= 1 /* restore original divisor */ > > L3: > > ... continue with the rest ... > > > > > > It's a bit trickier to follow, but should be faster in a common case. > > > > WDYT? Too much too far? > > This is even better. The above rX -= 1 can be removed if we use > BPF_REG_AX as the temporary register. For example, > > tmp = rX > tmp += 1 /* [-1, 0] -> [0, 1] > if tmp <=(unsigned) 1 goto L1 > rY /= rX /* common case */ > goto L3 > L1: > if tmp == 0 goto L2 /* jump if originally -1 */ > rY = 0 /* division by zero case */ > L2: /* fallthrough */ > rY = -rY > L3: > ... continue with the rest ... 
>
> Maybe we can do even better
>
> tmp = rX
> tmp += 1 /* [-1, 0] -> [0, 1] */
> if tmp >(unsigned) 1 goto L2
> if tmp == 0 goto L1
> rY = 0
> L1:
> rY = -rY;
> goto L3
> L2:
> rY /= rX
> L3:
>
> Could this be even better by reducing one uncond jmp in the fast path?

Yep, makes sense to me. Go for it (as far as I'm concerned).

>
>
> >> + *insn,
> >> + };
> >> + struct bpf_insn chk_and_smod[] = {
> >> + /* [R,W]x mod 0 -> [R,W]x */
> >> + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
> >> + BPF_JNE | BPF_K, insn->src_reg,
> >> + 0, 2, 0),
> >> + BPF_MOV32_REG(insn->dst_reg, insn->dst_reg),
> >> + BPF_JMP_IMM(BPF_JA, 0, 0, 4),
> >> + /* [R,W]x mod -1 -> 0 */
> >> + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) |
> >> + BPF_JNE | BPF_K, insn->src_reg,
> >> + 0, 2, -1),
> >> + BPF_ALU32_REG(BPF_XOR, insn->dst_reg, insn->dst_reg),
> >> + BPF_JMP_IMM(BPF_JA, 0, 0, 1),
> >> + *insn,
> >> + };
> >>
> > Same idea here, keep the common case as straight as possible.
>
> Sure. Will do.
>
> >
> >> - patchlet = isdiv ? chk_and_div : chk_and_mod;
> >> - cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
> >> - ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
> >> + if (is_sdiv) {
> >> + patchlet = chk_and_sdiv;
> >> + cnt = ARRAY_SIZE(chk_and_sdiv);
> >> + } else if (is_smod) {
> >> + patchlet = chk_and_smod;
> >> + cnt = ARRAY_SIZE(chk_and_smod);
> >> + } else {
> >> + patchlet = isdiv ? chk_and_div : chk_and_mod;
> >> + cnt = isdiv ? ARRAY_SIZE(chk_and_div) :
> >> + ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0);
> >> + }
> >>
> >> new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt);
> >> if (!new_prog)
> >> --
> >> 2.43.5