BPF List
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue
@ 2024-09-03 22:59 Yonghong Song
  2024-09-03 22:59 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest for x86 jit convergence issues Yonghong Song
  2024-09-04 16:56 ` [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue Alexei Starovoitov
  0 siblings, 2 replies; 6+ messages in thread
From: Yonghong Song @ 2024-09-03 22:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau, Daniel Hodges

Daniel Hodges reported a jit error when playing with a sched-ext program.
The error message is:
  unexpected jmp_cond padding: -4 bytes

But further investigation shows the error is actual due to failed
convergence. The following are some analysis:

  ...
  pass4, final_proglen=4391:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    289:    48 85 ff                test   rdi,rdi
    28c:    74 17                   je     0x2a5
    28e:    e9 7f ff ff ff          jmp    0x212
    293:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
and insn at 0x28e is 5-byte jmp insn with offset -129.

  pass5, final_proglen=4392:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    28d:    48 85 ff                test   rdi,rdi
    290:    74 1a                   je     0x2ac
    292:    eb 84                   jmp    0x218
    294:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 6-byte cond jump insn now since its offset
becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same
time, insn at 0x292 is a 2-byte insn since its offset is -124.

pass6 will repeat the same code as in pass4. pass7 will repeat the same
code as in pass5, and so on. This will prevent eventual convergence.

Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
insn looks like:

    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    24d:    48 85 d2                test   rdx,rdx

The similar code in pass14:
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    249:    48 85 d2                test   rdx,rdx
    24c:    74 21                   je     0x26f
    24e:    48 01 f7                add    rdi,rsi
    ...

Before generating the following insn,
  250:    74 21                   je     0x273
"padding = 1" enables some checking to ensure nops is either 0 or 4
where
  #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
  nops = INSN_SZ_DIFF - 2

In this specific case,
  addrs[i] = 0x24e // from pass14
  addrs[i-1] = 0x24d // from pass15
  prog - temp = 3 // from 'test rdx,rdx' in pass15
so
  nops = -4
and this triggers the failure.

To fix the issue, we need to break cycles of je <-> jmp. For example,
in the above case, we have
  211:    74 7d                   je     0x290
the offset is 0x7d. If 2-byte je insn is generated only if
the offset is less than 0x7d (<= 0x7c), the cycle can be
break and we can achieve the convergence.

I did some study on other cases like je <-> je, jmp <-> je and
jmp <-> jmp which may cause cycles. Those cases are not from actual
reproducible cases since it is pretty hard to construct a test case
for them. the results show that the offset <= 0x7b (0x7b = 123) should
be enough to cover all cases. This patch added a new helper to generate 8-bit
cond/uncond jmp insns only if the offset range is [-128, 123].

Reported-by: Daniel Hodges <hodgesd@meta.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 arch/x86/net/bpf_jit_comp.c | 55 +++++++++++++++++++++++++++++++++++--
 1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 074b41fafbe3..63c4816ed4e7 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -64,6 +64,57 @@ static bool is_imm8(int value)
 	return value <= 127 && value >= -128;
 }
 
+/*
+ * Let us limit the positive offset to be <= 124.
+ * This is to ensure eventual jit convergence For the following patterns:
+ * ...
+ * pass4, final_proglen=4391:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    74 7d                   je     0x290
+ *   213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   289:    48 85 ff                test   rdi,rdi
+ *   28c:    74 17                   je     0x2a5
+ *   28e:    e9 7f ff ff ff          jmp    0x212
+ *   293:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
+ * and insn at 0x28e is 5-byte jmp insn with offset -129.
+ *
+ * pass5, final_proglen=4392:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    0f 84 80 00 00 00       je     0x297
+ *   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   28d:    48 85 ff                test   rdi,rdi
+ *   290:    74 1a                   je     0x2ac
+ *   292:    eb 84                   jmp    0x218
+ *   294:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 6-byte cond jump insn now since its offset
+ * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
+ * At the same time, insn at 0x292 is a 2-byte insn since its offset is
+ * -124.
+ *
+ * pass6 will repeat the same code as in pass4 and this will prevent
+ * eventual convergence.
+ *
+ * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
+ * cycle in the above. In the above example that we found je offset <= 0x7c
+ * should work.
+ *
+ * For other cases, je <-> je needs offset <= 0x7b to avoid no convergence
+ * issue. For jmp <-> je and jmp <-> jmp cases, jmp offset <= 0x7c should
+ * avoid no convergence issue.
+ *
+ * Overall, let us limit the positive offset for 8bit cond/uncond jmp insn
+ * to mamximum 123 (0x7b). This way, the jit pass can eventually converge.
+ */
+static bool is_imm8_jmp_offset(int value)
+{
+	return value <= 123 && value >= -128;
+}
+
 static bool is_simm32(s64 value)
 {
 	return value == (s64)(s32)value;
@@ -2231,7 +2282,7 @@ st:			if (is_imm8(insn->off))
 				return -EFAULT;
 			}
 			jmp_offset = addrs[i + insn->off] - addrs[i];
-			if (is_imm8(jmp_offset)) {
+			if (is_imm8_jmp_offset(jmp_offset)) {
 				if (jmp_padding) {
 					/* To keep the jmp_offset valid, the extra bytes are
 					 * padded before the jump insn, so we subtract the
@@ -2313,7 +2364,7 @@ st:			if (is_imm8(insn->off))
 				break;
 			}
 emit_jmp:
-			if (is_imm8(jmp_offset)) {
+			if (is_imm8_jmp_offset(jmp_offset)) {
 				if (jmp_padding) {
 					/* To avoid breaking jmp_offset, the extra bytes
 					 * are padded before the actual jmp insn, so
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest for x86 jit convergence issues
  2024-09-03 22:59 [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue Yonghong Song
@ 2024-09-03 22:59 ` Yonghong Song
  2024-09-04 16:58   ` Alexei Starovoitov
  2024-09-04 16:56 ` [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue Alexei Starovoitov
  1 sibling, 1 reply; 6+ messages in thread
From: Yonghong Song @ 2024-09-03 22:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team,
	Martin KaFai Lau

The core part of the selftest, i.e., the je <-> jmp cycle, mimics the
original sched-ext bpf program. The test will fail without the
previous patch.

I tried to create some cases for other potential cycles
(je <-> je, jmp <-> je and jmp <-> jmp) with similar pattern
to the test in this patch, but failed. So this patch
only contains one test for je <-> jmp cycle.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
---
 .../selftests/bpf/prog_tests/verifier.c       |   2 +
 .../bpf/progs/verifier_jit_convergence.c      | 128 ++++++++++++++++++
 2 files changed, 130 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/verifier_jit_convergence.c

diff --git a/tools/testing/selftests/bpf/prog_tests/verifier.c b/tools/testing/selftests/bpf/prog_tests/verifier.c
index 80a90c627182..df398e714dff 100644
--- a/tools/testing/selftests/bpf/prog_tests/verifier.c
+++ b/tools/testing/selftests/bpf/prog_tests/verifier.c
@@ -39,6 +39,7 @@
 #include "verifier_int_ptr.skel.h"
 #include "verifier_iterating_callbacks.skel.h"
 #include "verifier_jeq_infer_not_null.skel.h"
+#include "verifier_jit_convergence.skel.h"
 #include "verifier_ld_ind.skel.h"
 #include "verifier_ldsx.skel.h"
 #include "verifier_leak_ptr.skel.h"
@@ -163,6 +164,7 @@ void test_verifier_helper_value_access(void)  { RUN(verifier_helper_value_access
 void test_verifier_int_ptr(void)              { RUN(verifier_int_ptr); }
 void test_verifier_iterating_callbacks(void)  { RUN(verifier_iterating_callbacks); }
 void test_verifier_jeq_infer_not_null(void)   { RUN(verifier_jeq_infer_not_null); }
+void test_verifier_jit_convergence(void)      { RUN(verifier_jit_convergence); }
 void test_verifier_ld_ind(void)               { RUN(verifier_ld_ind); }
 void test_verifier_ldsx(void)                  { RUN(verifier_ldsx); }
 void test_verifier_leak_ptr(void)             { RUN(verifier_leak_ptr); }
diff --git a/tools/testing/selftests/bpf/progs/verifier_jit_convergence.c b/tools/testing/selftests/bpf/progs/verifier_jit_convergence.c
new file mode 100644
index 000000000000..fdc4ccc83c7d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/verifier_jit_convergence.c
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+#if defined(__TARGET_ARCH_x86)
+
+struct value_t {
+	long long a[32];
+};
+
+struct {
+        __uint(type, BPF_MAP_TYPE_HASH);
+        __uint(max_entries, 1);
+        __type(key, long long);
+        __type(value, struct value_t);
+} map_hash SEC(".maps");
+
+SEC("socket")
+__description("bpf_jit_convergence je <-> jmp")
+__success __retval(0)
+__arch_x86_64
+__jited("	pushq	%rbp")
+__naked void btf_jit_convergence_je_jmp(void)
+{
+	asm volatile (
+	"call %[bpf_get_prandom_u32];"
+	"if r0 == 0 goto l20_%=;"
+	"if r0 == 1 goto l21_%=;"
+	"if r0 == 2 goto l22_%=;"
+	"if r0 == 3 goto l23_%=;"
+	"if r0 == 4 goto l24_%=;"
+	"call %[bpf_get_prandom_u32];"
+	"call %[bpf_get_prandom_u32];"
+"l20_%=:"
+"l21_%=:"
+"l22_%=:"
+"l23_%=:"
+"l24_%=:"
+	"r1 = 0;"
+	"*(u64 *)(r10 - 8) = r1;"
+	"r2 = r10;"
+	"r2 += -8;"
+	"r1 = %[map_hash] ll;"
+	"call %[bpf_map_lookup_elem];"
+	"if r0 == 0 goto l1_%=;"
+	"r6 = r0;"
+	"call %[bpf_get_prandom_u32];"
+	"r7 = r0;"
+	"r5 = r6;"
+	"if r0 != 0x0 goto l12_%=;"
+	"call %[bpf_get_prandom_u32];"
+	"r1 = r0;"
+	"r2 = r6;"
+	"if r1 == 0x0 goto l0_%=;"
+"l9_%=:"
+	"r2 = *(u64 *)(r6 + 0x0);"
+	"r2 += 0x1;"
+	"*(u64 *)(r6 + 0x0) = r2;"
+	"goto l1_%=;"
+"l12_%=:"
+	"r1 = r7;"
+	"r1 += 0x98;"
+	"r2 = r5;"
+	"r2 += 0x90;"
+	"r2 = *(u32 *)(r2 + 0x0);"
+	"r3 = r7;"
+	"r3 &= 0x1;"
+	"r2 *= 0xa8;"
+	"if r3 == 0x0 goto l2_%=;"
+	"r1 += r2;"
+	"r1 -= r7;"
+	"r1 += 0x8;"
+	"if r1 <= 0xb20 goto l3_%=;"
+	"r1 = 0x0;"
+	"goto l4_%=;"
+"l3_%=:"
+	"r1 += r7;"
+"l4_%=:"
+	"if r1 == 0x0 goto l8_%=;"
+	"goto l9_%=;"
+"l2_%=:"
+	"r1 += r2;"
+	"r1 -= r7;"
+	"r1 += 0x10;"
+	"if r1 <= 0xb20 goto l6_%=;"
+	"r1 = 0x0;"
+	"goto l7_%=;"
+"l6_%=:"
+	"r1 += r7;"
+"l7_%=:"
+	"if r1 == 0x0 goto l8_%=;"
+	"goto l9_%=;"
+"l0_%=:"
+	"r1 = 0x3;"
+	"*(u64 *)(r10 - 0x10) = r1;"
+	"r2 = r1;"
+	"goto l1_%=;"
+"l8_%=:"
+	"r1 = r5;"
+	"r1 += 0x4;"
+	"r1 = *(u32 *)(r1 + 0x0);"
+	"*(u64 *)(r10 - 0x8) = r1;"
+"l1_%=:"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm(bpf_get_prandom_u32),
+	  __imm(bpf_map_lookup_elem),
+	  __imm_addr(map_hash)
+	: __clobber_all);
+}
+
+#else
+
+SEC("socket")
+__description("non-x86 has no jit convergence issue, use a dummy test")
+__success
+int dummy_test(void)
+{
+	return 0;
+}
+
+#endif
+
+char _license[] SEC("license") = "GPL";
-- 
2.43.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue
  2024-09-03 22:59 [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue Yonghong Song
  2024-09-03 22:59 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest for x86 jit convergence issues Yonghong Song
@ 2024-09-04 16:56 ` Alexei Starovoitov
  2024-09-04 18:32   ` Yonghong Song
  1 sibling, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2024-09-04 16:56 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Kernel Team, Martin KaFai Lau, Daniel Hodges

On Tue, Sep 3, 2024 at 4:00 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 074b41fafbe3..63c4816ed4e7 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -64,6 +64,57 @@ static bool is_imm8(int value)
>         return value <= 127 && value >= -128;
>  }
>
> +/*
> + * Let us limit the positive offset to be <= 124.

I think the comment will read better if the above says "<= 123",
since that's the final outcome.
124 above is a bit confusing.
I can tweak while applying if you agree.

> + * to mamximum 123 (0x7b). This way, the jit pass can eventually converge.

Can fix this typo too.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest for x86 jit convergence issues
  2024-09-03 22:59 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest for x86 jit convergence issues Yonghong Song
@ 2024-09-04 16:58   ` Alexei Starovoitov
  2024-09-04 18:28     ` Yonghong Song
  0 siblings, 1 reply; 6+ messages in thread
From: Alexei Starovoitov @ 2024-09-04 16:58 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Kernel Team, Martin KaFai Lau

On Tue, Sep 3, 2024 at 4:00 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include "bpf_misc.h"
> +
> +#if defined(__TARGET_ARCH_x86)

Does our test framework need this ifdef and dummy_test below?
There is __arch_x86_64 below.
Isn't it enough?

> +SEC("socket")
> +__description("bpf_jit_convergence je <-> jmp")
> +__success __retval(0)
> +__arch_x86_64

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest for x86 jit convergence issues
  2024-09-04 16:58   ` Alexei Starovoitov
@ 2024-09-04 18:28     ` Yonghong Song
  0 siblings, 0 replies; 6+ messages in thread
From: Yonghong Song @ 2024-09-04 18:28 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Kernel Team, Martin KaFai Lau


On 9/4/24 9:58 AM, Alexei Starovoitov wrote:
> On Tue, Sep 3, 2024 at 4:00 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> +#include <linux/bpf.h>
>> +#include <bpf/bpf_helpers.h>
>> +#include "bpf_misc.h"
>> +
>> +#if defined(__TARGET_ARCH_x86)
> Does our test framework need this ifdef and dummy_test below?
> There is __arch_x86_64 below.
> Isn't it enough?

It should be enough. Other arch should be correct although
they do not have convergence issue.

Let me remove above "#if defined(__TARGET_ARCH_x86)" and
do a test again to ensure everything is okay.

>
>> +SEC("socket")
>> +__description("bpf_jit_convergence je <-> jmp")
>> +__success __retval(0)
>> +__arch_x86_64

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue
  2024-09-04 16:56 ` [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue Alexei Starovoitov
@ 2024-09-04 18:32   ` Yonghong Song
  0 siblings, 0 replies; 6+ messages in thread
From: Yonghong Song @ 2024-09-04 18:32 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Kernel Team, Martin KaFai Lau, Daniel Hodges


On 9/4/24 9:56 AM, Alexei Starovoitov wrote:
> On Tue, Sep 3, 2024 at 4:00 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index 074b41fafbe3..63c4816ed4e7 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -64,6 +64,57 @@ static bool is_imm8(int value)
>>          return value <= 127 && value >= -128;
>>   }
>>
>> +/*
>> + * Let us limit the positive offset to be <= 124.
> I think the comment will read better if the above says "<= 123",
> since that's the final outcome.
> 124 above is a bit confusing.
> I can tweak while applying if you agree.

Correct. Let us do "<= 123".

>
>> + * to mamximum 123 (0x7b). This way, the jit pass can eventually converge.
> Can fix this typo too.

Sounds good.

Since I need to change test in the second patch (enabling tests for arm64/s390),
let me send v3 to address all comments.


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-09-04 18:32 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-03 22:59 [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue Yonghong Song
2024-09-03 22:59 ` [PATCH bpf-next v2 2/2] selftests/bpf: Add a selftest for x86 jit convergence issues Yonghong Song
2024-09-04 16:58   ` Alexei Starovoitov
2024-09-04 18:28     ` Yonghong Song
2024-09-04 16:56 ` [PATCH bpf-next v2 1/2] bpf, x64: Fix a jit convergence issue Alexei Starovoitov
2024-09-04 18:32   ` Yonghong Song

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox