From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>,
Daniel Hodges <hodgesd@meta.com>
Subject: Re: [PATCH bpf-next] bpf, x64: Fix a jit convergence issue
Date: Wed, 28 Aug 2024 15:47:21 -0700 [thread overview]
Message-ID: <7e2ad37e-e750-4cbd-8305-bf16bbebcc53@linux.dev> (raw)
In-Reply-To: <CAADnVQ+5HD1ZxBqpDgNuwPkO1+VGzm1yqhxuDD4HYtkRYHwXiA@mail.gmail.com>
On 8/27/24 7:24 PM, Alexei Starovoitov wrote:
> On Sun, Aug 25, 2024 at 1:04 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> Daniel Hodges reported a jit error when playing with a sched-ext
>> program. The error message is:
>> unexpected jmp_cond padding: -4 bytes
>>
>> But further investigation shows the error is actual due to failed
>> convergence. The following are some analysis:
>>
>> ...
>> pass4, final_proglen=4391:
>> ...
>> 20e: 48 85 ff test rdi,rdi
>> 211: 74 7d je 0x290
>> 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 289: 48 85 ff test rdi,rdi
>> 28c: 74 17 je 0x2a5
>> 28e: e9 7f ff ff ff jmp 0x212
>> 293: bf 03 00 00 00 mov edi,0x3
>>
>> Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
>> and insn at 0x28e is 5-byte jmp insn with offset -129.
>>
>> pass5, final_proglen=4392:
>> ...
>> 20e: 48 85 ff test rdi,rdi
>> 211: 0f 84 80 00 00 00 je 0x297
>> 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 28d: 48 85 ff test rdi,rdi
>> 290: 74 1a je 0x2ac
>> 292: eb 84 jmp 0x218
>> 294: bf 03 00 00 00 mov edi,0x3
>>
>> Note that insn at 0x211 is 5-byte cond jump insn now since its offset
>> becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
>> At the same time, insn at 0x292 is a 2-byte insn since its offset is
>> -124.
>>
>> pass6 will repeat the same code as in pass4. pass7 will repeat the same
>> code as in pass5, and so on. This will prevent eventual convergence.
>>
>> Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
>> insn looks like:
>>
>> 211: 0f 84 80 00 00 00 je 0x297
>> 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 24d: 48 85 d2 test rdx,rdx
>>
>> The similar code in pass14:
>> 211: 74 7d je 0x290
>> 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 249: 48 85 d2 test rdx,rdx
>> 24c: 74 21 je 0x26f
>> 24e: 48 01 f7 add rdi,rsi
>> ...
>>
>> Before generating the following insn,
>> 250: 74 21 je 0x273
>> "padding = 1" enables some checking to ensure nops is either 0 or 4
>> where
>> #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
>> nops = INSN_SZ_DIFF - 2
>>
>> In this specific case,
>> addrs[i] = 0x24e // from pass14
>> addrs[i-1] = 0x24d // from pass15
>> prog - temp = 3 // from 'test rdx,rdx' in pass15
>> so
>> nops = -4
>> and this triggers the failure.
>> Making jit prog convergable can fix the above error.
>>
>> Reported-by: Daniel Hodges <hodgesd@meta.com>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>> arch/x86/net/bpf_jit_comp.c | 47 ++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 46 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index 074b41fafbe3..ec541aae5d9b 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -64,6 +64,51 @@ static bool is_imm8(int value)
>> return value <= 127 && value >= -128;
>> }
>>
>> +/*
>> + * Let us limit the positive offset to be <= 124.
>> + * This is to ensure eventual jit convergence For the following patterns:
>> + * ...
>> + * pass4, final_proglen=4391:
>> + * ...
>> + * 20e: 48 85 ff test rdi,rdi
>> + * 211: 74 7d je 0x290
>> + * 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> + * ...
>> + * 289: 48 85 ff test rdi,rdi
>> + * 28c: 74 17 je 0x2a5
>> + * 28e: e9 7f ff ff ff jmp 0x212
>> + * 293: bf 03 00 00 00 mov edi,0x3
>> + * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
>> + * and insn at 0x28e is 5-byte jmp insn with offset -129.
>> + *
>> + * pass5, final_proglen=4392:
>> + * ...
>> + * 20e: 48 85 ff test rdi,rdi
>> + * 211: 0f 84 80 00 00 00 je 0x297
>> + * 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> + * ...
>> + * 28d: 48 85 ff test rdi,rdi
>> + * 290: 74 1a je 0x2ac
>> + * 292: eb 84 jmp 0x218
>> + * 294: bf 03 00 00 00 mov edi,0x3
>> + * Note that insn at 0x211 is 5-byte cond jump insn now since its offset
>> + * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
>> + * At the same time, insn at 0x292 is a 2-byte insn since its offset is
>> + * -124.
>> + *
>> + * pass6 will repeat the same code as in pass4 and this will prevent
>> + * eventual convergence.
>> + *
>> + * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
>> + * cycle in the above. Let us limit the positive offset for 8bit cond jump
>> + * insn to mamximum 124 (0x7c). This way, the jmp insn will be always 2-bytes,
>> + * and the jit pass can eventually converge.
>> + */
> je<->jmp
>
> It can be je/je too, no?
Yes. It is possible.
>
> so 128 - 4 instead of 128 - 3 ?
You probably mean "127 - 4 instead of 127 - 3" since
the maximum value is 127.
I checked 127 - 4 = 0x7c and indeed we should. See below examples:
20e: 48 85 ff test rdi,rdi
211: XX XX je 0x291
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: XX XX XX XX XX XX je 0x212
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x297 (0x293 - 0x213)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX je 0x217 (0x217 - 0x293)
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX je 0x28f (0x293 - 0x217)
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: XX XX je 0x213 (0x213 - 0x293) // -0x80 allowed
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x28f (0x293 - 0x213)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX je 0x217 (0x217 - 0x293)
293: bf 03 00 00 00 mov edi,0x3
=>
...
Here 0x293 - 0x217 = 0x7c
>
>> +static bool is_imm8_cond_offset(int value)
>> +{
>> + return value <= 124 && value >= -128;
> the other side needs the same treatment, no ?
good question. From my understanding, the non-convergence in the
above needs both forward and backport conditions. The solution we
are using is based on putting a limitation on forward conditions
w.r.t. jit code gen.
Another solution is actually to put a limitation on backward
conditions. For example, let us say the above is_imm8_cond_offset()
has
return value <= 127 && value > -124
See below example:
20e: 48 85 ff test rdi,rdi
211: XX XX je 0x291
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: XX XX XX XX XX XX je 0x212
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x297 (0x293 - 0x213)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX XX XX XX XX je 0x21b (0x217 - 0x293)
297: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x297 (0x297 - 0x217)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX XX XX XX XX je 0x217 (0x217 - 0x297)
297: bf 03 00 00 00 mov edi,0x3
converged here.
So I think we do not need to limit both sides. One side should be enough.
>
>> +}
>> +
>> static bool is_simm32(s64 value)
>> {
>> return value == (s64)(s32)value;
>> @@ -2231,7 +2276,7 @@ st: if (is_imm8(insn->off))
>> return -EFAULT;
>> }
>> jmp_offset = addrs[i + insn->off] - addrs[i];
>> - if (is_imm8(jmp_offset)) {
>> + if (is_imm8_cond_offset(jmp_offset)) {
>> if (jmp_padding) {
>> /* To keep the jmp_offset valid, the extra bytes are
>> * padded before the jump insn, so we subtract the
>> --
>> 2.43.5
>>
next prev parent reply other threads:[~2024-08-28 22:47 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-25 20:04 [PATCH bpf-next] bpf, x64: Fix a jit convergence issue Yonghong Song
2024-08-27 23:44 ` Andrii Nakryiko
2024-08-28 22:50 ` Yonghong Song
2024-08-29 17:27 ` Andrii Nakryiko
2024-08-30 20:39 ` Yonghong Song
2024-08-28 2:24 ` Alexei Starovoitov
2024-08-28 22:47 ` Yonghong Song [this message]
2024-08-28 23:44 ` Alexei Starovoitov
2024-08-29 1:54 ` Yonghong Song
2024-08-29 16:37 ` Alexei Starovoitov
2024-08-29 16:50 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7e2ad37e-e750-4cbd-8305-bf16bbebcc53@linux.dev \
--to=yonghong.song@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=hodgesd@meta.com \
--cc=kernel-team@fb.com \
--cc=martin.lau@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.