From: Yonghong Song <yonghong.song@linux.dev>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kernel Team <kernel-team@fb.com>,
Martin KaFai Lau <martin.lau@kernel.org>,
Daniel Hodges <hodgesd@meta.com>
Subject: Re: [PATCH bpf-next] bpf, x64: Fix a jit convergence issue
Date: Wed, 28 Aug 2024 15:47:21 -0700 [thread overview]
Message-ID: <7e2ad37e-e750-4cbd-8305-bf16bbebcc53@linux.dev> (raw)
In-Reply-To: <CAADnVQ+5HD1ZxBqpDgNuwPkO1+VGzm1yqhxuDD4HYtkRYHwXiA@mail.gmail.com>
On 8/27/24 7:24 PM, Alexei Starovoitov wrote:
> On Sun, Aug 25, 2024 at 1:04 PM Yonghong Song <yonghong.song@linux.dev> wrote:
>> Daniel Hodges reported a jit error when playing with a sched-ext
>> program. The error message is:
>> unexpected jmp_cond padding: -4 bytes
>>
>> But further investigation shows the error is actual due to failed
>> convergence. The following are some analysis:
>>
>> ...
>> pass4, final_proglen=4391:
>> ...
>> 20e: 48 85 ff test rdi,rdi
>> 211: 74 7d je 0x290
>> 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 289: 48 85 ff test rdi,rdi
>> 28c: 74 17 je 0x2a5
>> 28e: e9 7f ff ff ff jmp 0x212
>> 293: bf 03 00 00 00 mov edi,0x3
>>
>> Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
>> and insn at 0x28e is 5-byte jmp insn with offset -129.
>>
>> pass5, final_proglen=4392:
>> ...
>> 20e: 48 85 ff test rdi,rdi
>> 211: 0f 84 80 00 00 00 je 0x297
>> 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 28d: 48 85 ff test rdi,rdi
>> 290: 74 1a je 0x2ac
>> 292: eb 84 jmp 0x218
>> 294: bf 03 00 00 00 mov edi,0x3
>>
>> Note that insn at 0x211 is 5-byte cond jump insn now since its offset
>> becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
>> At the same time, insn at 0x292 is a 2-byte insn since its offset is
>> -124.
>>
>> pass6 will repeat the same code as in pass4. pass7 will repeat the same
>> code as in pass5, and so on. This will prevent eventual convergence.
>>
>> Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
>> insn looks like:
>>
>> 211: 0f 84 80 00 00 00 je 0x297
>> 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 24d: 48 85 d2 test rdx,rdx
>>
>> The similar code in pass14:
>> 211: 74 7d je 0x290
>> 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> ...
>> 249: 48 85 d2 test rdx,rdx
>> 24c: 74 21 je 0x26f
>> 24e: 48 01 f7 add rdi,rsi
>> ...
>>
>> Before generating the following insn,
>> 250: 74 21 je 0x273
>> "padding = 1" enables some checking to ensure nops is either 0 or 4
>> where
>> #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
>> nops = INSN_SZ_DIFF - 2
>>
>> In this specific case,
>> addrs[i] = 0x24e // from pass14
>> addrs[i-1] = 0x24d // from pass15
>> prog - temp = 3 // from 'test rdx,rdx' in pass15
>> so
>> nops = -4
>> and this triggers the failure.
>> Making jit prog convergable can fix the above error.
>>
>> Reported-by: Daniel Hodges <hodgesd@meta.com>
>> Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
>> ---
>> arch/x86/net/bpf_jit_comp.c | 47 ++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 46 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index 074b41fafbe3..ec541aae5d9b 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -64,6 +64,51 @@ static bool is_imm8(int value)
>> return value <= 127 && value >= -128;
>> }
>>
>> +/*
>> + * Let us limit the positive offset to be <= 124.
>> + * This is to ensure eventual jit convergence For the following patterns:
>> + * ...
>> + * pass4, final_proglen=4391:
>> + * ...
>> + * 20e: 48 85 ff test rdi,rdi
>> + * 211: 74 7d je 0x290
>> + * 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> + * ...
>> + * 289: 48 85 ff test rdi,rdi
>> + * 28c: 74 17 je 0x2a5
>> + * 28e: e9 7f ff ff ff jmp 0x212
>> + * 293: bf 03 00 00 00 mov edi,0x3
>> + * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
>> + * and insn at 0x28e is 5-byte jmp insn with offset -129.
>> + *
>> + * pass5, final_proglen=4392:
>> + * ...
>> + * 20e: 48 85 ff test rdi,rdi
>> + * 211: 0f 84 80 00 00 00 je 0x297
>> + * 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
>> + * ...
>> + * 28d: 48 85 ff test rdi,rdi
>> + * 290: 74 1a je 0x2ac
>> + * 292: eb 84 jmp 0x218
>> + * 294: bf 03 00 00 00 mov edi,0x3
>> + * Note that insn at 0x211 is 5-byte cond jump insn now since its offset
>> + * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
>> + * At the same time, insn at 0x292 is a 2-byte insn since its offset is
>> + * -124.
>> + *
>> + * pass6 will repeat the same code as in pass4 and this will prevent
>> + * eventual convergence.
>> + *
>> + * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
>> + * cycle in the above. Let us limit the positive offset for 8bit cond jump
>> + * insn to mamximum 124 (0x7c). This way, the jmp insn will be always 2-bytes,
>> + * and the jit pass can eventually converge.
>> + */
> je<->jmp
>
> It can be je/je too, no?
Yes. It is possible.
>
> so 128 - 4 instead of 128 - 3 ?
You probably mean "127 - 4 instead of 127 - 3" since
the maximum value is 127.
I checked 127 - 4 = 0x7c and indeed we should. See below examples:
20e: 48 85 ff test rdi,rdi
211: XX XX je 0x291
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: XX XX XX XX XX XX je 0x212
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x297 (0x293 - 0x213)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX je 0x217 (0x217 - 0x293)
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX je 0x28f (0x293 - 0x217)
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: XX XX je 0x213 (0x213 - 0x293) // -0x80 allowed
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x28f (0x293 - 0x213)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX je 0x217 (0x217 - 0x293)
293: bf 03 00 00 00 mov edi,0x3
=>
...
Here 0x293 - 0x217 = 0x7c
>
>> +static bool is_imm8_cond_offset(int value)
>> +{
>> + return value <= 124 && value >= -128;
> the other side needs the same treatment, no ?
good question. From my understanding, the non-convergence in the
above needs both forward and backport conditions. The solution we
are using is based on putting a limitation on forward conditions
w.r.t. jit code gen.
Another solution is actually to put a limitation on backward
conditions. For example, let us say the above is_imm8_cond_offset()
has
return value <= 127 && value > -124
See below example:
20e: 48 85 ff test rdi,rdi
211: XX XX je 0x291
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: XX XX XX XX XX XX je 0x212
293: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x297 (0x293 - 0x213)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX XX XX XX XX je 0x21b (0x217 - 0x293)
297: bf 03 00 00 00 mov edi,0x3
=>
20e: 48 85 ff test rdi,rdi
211: XX XX XX XX XX XX je 0x297 (0x297 - 0x217)
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
291: XX XX XX XX XX XX je 0x217 (0x217 - 0x297)
297: bf 03 00 00 00 mov edi,0x3
converged here.
So I think we do not need to limit both sides. One side should be enough.
>
>> +}
>> +
>> static bool is_simm32(s64 value)
>> {
>> return value == (s64)(s32)value;
>> @@ -2231,7 +2276,7 @@ st: if (is_imm8(insn->off))
>> return -EFAULT;
>> }
>> jmp_offset = addrs[i + insn->off] - addrs[i];
>> - if (is_imm8(jmp_offset)) {
>> + if (is_imm8_cond_offset(jmp_offset)) {
>> if (jmp_padding) {
>> /* To keep the jmp_offset valid, the extra bytes are
>> * padded before the jump insn, so we subtract the
>> --
>> 2.43.5
>>
next prev parent reply other threads:[~2024-08-28 22:47 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-25 20:04 [PATCH bpf-next] bpf, x64: Fix a jit convergence issue Yonghong Song
2024-08-27 23:44 ` Andrii Nakryiko
2024-08-28 22:50 ` Yonghong Song
2024-08-29 17:27 ` Andrii Nakryiko
2024-08-30 20:39 ` Yonghong Song
2024-08-28 2:24 ` Alexei Starovoitov
2024-08-28 22:47 ` Yonghong Song [this message]
2024-08-28 23:44 ` Alexei Starovoitov
2024-08-29 1:54 ` Yonghong Song
2024-08-29 16:37 ` Alexei Starovoitov
2024-08-29 16:50 ` Yonghong Song
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7e2ad37e-e750-4cbd-8305-bf16bbebcc53@linux.dev \
--to=yonghong.song@linux.dev \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=hodgesd@meta.com \
--cc=kernel-team@fb.com \
--cc=martin.lau@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox