Message-ID: <7e2ad37e-e750-4cbd-8305-bf16bbebcc53@linux.dev>
Date: Wed, 28 Aug 2024 15:47:21 -0700
X-Mailing-List: bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next] bpf, x64: Fix a jit convergence issue
To: Alexei Starovoitov
Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team, Martin KaFai Lau, Daniel Hodges
References: <20240825200406.1874982-1-yonghong.song@linux.dev>
From: Yonghong Song

On 8/27/24 7:24 PM, Alexei Starovoitov wrote:
> On Sun, Aug 25, 2024 at 1:04 PM Yonghong Song wrote:
>> Daniel Hodges reported a jit error when playing with a sched-ext
>> program. The error message is:
>>     unexpected jmp_cond padding: -4 bytes
>>
>> But further investigation shows the error is actual due to failed
>> convergence. The following are some analysis:
>>
>>     ...
>>     pass4, final_proglen=4391:
>>     ...
>>     20e:    48 85 ff                test   rdi,rdi
>>     211:    74 7d                   je     0x290
>>     213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
>>     ...
>>     289:    48 85 ff                test   rdi,rdi
>>     28c:    74 17                   je     0x2a5
>>     28e:    e9 7f ff ff ff          jmp    0x212
>>     293:    bf 03 00 00 00          mov    edi,0x3
>>
>> Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
>> and insn at 0x28e is 5-byte jmp insn with offset -129.
>>
>>     pass5, final_proglen=4392:
>>     ...
>>     20e:    48 85 ff                test   rdi,rdi
>>     211:    0f 84 80 00 00 00       je     0x297
>>     217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
>>     ...
>>     28d:    48 85 ff                test   rdi,rdi
>>     290:    74 1a                   je     0x2ac
>>     292:    eb 84                   jmp    0x218
>>     294:    bf 03 00 00 00          mov    edi,0x3
>>
>> Note that insn at 0x211 is 5-byte cond jump insn now since its offset
>> becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
>> At the same time, insn at 0x292 is a 2-byte insn since its offset is
>> -124.
>>
>> pass6 will repeat the same code as in pass4. pass7 will repeat the same
>> code as in pass5, and so on. This will prevent eventual convergence.
>>
>> Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
>> insn looks like:
>>
>>     211:    0f 84 80 00 00 00       je     0x297
>>     217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
>>     ...
>>     24d:    48 85 d2                test   rdx,rdx
>>
>> The similar code in pass14:
>>     211:    74 7d                   je     0x290
>>     213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
>>     ...
>>     249:    48 85 d2                test   rdx,rdx
>>     24c:    74 21                   je     0x26f
>>     24e:    48 01 f7                add    rdi,rsi
>>     ...
>>
>> Before generating the following insn,
>>     250:    74 21                   je     0x273
>> "padding = 1" enables some checking to ensure nops is either 0 or 4
>> where
>>     #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
>>     nops = INSN_SZ_DIFF - 2
>>
>> In this specific case,
>>     addrs[i] = 0x24e // from pass14
>>     addrs[i-1] = 0x24d // from pass15
>>     prog - temp = 3 // from 'test rdx,rdx' in pass15
>> so
>>     nops = -4
>> and this triggers the failure.
>> Making jit prog convergable can fix the above error.
>>
>> Reported-by: Daniel Hodges
>> Signed-off-by: Yonghong Song
>> ---
>>  arch/x86/net/bpf_jit_comp.c | 47 ++++++++++++++++++++++++++++++++++++-
>>  1 file changed, 46 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index 074b41fafbe3..ec541aae5d9b 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -64,6 +64,51 @@ static bool is_imm8(int value)
>>  	return value <= 127 && value >= -128;
>>  }
>>
>> +/*
>> + * Let us limit the positive offset to be <= 124.
>> + * This is to ensure eventual jit convergence For the following patterns:
>> + * ...
>> + * pass4, final_proglen=4391:
>> + * ...
>> + *     20e:    48 85 ff                test   rdi,rdi
>> + *     211:    74 7d                   je     0x290
>> + *     213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
>> + *     ...
>> + *     289:    48 85 ff                test   rdi,rdi
>> + *     28c:    74 17                   je     0x2a5
>> + *     28e:    e9 7f ff ff ff          jmp    0x212
>> + *     293:    bf 03 00 00 00          mov    edi,0x3
>> + * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
>> + * and insn at 0x28e is 5-byte jmp insn with offset -129.
>> + *
>> + * pass5, final_proglen=4392:
>> + * ...
>> + *     20e:    48 85 ff                test   rdi,rdi
>> + *     211:    0f 84 80 00 00 00       je     0x297
>> + *     217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
>> + *     ...
>> + *     28d:    48 85 ff                test   rdi,rdi
>> + *     290:    74 1a                   je     0x2ac
>> + *     292:    eb 84                   jmp    0x218
>> + *     294:    bf 03 00 00 00          mov    edi,0x3
>> + * Note that insn at 0x211 is 5-byte cond jump insn now since its offset
>> + * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
>> + * At the same time, insn at 0x292 is a 2-byte insn since its offset is
>> + * -124.
>> + *
>> + * pass6 will repeat the same code as in pass4 and this will prevent
>> + * eventual convergence.
>> + *
>> + * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
>> + * cycle in the above. Let us limit the positive offset for 8bit cond jump
>> + * insn to mamximum 124 (0x7c). This way, the jmp insn will be always 2-bytes,
>> + * and the jit pass can eventually converge.
>> + */
> je<->jmp
>
> It can be je/je too, no?

Yes. It is possible.

>
> so 128 - 4 instead of 128 - 3 ?

You probably mean "127 - 4 instead of 127 - 3", since the maximum value
is 127. I checked, and indeed we should use 127 - 4 = 123 (0x7b).
See the examples below:

    20e:    48 85 ff                test   rdi,rdi
    211:    XX XX                   je     0x291
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    28d:    XX XX XX XX XX XX       je     0x212
    293:    bf 03 00 00 00          mov    edi,0x3
=>
    20e:    48 85 ff                test   rdi,rdi
    211:    XX XX XX XX XX XX       je     0x297 (0x293 - 0x213)
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    291:    XX XX                   je     0x217 (0x217 - 0x293)
    293:    bf 03 00 00 00          mov    edi,0x3
=>
    20e:    48 85 ff                test   rdi,rdi
    211:    XX XX                   je     0x28f (0x293 - 0x217)
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    28d:    XX XX                   je     0x213 (0x213 - 0x293) // -0x80 allowed
    293:    bf 03 00 00 00          mov    edi,0x3
=>
    20e:    48 85 ff                test   rdi,rdi
    211:    XX XX XX XX XX XX       je     0x28f (0x293 - 0x213)
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    291:    XX XX                   je     0x217 (0x217 - 0x293)
    293:    bf 03 00 00 00          mov    edi,0x3
=>
    ...

Here 0x293 - 0x217 = 0x7c (124), which a "<= 124" check still accepts
as an 8-bit offset.

>
>> +static bool is_imm8_cond_offset(int value)
>> +{
>> +	return value <= 124 && value >= -128;
> the other side needs the same treatment, no ?

Good question. From my understanding, the non-convergence above needs
both a forward and a backward conditional jump. The solution we are
using puts a limit on the forward offsets during jit code generation.
Another solution is to put a limit on the backward offsets instead.
For example, let us say the above is_imm8_cond_offset() has
    return value <= 127 && value > -124;

See the example below:

    20e:    48 85 ff                test   rdi,rdi
    211:    XX XX                   je     0x291
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    28d:    XX XX XX XX XX XX       je     0x212
    293:    bf 03 00 00 00          mov    edi,0x3
=>
    20e:    48 85 ff                test   rdi,rdi
    211:    XX XX XX XX XX XX       je     0x297 (0x293 - 0x213)
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    291:    XX XX XX XX XX XX       je     0x21b (0x217 - 0x293)
    297:    bf 03 00 00 00          mov    edi,0x3
=>
    20e:    48 85 ff                test   rdi,rdi
    211:    XX XX XX XX XX XX       je     0x297 (0x297 - 0x217)
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    291:    XX XX XX XX XX XX       je     0x217 (0x217 - 0x297)
    297:    bf 03 00 00 00          mov    edi,0x3

It converged here.

So I think we do not need to limit both sides. One side should be enough.
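
To make the je/jmp case from the commit message easier to replay, here
is a toy model in C. It is only a sketch, not kernel code: it tracks
just the forward je at 0x211 and the backward jmp, with the 0x7b bytes
between them taken from the pass4/pass5 dumps. The names
toy_passes_to_converge() and fits_cond() are made up for illustration.
It assumes, as in the offsets quoted above, that the forward jump uses
addresses from the previous pass while the backward jump sees the
already-updated address of its target.

```c
#include <stdbool.h>

/* Same range check as is_imm8() in the quoted diff context. */
static bool is_imm8(int value)
{
	return value <= 127 && value >= -128;
}

/* Hypothetical helper: 8-bit check with a tunable forward limit. */
static bool fits_cond(int value, int max_fwd)
{
	return value <= max_fwd && value >= -128;
}

/*
 * Toy model (assumption, not the real do_jit()): start from the pass4
 * layout and re-run sizing passes. Returns the number of passes until
 * two consecutive passes produce the same layout, or -1 if that does
 * not happen within max_passes.
 */
int toy_passes_to_converge(int max_fwd, int max_passes)
{
	const int je_start = 0x211;	/* address of the forward je */
	const int body = 0x7b;		/* bytes between end of je and the jmp */
	int je_end = 0x213;		/* pass4: je is 2 bytes */
	int jmp_end = 0x293;		/* pass4: jmp is 5 bytes */

	for (int pass = 1; pass <= max_passes; pass++) {
		/* Forward jump: both addresses come from the previous pass. */
		int fwd = jmp_end - je_end;
		int new_je_end = je_start + (fits_cond(fwd, max_fwd) ? 2 : 6);

		/* Backward jump: target is already updated, own address is stale. */
		int back = new_je_end - jmp_end;
		int new_jmp_end = new_je_end + body + (is_imm8(back) ? 2 : 5);

		if (new_je_end == je_end && new_jmp_end == jmp_end)
			return pass;

		je_end = new_je_end;
		jmp_end = new_jmp_end;
	}
	return -1;
}
```

In this model, toy_passes_to_converge(127, 20) returns -1 (the layout
keeps flipping between the pass4 and pass5 shapes), while
toy_passes_to_converge(124, 20) returns 2. The je/je case discussed
above is not modeled here.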
>
>> +}
>> +
>>  static bool is_simm32(s64 value)
>>  {
>>  	return value == (s64)(s32)value;
>> @@ -2231,7 +2276,7 @@ st:			if (is_imm8(insn->off))
>>  				return -EFAULT;
>>  			}
>>  			jmp_offset = addrs[i + insn->off] - addrs[i];
>> -			if (is_imm8(jmp_offset)) {
>> +			if (is_imm8_cond_offset(jmp_offset)) {
>>  				if (jmp_padding) {
>>  					/* To keep the jmp_offset valid, the extra bytes are
>>  					 * padded before the jump insn, so we subtract the
>> --
>> 2.43.5
>>
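
For reference, the nops arithmetic quoted in the commit message can be
checked with a few lines of C. This is only a sketch of the formula as
written there (INSN_SZ_DIFF - 2); jmp_cond_nops() and its parameter
names are made up for illustration and are not kernel symbols.

```c
/*
 * Hypothetical helper mirroring the quoted formula:
 *   INSN_SZ_DIFF = (addrs[i] - addrs[i - 1]) - (prog - temp)
 *   nops         = INSN_SZ_DIFF - 2
 * addr_i and addr_prev stand for addrs[i] and addrs[i - 1]; emitted
 * stands for (prog - temp), the bytes already emitted for this insn.
 */
int jmp_cond_nops(int addr_i, int addr_prev, int emitted)
{
	int insn_sz_diff = (addr_i - addr_prev) - emitted;

	return insn_sz_diff - 2;
}
```

With the mixed addresses from the failing case (addrs[i] = 0x24e from
pass14, addrs[i-1] = 0x24d from pass15, 3 bytes emitted for the test),
jmp_cond_nops(0x24e, 0x24d, 3) gives -4. With both addresses taken from
pass14 (0x24e and 0x249), it gives 0, one of the two values the check
accepts.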