From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Yonghong Song <yonghong.song@linux.dev>,
Daniel Hodges <hodgesd@meta.com>,
Alexei Starovoitov <ast@kernel.org>,
Sasha Levin <sashal@kernel.org>,
daniel@iogearbox.net, andrii@kernel.org, davem@davemloft.net,
dsahern@kernel.org, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
bpf@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH AUTOSEL 6.6 08/58] bpf, x64: Fix a jit convergence issue
Date: Fri, 4 Oct 2024 14:23:41 -0400 [thread overview]
Message-ID: <20241004182503.3672477-8-sashal@kernel.org> (raw)
In-Reply-To: <20241004182503.3672477-1-sashal@kernel.org>
From: Yonghong Song <yonghong.song@linux.dev>
[ Upstream commit c8831bdbfbab672c006a18006d36932a494b2fd6 ]
Daniel Hodges reported a jit error when playing with a sched-ext program.
The error message is:
unexpected jmp_cond padding: -4 bytes
But further investigation shows the error is actual due to failed
convergence. The following are some analysis:
...
pass4, final_proglen=4391:
...
20e: 48 85 ff test rdi,rdi
211: 74 7d je 0x290
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
289: 48 85 ff test rdi,rdi
28c: 74 17 je 0x2a5
28e: e9 7f ff ff ff jmp 0x212
293: bf 03 00 00 00 mov edi,0x3
Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
and insn at 0x28e is 5-byte jmp insn with offset -129.
pass5, final_proglen=4392:
...
20e: 48 85 ff test rdi,rdi
211: 0f 84 80 00 00 00 je 0x297
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: 48 85 ff test rdi,rdi
290: 74 1a je 0x2ac
292: eb 84 jmp 0x218
294: bf 03 00 00 00 mov edi,0x3
Note that insn at 0x211 is 6-byte cond jump insn now since its offset
becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same
time, insn at 0x292 is a 2-byte insn since its offset is -124.
pass6 will repeat the same code as in pass4. pass7 will repeat the same
code as in pass5, and so on. This will prevent eventual convergence.
Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
insn looks like:
211: 0f 84 80 00 00 00 je 0x297
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
24d: 48 85 d2 test rdx,rdx
The similar code in pass14:
211: 74 7d je 0x290
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
249: 48 85 d2 test rdx,rdx
24c: 74 21 je 0x26f
24e: 48 01 f7 add rdi,rsi
...
Before generating the following insn,
250: 74 21 je 0x273
"padding = 1" enables some checking to ensure nops is either 0 or 4
where
#define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
nops = INSN_SZ_DIFF - 2
In this specific case,
addrs[i] = 0x24e // from pass14
addrs[i-1] = 0x24d // from pass15
prog - temp = 3 // from 'test rdx,rdx' in pass15
so
nops = -4
and this triggers the failure.
To fix the issue, we need to break cycles of je <-> jmp. For example,
in the above case, we have
211: 74 7d je 0x290
the offset is 0x7d. If 2-byte je insn is generated only if
the offset is less than 0x7d (<= 0x7c), the cycle can be
break and we can achieve the convergence.
I did some study on other cases like je <-> je, jmp <-> je and
jmp <-> jmp which may cause cycles. Those cases are not from actual
reproducible cases since it is pretty hard to construct a test case
for them. the results show that the offset <= 0x7b (0x7b = 123) should
be enough to cover all cases. This patch added a new helper to generate 8-bit
cond/uncond jmp insns only if the offset range is [-128, 123].
Reported-by: Daniel Hodges <hodgesd@meta.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/net/bpf_jit_comp.c | 54 +++++++++++++++++++++++++++++++++++--
1 file changed, 52 insertions(+), 2 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 878a4c6dd7565..a50c99e9b5c01 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -58,6 +58,56 @@ static bool is_imm8(int value)
return value <= 127 && value >= -128;
}
+/*
+ * Let us limit the positive offset to be <= 123.
+ * This is to ensure eventual jit convergence For the following patterns:
+ * ...
+ * pass4, final_proglen=4391:
+ * ...
+ * 20e: 48 85 ff test rdi,rdi
+ * 211: 74 7d je 0x290
+ * 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
+ * ...
+ * 289: 48 85 ff test rdi,rdi
+ * 28c: 74 17 je 0x2a5
+ * 28e: e9 7f ff ff ff jmp 0x212
+ * 293: bf 03 00 00 00 mov edi,0x3
+ * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
+ * and insn at 0x28e is 5-byte jmp insn with offset -129.
+ *
+ * pass5, final_proglen=4392:
+ * ...
+ * 20e: 48 85 ff test rdi,rdi
+ * 211: 0f 84 80 00 00 00 je 0x297
+ * 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
+ * ...
+ * 28d: 48 85 ff test rdi,rdi
+ * 290: 74 1a je 0x2ac
+ * 292: eb 84 jmp 0x218
+ * 294: bf 03 00 00 00 mov edi,0x3
+ * Note that insn at 0x211 is 6-byte cond jump insn now since its offset
+ * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
+ * At the same time, insn at 0x292 is a 2-byte insn since its offset is
+ * -124.
+ *
+ * pass6 will repeat the same code as in pass4 and this will prevent
+ * eventual convergence.
+ *
+ * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
+ * cycle in the above. In the above example je offset <= 0x7c should work.
+ *
+ * For other cases, je <-> je needs offset <= 0x7b to avoid no convergence
+ * issue. For jmp <-> je and jmp <-> jmp cases, jmp offset <= 0x7c should
+ * avoid no convergence issue.
+ *
+ * Overall, let us limit the positive offset for 8bit cond/uncond jmp insn
+ * to maximum 123 (0x7b). This way, the jit pass can eventually converge.
+ */
+static bool is_imm8_jmp_offset(int value)
+{
+ return value <= 123 && value >= -128;
+}
+
static bool is_simm32(s64 value)
{
return value == (s64)(s32)value;
@@ -1774,7 +1824,7 @@ st: if (is_imm8(insn->off))
return -EFAULT;
}
jmp_offset = addrs[i + insn->off] - addrs[i];
- if (is_imm8(jmp_offset)) {
+ if (is_imm8_jmp_offset(jmp_offset)) {
if (jmp_padding) {
/* To keep the jmp_offset valid, the extra bytes are
* padded before the jump insn, so we subtract the
@@ -1856,7 +1906,7 @@ st: if (is_imm8(insn->off))
break;
}
emit_jmp:
- if (is_imm8(jmp_offset)) {
+ if (is_imm8_jmp_offset(jmp_offset)) {
if (jmp_padding) {
/* To avoid breaking jmp_offset, the extra bytes
* are padded before the actual jmp insn, so
--
2.43.0
next prev parent reply other threads:[~2024-10-04 18:25 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-04 18:23 [PATCH AUTOSEL 6.6 01/58] selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 02/58] bpf: Check percpu map value size first Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 03/58] bpftool: Fix undefined behavior in qsort(NULL, 0, ...) Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 04/58] bpftool: Fix undefined behavior caused by shifting into the sign bit Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 05/58] s390/boot: Compile all files with the same march flag Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 06/58] s390/facility: Disable compile time optimization for decompressor code Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 07/58] s390/mm: Add cond_resched() to cmm_alloc/free_pages() Sasha Levin
2024-10-04 18:23 ` Sasha Levin [this message]
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 09/58] ext4: fix i_data_sem unlock order in ext4_ind_migrate() Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 10/58] ext4: avoid use-after-free in ext4_ext_show_leaf() Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 11/58] ext4: ext4_search_dir should return a proper error Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 12/58] iomap: fix iomap_dio_zero() for fs bs > system page size Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 13/58] ext4: don't set SB_RDONLY after filesystem errors Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 14/58] ext4: nested locking for xattr inode Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 15/58] s390/cpum_sf: Remove WARN_ON_ONCE statements Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 16/58] s390/traps: Handle early warnings gracefully Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 17/58] bpf: Prevent tail call between progs attached to different hooks Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 18/58] ktest.pl: Avoid false positives with grub2 skip regex Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 19/58] RDMA/mad: Improve handling of timed out WRs of mad agent Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 20/58] soundwire: intel_bus_common: enable interrupts before exiting reset Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 21/58] PCI: Add function 0 DMA alias quirk for Glenfly Arise chip Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 22/58] RDMA/rtrs-srv: Avoid null pointer deref during path establishment Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 23/58] clk: bcm: bcm53573: fix OF node leak in init Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 24/58] PCI: Add ACS quirk for Qualcomm SA8775P Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 25/58] i2c: i801: Use a different adapter-name for IDF adapters Sasha Levin
2024-10-04 18:23 ` [PATCH AUTOSEL 6.6 26/58] PCI: Mark Creative Labs EMU20k2 INTx masking as broken Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 27/58] i3c: master: cdns: Fix use after free vulnerability in cdns_i3c_master Driver Due to Race Condition Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 28/58] RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 29/58] io_uring: check if we need to reschedule during overflow flush Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 30/58] ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 31/58] mfd: intel_soc_pmic_chtwc: Make Lenovo Yoga Tab 3 X90F DMI match less strict Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 32/58] riscv: avoid Imbalance in RAS Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 33/58] RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 34/58] soundwire: cadence: re-check Peripheral status with delayed_work Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 35/58] riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 36/58] media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put() Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 37/58] remoteproc: imx_rproc: Use imx specific hook for find_loaded_rsc_table Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 38/58] clk: imx: Remove CLK_SET_PARENT_GATE for DRAM mux for i.MX7D Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 39/58] fuse: handle idmappings properly in ->write_iter() Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 40/58] serial: protect uart_port_dtr_rts() in uart_shutdown() too Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 41/58] usb: typec: tipd: Free IRQ only if it was requested before Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 42/58] usb: chipidea: udc: enable suspend interrupt after usb reset Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 43/58] usb: dwc2: Adjust the timing of USB Driver Interrupt Registration in the Crashkernel Scenario Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 44/58] comedi: ni_routing: tools: Check when the file could not be opened Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 45/58] LoongArch: Fix memleak in pci_acpi_scan_root() Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 46/58] netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 47/58] netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=n Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 48/58] virtio_pmem: Check device status before requesting flush Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 49/58] tools/iio: Add memory allocation failure check for trigger_name Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 50/58] staging: vme_user: added bound check to geoid Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 51/58] driver core: bus: Fix double free in driver API bus_register() Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 52/58] driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 53/58] scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd() Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 54/58] scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 55/58] drm/amd/display: Check null pointer before dereferencing se Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 56/58] fbcon: Fix a NULL pointer dereference issue in fbcon_putcs Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 57/58] smb: client: fix UAF in async decryption Sasha Levin
2024-10-04 18:24 ` [PATCH AUTOSEL 6.6 58/58] fbdev: sisfb: Fix strbuf array overflow Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241004182503.3672477-8-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bp@alien8.de \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=dave.hansen@linux.intel.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=hodgesd@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox