From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Yonghong Song <yonghong.song@linux.dev>,
Daniel Hodges <hodgesd@meta.com>,
Alexei Starovoitov <ast@kernel.org>,
Sasha Levin <sashal@kernel.org>,
davem@davemloft.net, dsahern@kernel.org, daniel@iogearbox.net,
andrii@kernel.org, tglx@linutronix.de, mingo@redhat.com,
bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 6.11 10/76] bpf, x64: Fix a jit convergence issue
Date: Fri, 4 Oct 2024 14:16:27 -0400 [thread overview]
Message-ID: <20241004181828.3669209-10-sashal@kernel.org> (raw)
In-Reply-To: <20241004181828.3669209-1-sashal@kernel.org>
From: Yonghong Song <yonghong.song@linux.dev>
[ Upstream commit c8831bdbfbab672c006a18006d36932a494b2fd6 ]
Daniel Hodges reported a jit error when playing with a sched-ext program.
The error message is:
unexpected jmp_cond padding: -4 bytes
But further investigation shows the error is actual due to failed
convergence. The following are some analysis:
...
pass4, final_proglen=4391:
...
20e: 48 85 ff test rdi,rdi
211: 74 7d je 0x290
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
289: 48 85 ff test rdi,rdi
28c: 74 17 je 0x2a5
28e: e9 7f ff ff ff jmp 0x212
293: bf 03 00 00 00 mov edi,0x3
Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
and insn at 0x28e is 5-byte jmp insn with offset -129.
pass5, final_proglen=4392:
...
20e: 48 85 ff test rdi,rdi
211: 0f 84 80 00 00 00 je 0x297
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
28d: 48 85 ff test rdi,rdi
290: 74 1a je 0x2ac
292: eb 84 jmp 0x218
294: bf 03 00 00 00 mov edi,0x3
Note that insn at 0x211 is 6-byte cond jump insn now since its offset
becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same
time, insn at 0x292 is a 2-byte insn since its offset is -124.
pass6 will repeat the same code as in pass4. pass7 will repeat the same
code as in pass5, and so on. This will prevent eventual convergence.
Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
insn looks like:
211: 0f 84 80 00 00 00 je 0x297
217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
24d: 48 85 d2 test rdx,rdx
The similar code in pass14:
211: 74 7d je 0x290
213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
...
249: 48 85 d2 test rdx,rdx
24c: 74 21 je 0x26f
24e: 48 01 f7 add rdi,rsi
...
Before generating the following insn,
250: 74 21 je 0x273
"padding = 1" enables some checking to ensure nops is either 0 or 4
where
#define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
nops = INSN_SZ_DIFF - 2
In this specific case,
addrs[i] = 0x24e // from pass14
addrs[i-1] = 0x24d // from pass15
prog - temp = 3 // from 'test rdx,rdx' in pass15
so
nops = -4
and this triggers the failure.
To fix the issue, we need to break cycles of je <-> jmp. For example,
in the above case, we have
211: 74 7d je 0x290
the offset is 0x7d. If 2-byte je insn is generated only if
the offset is less than 0x7d (<= 0x7c), the cycle can be
break and we can achieve the convergence.
I did some study on other cases like je <-> je, jmp <-> je and
jmp <-> jmp which may cause cycles. Those cases are not from actual
reproducible cases since it is pretty hard to construct a test case
for them. the results show that the offset <= 0x7b (0x7b = 123) should
be enough to cover all cases. This patch added a new helper to generate 8-bit
cond/uncond jmp insns only if the offset range is [-128, 123].
Reported-by: Daniel Hodges <hodgesd@meta.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/net/bpf_jit_comp.c | 54 +++++++++++++++++++++++++++++++++++--
1 file changed, 52 insertions(+), 2 deletions(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index d25d81c8ecc00..f609dc379be75 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -64,6 +64,56 @@ static bool is_imm8(int value)
return value <= 127 && value >= -128;
}
+/*
+ * Let us limit the positive offset to be <= 123.
+ * This is to ensure eventual jit convergence For the following patterns:
+ * ...
+ * pass4, final_proglen=4391:
+ * ...
+ * 20e: 48 85 ff test rdi,rdi
+ * 211: 74 7d je 0x290
+ * 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
+ * ...
+ * 289: 48 85 ff test rdi,rdi
+ * 28c: 74 17 je 0x2a5
+ * 28e: e9 7f ff ff ff jmp 0x212
+ * 293: bf 03 00 00 00 mov edi,0x3
+ * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
+ * and insn at 0x28e is 5-byte jmp insn with offset -129.
+ *
+ * pass5, final_proglen=4392:
+ * ...
+ * 20e: 48 85 ff test rdi,rdi
+ * 211: 0f 84 80 00 00 00 je 0x297
+ * 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0]
+ * ...
+ * 28d: 48 85 ff test rdi,rdi
+ * 290: 74 1a je 0x2ac
+ * 292: eb 84 jmp 0x218
+ * 294: bf 03 00 00 00 mov edi,0x3
+ * Note that insn at 0x211 is 6-byte cond jump insn now since its offset
+ * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
+ * At the same time, insn at 0x292 is a 2-byte insn since its offset is
+ * -124.
+ *
+ * pass6 will repeat the same code as in pass4 and this will prevent
+ * eventual convergence.
+ *
+ * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
+ * cycle in the above. In the above example je offset <= 0x7c should work.
+ *
+ * For other cases, je <-> je needs offset <= 0x7b to avoid no convergence
+ * issue. For jmp <-> je and jmp <-> jmp cases, jmp offset <= 0x7c should
+ * avoid no convergence issue.
+ *
+ * Overall, let us limit the positive offset for 8bit cond/uncond jmp insn
+ * to maximum 123 (0x7b). This way, the jit pass can eventually converge.
+ */
+static bool is_imm8_jmp_offset(int value)
+{
+ return value <= 123 && value >= -128;
+}
+
static bool is_simm32(s64 value)
{
return value == (s64)(s32)value;
@@ -2184,7 +2234,7 @@ st: if (is_imm8(insn->off))
return -EFAULT;
}
jmp_offset = addrs[i + insn->off] - addrs[i];
- if (is_imm8(jmp_offset)) {
+ if (is_imm8_jmp_offset(jmp_offset)) {
if (jmp_padding) {
/* To keep the jmp_offset valid, the extra bytes are
* padded before the jump insn, so we subtract the
@@ -2266,7 +2316,7 @@ st: if (is_imm8(insn->off))
break;
}
emit_jmp:
- if (is_imm8(jmp_offset)) {
+ if (is_imm8_jmp_offset(jmp_offset)) {
if (jmp_padding) {
/* To avoid breaking jmp_offset, the extra bytes
* are padded before the actual jmp insn, so
--
2.43.0
next prev parent reply other threads:[~2024-10-04 18:18 UTC|newest]
Thread overview: 81+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-04 18:16 [PATCH AUTOSEL 6.11 01/76] bpf: Call the missed btf_record_free() when map creation fails Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 02/76] selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 03/76] bpf: Fix a sdiv overflow issue Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 04/76] bpf: Check percpu map value size first Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 05/76] bpftool: Fix undefined behavior in qsort(NULL, 0, ...) Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 06/76] bpftool: Fix undefined behavior caused by shifting into the sign bit Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 07/76] s390/boot: Compile all files with the same march flag Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 08/76] s390/facility: Disable compile time optimization for decompressor code Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 09/76] s390/mm: Add cond_resched() to cmm_alloc/free_pages() Sasha Levin
2024-10-04 18:16 ` Sasha Levin [this message]
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 11/76] ext4: fix i_data_sem unlock order in ext4_ind_migrate() Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 12/76] ext4: avoid use-after-free in ext4_ext_show_leaf() Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 13/76] ext4: ext4_search_dir should return a proper error Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 14/76] iomap: fix iomap_dio_zero() for fs bs > system page size Sasha Levin
2024-10-04 22:46 ` Dave Chinner
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 15/76] ext4: don't set SB_RDONLY after filesystem errors Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 16/76] ext4: nested locking for xattr inode Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 17/76] s390/cpum_sf: Remove WARN_ON_ONCE statements Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 18/76] ext4: filesystems without casefold feature cannot be mounted with siphash Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 19/76] s390/traps: Handle early warnings gracefully Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 20/76] bpf: Prevent tail call between progs attached to different hooks Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 21/76] ktest.pl: Avoid false positives with grub2 skip regex Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 22/76] RDMA/mad: Improve handling of timed out WRs of mad agent Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 23/76] soundwire: intel_bus_common: enable interrupts before exiting reset Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 24/76] PCI: Add function 0 DMA alias quirk for Glenfly Arise chip Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 25/76] RDMA/rtrs-srv: Avoid null pointer deref during path establishment Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 26/76] clk: bcm: bcm53573: fix OF node leak in init Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 27/76] PCI: Add ACS quirk for Qualcomm SA8775P Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 28/76] i2c: i801: Use a different adapter-name for IDF adapters Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 29/76] PCI: Mark Creative Labs EMU20k2 INTx masking as broken Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 30/76] i3c: master: cdns: Fix use after free vulnerability in cdns_i3c_master Driver Due to Race Condition Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 31/76] RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 32/76] io_uring: check if we need to reschedule during overflow flush Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 33/76] ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 34/76] mfd: intel_soc_pmic_chtwc: Make Lenovo Yoga Tab 3 X90F DMI match less strict Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 35/76] mfd: intel-lpss: Add Intel Arrow Lake-H LPSS PCI IDs Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 36/76] mfd: intel-lpss: Add Intel Panther Lake " Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 37/76] riscv: Omit optimized string routines when using KASAN Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 38/76] riscv: avoid Imbalance in RAS Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 39/76] RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 40/76] PCI: qcom: Disable mirroring of DBI and iATU register space in BAR region Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 41/76] PCI: endpoint: Assign PCI domain number for endpoint controllers Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 42/76] soundwire: cadence: re-check Peripheral status with delayed_work Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 43/76] riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 44/76] media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 45/76] remoteproc: imx_rproc: Use imx specific hook for find_loaded_rsc_table Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 46/76] clk: imx: Remove CLK_SET_PARENT_GATE for DRAM mux for i.MX7D Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 47/76] fs: nfs: fix missing refcnt by replacing folio_set_private by folio_attach_private Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 48/76] fuse: allow O_PATH fd for FUSE_DEV_IOC_BACKING_OPEN Sasha Levin
2024-10-07 10:15 ` Miklos Szeredi
2024-10-11 13:51 ` Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 49/76] fuse: handle idmappings properly in ->write_iter() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 50/76] serial: protect uart_port_dtr_rts() in uart_shutdown() too Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 51/76] usb: typec: tipd: Free IRQ only if it was requested before Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 52/76] usb: typec: ucsi: Don't truncate the reads Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 53/76] usb: chipidea: udc: enable suspend interrupt after usb reset Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 54/76] usb: dwc2: Adjust the timing of USB Driver Interrupt Registration in the Crashkernel Scenario Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 55/76] xhci: dbc: Fix STALL transfer event handling Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 56/76] usb: host: xhci-plat: Parse xhci-missing_cas_quirk and apply quirk Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 57/76] comedi: ni_routing: tools: Check when the file could not be opened Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 58/76] LoongArch: Fix memleak in pci_acpi_scan_root() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 59/76] netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 60/76] netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=n Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 61/76] virtio_pmem: Check device status before requesting flush Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 62/76] tools/iio: Add memory allocation failure check for trigger_name Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 63/76] staging: vme_user: added bound check to geoid Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 64/76] usb: gadget: uvc: Fix ERR_PTR dereference in uvc_v4l2.c Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 65/76] dm vdo: don't refer to dedupe_context after releasing it Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 66/76] driver core: bus: Fix double free in driver API bus_register() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 67/76] driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 68/76] scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 69/76] scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 70/76] scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 71/76] drm/xe/oa: Fix overflow in oa batch buffer Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 72/76] drm/amdgpu: nuke the VM PD/PT shadow handling Sasha Levin
2024-10-08 6:46 ` Christian König
2024-10-08 13:04 ` Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 73/76] drm/amd/display: Check null pointer before dereferencing se Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 74/76] fbcon: Fix a NULL pointer dereference issue in fbcon_putcs Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 75/76] smb: client: fix UAF in async decryption Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 76/76] fbdev: sisfb: Fix strbuf array overflow Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241004181828.3669209-10-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bp@alien8.de \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=dave.hansen@linux.intel.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=hodgesd@meta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).