All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Yonghong Song <yonghong.song@linux.dev>,
	Daniel Hodges <hodgesd@meta.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Sasha Levin <sashal@kernel.org>,
	davem@davemloft.net, dsahern@kernel.org, daniel@iogearbox.net,
	andrii@kernel.org, tglx@linutronix.de, mingo@redhat.com,
	bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: [PATCH AUTOSEL 6.11 10/76] bpf, x64: Fix a jit convergence issue
Date: Fri,  4 Oct 2024 14:16:27 -0400	[thread overview]
Message-ID: <20241004181828.3669209-10-sashal@kernel.org> (raw)
In-Reply-To: <20241004181828.3669209-1-sashal@kernel.org>

From: Yonghong Song <yonghong.song@linux.dev>

[ Upstream commit c8831bdbfbab672c006a18006d36932a494b2fd6 ]

Daniel Hodges reported a jit error when playing with a sched-ext program.
The error message is:
  unexpected jmp_cond padding: -4 bytes

But further investigation shows the error is actual due to failed
convergence. The following are some analysis:

  ...
  pass4, final_proglen=4391:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    289:    48 85 ff                test   rdi,rdi
    28c:    74 17                   je     0x2a5
    28e:    e9 7f ff ff ff          jmp    0x212
    293:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
and insn at 0x28e is 5-byte jmp insn with offset -129.

  pass5, final_proglen=4392:
    ...
    20e:    48 85 ff                test   rdi,rdi
    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    28d:    48 85 ff                test   rdi,rdi
    290:    74 1a                   je     0x2ac
    292:    eb 84                   jmp    0x218
    294:    bf 03 00 00 00          mov    edi,0x3

Note that insn at 0x211 is 6-byte cond jump insn now since its offset
becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same
time, insn at 0x292 is a 2-byte insn since its offset is -124.

pass6 will repeat the same code as in pass4. pass7 will repeat the same
code as in pass5, and so on. This will prevent eventual convergence.

Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
insn looks like:

    211:    0f 84 80 00 00 00       je     0x297
    217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    24d:    48 85 d2                test   rdx,rdx

The similar code in pass14:
    211:    74 7d                   je     0x290
    213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
    ...
    249:    48 85 d2                test   rdx,rdx
    24c:    74 21                   je     0x26f
    24e:    48 01 f7                add    rdi,rsi
    ...

Before generating the following insn,
  250:    74 21                   je     0x273
"padding = 1" enables some checking to ensure nops is either 0 or 4
where
  #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
  nops = INSN_SZ_DIFF - 2

In this specific case,
  addrs[i] = 0x24e // from pass14
  addrs[i-1] = 0x24d // from pass15
  prog - temp = 3 // from 'test rdx,rdx' in pass15
so
  nops = -4
and this triggers the failure.

To fix the issue, we need to break cycles of je <-> jmp. For example,
in the above case, we have
  211:    74 7d                   je     0x290
the offset is 0x7d. If 2-byte je insn is generated only if
the offset is less than 0x7d (<= 0x7c), the cycle can be
break and we can achieve the convergence.

I did some study on other cases like je <-> je, jmp <-> je and
jmp <-> jmp which may cause cycles. Those cases are not from actual
reproducible cases since it is pretty hard to construct a test case
for them. the results show that the offset <= 0x7b (0x7b = 123) should
be enough to cover all cases. This patch added a new helper to generate 8-bit
cond/uncond jmp insns only if the offset range is [-128, 123].

Reported-by: Daniel Hodges <hodgesd@meta.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240904221251.37109-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/net/bpf_jit_comp.c | 54 +++++++++++++++++++++++++++++++++++--
 1 file changed, 52 insertions(+), 2 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index d25d81c8ecc00..f609dc379be75 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -64,6 +64,56 @@ static bool is_imm8(int value)
 	return value <= 127 && value >= -128;
 }
 
+/*
+ * Let us limit the positive offset to be <= 123.
+ * This is to ensure eventual jit convergence For the following patterns:
+ * ...
+ * pass4, final_proglen=4391:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    74 7d                   je     0x290
+ *   213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   289:    48 85 ff                test   rdi,rdi
+ *   28c:    74 17                   je     0x2a5
+ *   28e:    e9 7f ff ff ff          jmp    0x212
+ *   293:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125)
+ * and insn at 0x28e is 5-byte jmp insn with offset -129.
+ *
+ * pass5, final_proglen=4392:
+ *   ...
+ *   20e:    48 85 ff                test   rdi,rdi
+ *   211:    0f 84 80 00 00 00       je     0x297
+ *   217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
+ *   ...
+ *   28d:    48 85 ff                test   rdi,rdi
+ *   290:    74 1a                   je     0x2ac
+ *   292:    eb 84                   jmp    0x218
+ *   294:    bf 03 00 00 00          mov    edi,0x3
+ * Note that insn at 0x211 is 6-byte cond jump insn now since its offset
+ * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80).
+ * At the same time, insn at 0x292 is a 2-byte insn since its offset is
+ * -124.
+ *
+ * pass6 will repeat the same code as in pass4 and this will prevent
+ * eventual convergence.
+ *
+ * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes)
+ * cycle in the above. In the above example je offset <= 0x7c should work.
+ *
+ * For other cases, je <-> je needs offset <= 0x7b to avoid no convergence
+ * issue. For jmp <-> je and jmp <-> jmp cases, jmp offset <= 0x7c should
+ * avoid no convergence issue.
+ *
+ * Overall, let us limit the positive offset for 8bit cond/uncond jmp insn
+ * to maximum 123 (0x7b). This way, the jit pass can eventually converge.
+ */
+static bool is_imm8_jmp_offset(int value)
+{
+	return value <= 123 && value >= -128;
+}
+
 static bool is_simm32(s64 value)
 {
 	return value == (s64)(s32)value;
@@ -2184,7 +2234,7 @@ st:			if (is_imm8(insn->off))
 				return -EFAULT;
 			}
 			jmp_offset = addrs[i + insn->off] - addrs[i];
-			if (is_imm8(jmp_offset)) {
+			if (is_imm8_jmp_offset(jmp_offset)) {
 				if (jmp_padding) {
 					/* To keep the jmp_offset valid, the extra bytes are
 					 * padded before the jump insn, so we subtract the
@@ -2266,7 +2316,7 @@ st:			if (is_imm8(insn->off))
 				break;
 			}
 emit_jmp:
-			if (is_imm8(jmp_offset)) {
+			if (is_imm8_jmp_offset(jmp_offset)) {
 				if (jmp_padding) {
 					/* To avoid breaking jmp_offset, the extra bytes
 					 * are padded before the actual jmp insn, so
-- 
2.43.0


  parent reply	other threads:[~2024-10-04 18:18 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-04 18:16 [PATCH AUTOSEL 6.11 01/76] bpf: Call the missed btf_record_free() when map creation fails Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 02/76] selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 03/76] bpf: Fix a sdiv overflow issue Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 04/76] bpf: Check percpu map value size first Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 05/76] bpftool: Fix undefined behavior in qsort(NULL, 0, ...) Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 06/76] bpftool: Fix undefined behavior caused by shifting into the sign bit Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 07/76] s390/boot: Compile all files with the same march flag Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 08/76] s390/facility: Disable compile time optimization for decompressor code Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 09/76] s390/mm: Add cond_resched() to cmm_alloc/free_pages() Sasha Levin
2024-10-04 18:16 ` Sasha Levin [this message]
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 11/76] ext4: fix i_data_sem unlock order in ext4_ind_migrate() Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 12/76] ext4: avoid use-after-free in ext4_ext_show_leaf() Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 13/76] ext4: ext4_search_dir should return a proper error Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 14/76] iomap: fix iomap_dio_zero() for fs bs > system page size Sasha Levin
2024-10-04 22:46   ` Dave Chinner
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 15/76] ext4: don't set SB_RDONLY after filesystem errors Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 16/76] ext4: nested locking for xattr inode Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 17/76] s390/cpum_sf: Remove WARN_ON_ONCE statements Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 18/76] ext4: filesystems without casefold feature cannot be mounted with siphash Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 19/76] s390/traps: Handle early warnings gracefully Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 20/76] bpf: Prevent tail call between progs attached to different hooks Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 21/76] ktest.pl: Avoid false positives with grub2 skip regex Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 22/76] RDMA/mad: Improve handling of timed out WRs of mad agent Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 23/76] soundwire: intel_bus_common: enable interrupts before exiting reset Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 24/76] PCI: Add function 0 DMA alias quirk for Glenfly Arise chip Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 25/76] RDMA/rtrs-srv: Avoid null pointer deref during path establishment Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 26/76] clk: bcm: bcm53573: fix OF node leak in init Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 27/76] PCI: Add ACS quirk for Qualcomm SA8775P Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 28/76] i2c: i801: Use a different adapter-name for IDF adapters Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 29/76] PCI: Mark Creative Labs EMU20k2 INTx masking as broken Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 30/76] i3c: master: cdns: Fix use after free vulnerability in cdns_i3c_master Driver Due to Race Condition Sasha Levin
2024-10-04 18:16   ` Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 31/76] RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t Sasha Levin
2024-10-04 18:16   ` Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 32/76] io_uring: check if we need to reschedule during overflow flush Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 33/76] ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 34/76] mfd: intel_soc_pmic_chtwc: Make Lenovo Yoga Tab 3 X90F DMI match less strict Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 35/76] mfd: intel-lpss: Add Intel Arrow Lake-H LPSS PCI IDs Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 36/76] mfd: intel-lpss: Add Intel Panther Lake " Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 37/76] riscv: Omit optimized string routines when using KASAN Sasha Levin
2024-10-04 18:16   ` Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 38/76] riscv: avoid Imbalance in RAS Sasha Levin
2024-10-04 18:16   ` Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 39/76] RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 40/76] PCI: qcom: Disable mirroring of DBI and iATU register space in BAR region Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 41/76] PCI: endpoint: Assign PCI domain number for endpoint controllers Sasha Levin
2024-10-04 18:16 ` [PATCH AUTOSEL 6.11 42/76] soundwire: cadence: re-check Peripheral status with delayed_work Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 43/76] riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown Sasha Levin
2024-10-04 18:17   ` Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 44/76] media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 45/76] remoteproc: imx_rproc: Use imx specific hook for find_loaded_rsc_table Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 46/76] clk: imx: Remove CLK_SET_PARENT_GATE for DRAM mux for i.MX7D Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 47/76] fs: nfs: fix missing refcnt by replacing folio_set_private by folio_attach_private Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 48/76] fuse: allow O_PATH fd for FUSE_DEV_IOC_BACKING_OPEN Sasha Levin
2024-10-07 10:15   ` Miklos Szeredi
2024-10-11 13:51     ` Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 49/76] fuse: handle idmappings properly in ->write_iter() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 50/76] serial: protect uart_port_dtr_rts() in uart_shutdown() too Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 51/76] usb: typec: tipd: Free IRQ only if it was requested before Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 52/76] usb: typec: ucsi: Don't truncate the reads Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 53/76] usb: chipidea: udc: enable suspend interrupt after usb reset Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 54/76] usb: dwc2: Adjust the timing of USB Driver Interrupt Registration in the Crashkernel Scenario Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 55/76] xhci: dbc: Fix STALL transfer event handling Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 56/76] usb: host: xhci-plat: Parse xhci-missing_cas_quirk and apply quirk Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 57/76] comedi: ni_routing: tools: Check when the file could not be opened Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 58/76] LoongArch: Fix memleak in pci_acpi_scan_root() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 59/76] netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 60/76] netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=n Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 61/76] virtio_pmem: Check device status before requesting flush Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 62/76] tools/iio: Add memory allocation failure check for trigger_name Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 63/76] staging: vme_user: added bound check to geoid Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 64/76] usb: gadget: uvc: Fix ERR_PTR dereference in uvc_v4l2.c Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 65/76] dm vdo: don't refer to dedupe_context after releasing it Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 66/76] driver core: bus: Fix double free in driver API bus_register() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 67/76] driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 68/76] scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd() Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 69/76] scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 70/76] scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 71/76] drm/xe/oa: Fix overflow in oa batch buffer Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 72/76] drm/amdgpu: nuke the VM PD/PT shadow handling Sasha Levin
2024-10-08  6:46   ` Christian König
2024-10-08 13:04     ` Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 73/76] drm/amd/display: Check null pointer before dereferencing se Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 74/76] fbcon: Fix a NULL pointer dereference issue in fbcon_putcs Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 75/76] smb: client: fix UAF in async decryption Sasha Levin
2024-10-04 18:17 ` [PATCH AUTOSEL 6.11 76/76] fbdev: sisfb: Fix strbuf array overflow Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241004181828.3669209-10-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bp@alien8.de \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=dsahern@kernel.org \
    --cc=hodgesd@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.