public inbox for bpf@vger.kernel.org
From: Yonghong Song <yonghong.song@linux.dev>
To: "Lai, Yi" <yi1.lai@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, Daniel Borkmann <daniel@iogearbox.net>,
	Martin KaFai Lau <martin.lau@kernel.org>,
	David Faust <david.faust@oracle.com>,
	"Jose E . Marchesi" <jose.marchesi@oracle.com>,
	kernel-team@fb.com, Eduard Zingerman <eddyz87@gmail.com>,
	yi1.lai@intel.com
Subject: Re: [PATCH bpf-next v5 07/17] bpf: Support new 32bit offset jmp instruction
Date: Thu, 8 May 2025 21:09:25 -0700	[thread overview]
Message-ID: <33a03235-638d-4c63-811d-ec44872654b3@linux.dev> (raw)
In-Reply-To: <763cbfb4-b1a0-4752-8428-749bb12e2103@linux.dev>



On 5/7/25 1:06 PM, Yonghong Song wrote:
>
>
> On 4/15/25 11:58 AM, Lai, Yi wrote:
>> Hi Yonghong Song,
>>
>> Greetings!
>>
>> I used Syzkaller and found that there is a WARNING in
>> __mark_chain_precision in linux-next tag next-20250414.
>
> Thanks, Yi. I will investigate this soon.

I did some investigation. The relevant source code is below:

+__used __naked static void hack_sub(void)
+{
+       asm volatile ("                                 \
+        r2 = 2314885393468386424 ll; \
+        gotol +0; \
+        if r2 <= r10 goto -1; \
+        if r1 >= -1835016 goto +0; \
+        if r2 <= 8 goto +0; \
+        if r3 <= 0 goto +0; \
+        call 44; \
+        exit; \
+       "      :
+       :
+       : __clobber_all);
+}
+
+SEC("cgroup/sock_create")
+__description("HACK")
+__success __retval(0)
+__naked void hack(void)
+{
+       asm volatile ("                                 \
+        r3 = 0 ll; \
+        call hack_sub; \
+        exit; \
+       "      :
+       :
+       : __clobber_all);
+}

The verification failure:

0: R1=ctx() R10=fp0
; asm volatile ("                                 \ @ verifier_movsx.c:352
0: (18) r3 = 0x0                      ; R3_w=0
2: (85) call pc+1
caller:
  R10=fp0
callee:
  frame1: R1=ctx() R3_w=0 R10=fp0
4: frame1: R1=ctx() R3_w=0 R10=fp0
; asm volatile ("                                 \ @ verifier_movsx.c:333
4: (18) r2 = 0x20202000256c6c78       ; frame1: R2_w=0x20202000256c6c78
6: (06) gotol pc+0
7: (bd) if r2 <= r10 goto pc-1        ; frame1: R2_w=0x20202000256c6c78 R10=fp0
8: (35) if r1 >= 0xffe3fff8 goto pc+0         ; frame1: R1=ctx()
9: (b5) if r2 <= 0x8 goto pc+0
mark_precise: frame1: last_idx 9 first_idx 0 subseq_idx -1
mark_precise: frame1: regs=r2 stack= before 8: (35) if r1 >= 0xffe3fff8 goto pc+0
mark_precise: frame1: regs=r2 stack= before 7: (bd) if r2 <= r10 goto pc-1
mark_precise: frame1: regs=r2,r10 stack= before 6: (06) gotol pc+0
mark_precise: frame1: regs=r2,r10 stack= before 4: (18) r2 = 0x20202000256c6c78
mark_precise: frame1: regs=r10 stack= before 2: (85) call pc+1
BUG regs 400
processed 7 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

The verification failure happens below (lines 4301 and 4302):

  4294                                 /* static subprog call instruction, which
  4295                                  * means that we are exiting current subprog,
  4296                                  * so only r1-r5 could be still requested as
  4297                                  * precise, r0 and r6-r10 or any stack slot in
  4298                                  * the current frame should be zero by now
  4299                                  */
  4300                                 if (bt_reg_mask(bt) & ~BPF_REGMASK_ARGS) {
  4301                                         verbose(env, "BUG regs %x\n", bt_reg_mask(bt));
  4302                                         WARN_ONCE(1, "verifier backtracking bug");
  4303                                         return -EFAULT;
  4304                                 }

So the failure reason is that r10 is used in the comparison at insn 7,
which forces the backtracker to mark r10 precise. The verifier does the
right thing by rejecting the program. Maybe you should remove the
WARN_ONCE ("verifier backtracking bug")? Do we actually hit a
backtracking bug due to the verifier implementation?


>
>>
>> After bisection, the first bad commit is:
>> "
>> 4cd58e9af8b9 bpf: Support new 32bit offset jmp instruction
>> "
>>
>> All detailed info can be found at:
>> https://github.com/laifryiee/syzkaller_logs/tree/main/250415_203801___mark_chain_precision 
>>
>> Syzkaller repro code:
>> https://github.com/laifryiee/syzkaller_logs/tree/main/250415_203801___mark_chain_precision/repro.c 
>>
>> Syzkaller repro syscall steps:
>> https://github.com/laifryiee/syzkaller_logs/tree/main/250415_203801___mark_chain_precision/repro.prog 
>>
>> Syzkaller report:
>> https://github.com/laifryiee/syzkaller_logs/tree/main/250415_203801___mark_chain_precision/repro.report 
>>
>> Kconfig(make olddefconfig):
>> https://github.com/laifryiee/syzkaller_logs/tree/main/250415_203801___mark_chain_precision/kconfig_origin 
>>
>> Bisect info:
>> https://github.com/laifryiee/syzkaller_logs/tree/main/250415_203801___mark_chain_precision/bisect_info.log 
>>
>> bzImage:
>> https://github.com/laifryiee/syzkaller_logs/raw/refs/heads/main/250415_203801___mark_chain_precision/bzImage_8ffd015db85fea3e15a77027fda6c02ced4d2444 
>>
>> Issue dmesg:
>> https://github.com/laifryiee/syzkaller_logs/blob/main/250415_203801___mark_chain_precision/8ffd015db85fea3e15a77027fda6c02ced4d2444_dmesg.log 
>>
>>
>> "
>> [   51.167546] ------------[ cut here ]------------
>> [   51.167803] verifier backtracking bug
>> [   51.167867] WARNING: CPU: 1 PID: 672 at kernel/bpf/verifier.c:4302 
>> __mark_chain_precision+0x35d3/0x37b0
>> [   51.168496] Modules linked in:
>> [   51.168684] CPU: 1 UID: 0 PID: 672 Comm: repro Not tainted 
>> 6.15.0-rc2-8ffd015db85f #1 PREEMPT(voluntary)
>> [   51.169127] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
>> BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.o4
>> [   51.169980] RIP: 0010:__mark_chain_precision+0x35d3/0x37b0
>> [   51.170255] Code: 06 31 ff 89 de e8 cd 0b e0 ff 84 db 0f 85 a7 e5 
>> ff ff e8 90 11 e0 ff 48 c7 c7 a0 cb f4 85 c6 05 f
>> [   51.171108] RSP: 0018:ffff8880115ff2d8 EFLAGS: 00010296
>> [   51.171424] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 
>> ffffffff81470f72
>> [   51.171759] RDX: ffff88801f422540 RSI: ffffffff81470f7f RDI: 
>> 0000000000000001
>> [   51.172112] RBP: ffff8880115ff428 R08: 0000000000000001 R09: 
>> ffffed100d8a5941
>> [   51.172443] R10: 0000000000000000 R11: ffff88801f423398 R12: 
>> 0000000000000400
>> [   51.172769] R13: dffffc0000000000 R14: 0000000000000002 R15: 
>> ffff88801f720000
>> [   51.173152] FS:  00007f8a0a0b1600(0000) GS:ffff8880e3684000(0000) 
>> knlGS:0000000000000000
>> [   51.173563] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [   51.173861] CR2: 0000000000402010 CR3: 000000001179a006 CR4: 
>> 0000000000770ef0
>> [   51.174244] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>> 0000000000000000
>> [   51.174614] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 
>> 0000000000000400
>> [   51.174995] PKRU: 55555554
>> [   51.175151] Call Trace:
>> [   51.175302]  <TASK>
>> [   51.175439]  ? __lock_acquire+0x381/0x2260
>> [   51.175675]  ? __pfx___sanitizer_cov_trace_const_cmp4+0x10/0x10
>> [   51.176006]  ? __pfx___mark_chain_precision+0x10/0x10
>> [   51.176326]  ? mark_reg_read+0x1e4/0x340
>> [   51.176558]  ? __check_reg_arg+0x1c8/0x440
>> [   51.176802]  ? kasan_quarantine_put+0xa2/0x200
>> [   51.177068]  check_cond_jmp_op+0x2692/0x65f0
>> [   51.177335]  ? krealloc_noprof+0xe5/0x330
>> [   51.177569]  ? krealloc_noprof+0x190/0x330
>> [   51.177790]  ? __pfx_check_cond_jmp_op+0x10/0x10
>> [   51.178060]  ? push_insn_history+0x1d0/0x6d0
>> [   51.178308]  do_check_common+0x9134/0xd570
>> [   51.178532]  ? ns_capable+0xec/0x130
>> [   51.178748]  ? bpf_base_func_proto+0x7e/0xbe0
>> [   51.179025]  ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
>> [   51.179319]  ? __pfx_do_check_common+0x10/0x10
>> [   51.179540]  ? __pfx_mark_fastcall_pattern_for_call+0x10/0x10
>> [   51.179864]  ? bpf_check+0x89b9/0xd880
>> [   51.180072]  ? kvfree+0x32/0x40
>> [   51.180237]  bpf_check+0x9c27/0xd880
>> [   51.180450]  ? rcu_is_watching+0x19/0xc0
>> [   51.180680]  ? __lock_acquire+0x380/0x2260
>> [   51.180900]  ? __pfx_bpf_check+0x10/0x10
>> [   51.181099]  ? __lock_acquire+0x410/0x2260
>> [   51.181355]  ? __this_cpu_preempt_check+0x21/0x30
>> [   51.181673]  ? seqcount_lockdep_reader_access.constprop.0+0xb4/0xd0
>> [   51.181989]  ? __sanitizer_cov_trace_cmp4+0x1a/0x20
>> [   51.182229]  ? __sanitizer_cov_trace_const_cmp1+0x1e/0x30
>> [   51.182510]  ? bpf_obj_name_cpy+0x152/0x1b0
>> [   51.182765]  bpf_prog_load+0x14d7/0x2600
>> [   51.182970]  ? __pfx_bpf_prog_load+0x10/0x10
>> [   51.183193]  ? __might_fault+0x14a/0x1b0
>> [   51.183435]  ? __this_cpu_preempt_check+0x21/0x30
>> [   51.183670]  ? lock_release+0x14f/0x2c0
>> [   51.183876]  ? __might_fault+0xf1/0x1b0
>> [   51.184074]  __sys_bpf+0x18ac/0x5c10
>> [   51.184279]  ? __pfx___sys_bpf+0x10/0x10
>> [   51.184502]  ? __lock_acquire+0x410/0x2260
>> [   51.184725]  ? __sanitizer_cov_trace_cmp4+0x1a/0x20
>> [   51.184960]  ? ktime_get_coarse_real_ts64+0xb6/0x100
>> [   51.185253]  ? __audit_syscall_entry+0x39c/0x500
>> [   51.185507]  __x64_sys_bpf+0x7d/0xc0
>> [   51.185718]  ? syscall_trace_enter+0x14d/0x280
>> [   51.185945]  x64_sys_call+0x204a/0x2150
>> [   51.186182]  do_syscall_64+0x6d/0x150
>> [   51.186395]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [   51.186654] RIP: 0033:0x7f8a09e3ee5d
>> [   51.186869] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e 
>> fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 8
>> [   51.187767] RSP: 002b:00007fff00100bb8 EFLAGS: 00000246 ORIG_RAX: 
>> 0000000000000141
>> [   51.188152] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 
>> 00007f8a09e3ee5d
>> [   51.188527] RDX: 0000000000000090 RSI: 00000000200009c0 RDI: 
>> 0000000000000005
>> [   51.188895] RBP: 00007fff00100bc0 R08: 0000000000000000 R09: 
>> 0000000000000001
>> [   51.189263] R10: 00000000ffffffff R11: 0000000000000246 R12: 
>> 00007fff00100cd8
>> [   51.189657] R13: 0000000000401146 R14: 0000000000403e08 R15: 
>> 00007f8a0a0fa000
>> [   51.190071]  </TASK>
>> [   51.190197] irq event stamp: 3113
>> [   51.190380] hardirqs last  enabled at (3121): [<ffffffff8165d8c5>] 
>> __up_console_sem+0x95/0xb0
>> [   51.190797] hardirqs last disabled at (3128): [<ffffffff8165d8aa>] 
>> __up_console_sem+0x7a/0xb0
>> [   51.191214] softirqs last  enabled at (2600): [<ffffffff8149050e>] 
>> __irq_exit_rcu+0x10e/0x170
>> [   51.191656] softirqs last disabled at (2589): [<ffffffff8149050e>] 
>> __irq_exit_rcu+0x10e/0x170
>> [   51.192093] ---[ end trace 0000000000000000 ]---
>> "
>>
>> Hope this could be insightful to you.
>>
>> Regards,
>> Yi Lai
>>
>> ---
>>
>> If you don't need the following environment to reproduce the problem,
>> or if you already have a reproduction environment, please ignore the
>> following information.
>>
>> How to reproduce:
>> git clone https://gitlab.com/xupengfe/repro_vm_env.git
>> cd repro_vm_env
>> tar -xvf repro_vm_env.tar.gz
>> cd repro_vm_env; ./start3.sh  // it needs qemu-system-x86_64 and I 
>> used v7.1.0
>>    // start3.sh will load 
>> bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
>>    // You could change the bzImage_xxx as you want
>>    // Maybe you need to remove line "-drive 
>> if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different 
>> qemu version
>> You could use below command to log in, there is no password for root.
>> ssh -p 10023 root@localhost
>>
>> After logging in to the VM (virtual machine) successfully, you can
>> transfer the reproducer binary to the VM as below, and reproduce the
>> problem in the VM:
>> gcc -pthread -o repro repro.c
>> scp -P 10023 repro root@localhost:/root/
>>
>> Get the bzImage for target kernel:
>> Please use target kconfig and copy it to kernel_src/.config
>> make olddefconfig
>> make -jx bzImage           // x should be equal to or less than the
>> number of CPUs your PC has
>>
>> Fill the bzImage file into above start3.sh to load the target kernel 
>> in vm.
>>
>>
>> Tips:
>> If you already have qemu-system-x86_64, please ignore below info.
>> If you want to install qemu v7.1.0 version:
>> git clone https://github.com/qemu/qemu.git
>> cd qemu
>> git checkout -f v7.1.0
>> mkdir build
>> cd build
>> yum install -y ninja-build.x86_64
>> yum -y install libslirp-devel.x86_64
>> ../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc 
>> --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
>> make
>> make install
>>
> [...]
>


Thread overview: 32+ messages
2023-07-28  1:11 [PATCH bpf-next v5 00/17] bpf: Support new insns from cpu v4 Yonghong Song
2023-07-28  1:11 ` [PATCH bpf-next v5 01/17] bpf: Support new sign-extension load insns Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 02/17] bpf: Support new sign-extension mov insns Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 03/17] bpf: Handle sign-extenstin ctx member accesses Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 04/17] bpf: Support new unconditional bswap instruction Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 05/17] bpf: Support new signed div/mod instructions Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 06/17] bpf: Fix jit blinding with new sdiv/smov insns Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 07/17] bpf: Support new 32bit offset jmp instruction Yonghong Song
2025-04-16  3:58   ` Lai, Yi
2025-05-08  5:06     ` Yonghong Song
2025-05-09  4:09       ` Yonghong Song [this message]
2025-05-09 17:21         ` Alexei Starovoitov
2025-05-09 20:50           ` Eduard Zingerman
2025-05-09 21:36             ` Andrii Nakryiko
2025-05-10  0:01               ` Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 09/17] selftests/bpf: Fix a test_verifier failure Yonghong Song
2023-07-28  1:12 ` [PATCH bpf-next v5 10/17] selftests/bpf: Add a cpuv4 test runner for cpu=v4 testing Yonghong Song
2023-07-28  2:18   ` Alexei Starovoitov
2023-07-28  4:49     ` Yonghong Song
2023-07-28  1:13 ` [PATCH bpf-next v5 11/17] selftests/bpf: Add unit tests for new sign-extension load insns Yonghong Song
2023-07-28  1:13 ` [PATCH bpf-next v5 12/17] selftests/bpf: Add unit tests for new sign-extension mov insns Yonghong Song
2023-07-28  1:13 ` [PATCH bpf-next v5 13/17] selftests/bpf: Add unit tests for new bswap insns Yonghong Song
2023-07-28  1:13 ` [PATCH bpf-next v5 14/17] selftests/bpf: Add unit tests for new sdiv/smod insns Yonghong Song
2023-07-28  1:13 ` [PATCH bpf-next v5 15/17] selftests/bpf: Add unit tests for new gotol insn Yonghong Song
2023-07-28  1:13 ` [PATCH bpf-next v5 16/17] selftests/bpf: Test ldsx with more complex cases Yonghong Song
2023-07-28  1:13 ` [PATCH bpf-next v5 17/17] docs/bpf: Add documentation for new instructions Yonghong Song
2023-07-28  1:13   ` [Bpf] " Yonghong Song
2023-07-28 13:25   ` David Vernet
2023-07-28 13:25     ` [Bpf] " David Vernet
2023-07-28 16:18     ` Yonghong Song
2023-07-28 16:18       ` [Bpf] " Yonghong Song
2023-07-28  2:20 ` [PATCH bpf-next v5 00/17] bpf: Support new insns from cpu v4 patchwork-bot+netdevbpf
