netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [REPORT] BPF: Reproducible triggering of BUG() from userspace PoC
@ 2023-11-08 15:46 Lee Jones
  2023-11-09 21:39 ` Jiri Olsa
  0 siblings, 1 reply; 3+ messages in thread
From: Lee Jones @ 2023-11-08 15:46 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: x86, bpf, linux-kernel, netdev

Good afternoon,

After coming across a recent Syzkaller report [0] I thought I'd take
some time to firstly reproduce the issue, then see if there was a
trivial way to mitigate it.  The report suggests that a BUG() in
prog_array_map_poke_run() [1] can be trivially and reliably triggered
from userspace using the PoC provided [2].

        ret = bpf_arch_text_poke(poke->tailcall_bypass,
                                 BPF_MOD_JUMP,
                                 old_bypass_addr,
                                 poke->bypass_addr);
        BUG_ON(ret < 0 && ret != -EINVAL);

Indeed the PoC does seem to be able to consistently trigger the BUG(),
not only on the reported kernel (v6.1), but also on linux-next.  I went
to the trouble of checking LORE, but failed to find any patches which
may be attempting to fix this.

    kernel BUG at kernel/bpf/arraymap.c:1094!
    invalid opcode: 0000 [#1] PREEMPT SMP KASAN
    CPU: 5 PID: 45 Comm: kworker/5:0 Not tainted 6.6.0-rc3-next-20230929-dirty #74
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
    Workqueue: events prog_array_map_clear_deferred
    RIP: 0010:prog_array_map_poke_run+0x6b4/0x6d0
    Code: ff 0f 0b e8 1e 27 e1 ff 48 c7 c7 60 80 93 85 48 c7 c6 00 7f 93 85 48 c7 c2 bb c2 39 86 b9 45 04 00 00 45 89 f8 e8 9c 890
    RSP: 0018:ffffc9000036fb50 EFLAGS: 00010246
    RAX: 0000000000000044 RBX: ffff88811f337490 RCX: 63af48a1314f9900
    RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
    RBP: ffffc9000036fbe8 R08: ffffffff815c23c5 R09: 1ffff11084c14eba
    R10: dfffe91084c14ebc R11: ffffed1084c14ebb R12: ffff888116517800
    R13: dffffc0000000000 R14: ffff888125a1a400 R15: 00000000fffffff0
    FS:  0000000000000000(0000) GS:ffff888426080000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000004ab678 CR3: 0000000122ac4000 CR4: 0000000000350eb0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     <TASK>
     ? __die_body+0x92/0xf0
     ? die+0xa2/0xe0
     ? do_trap+0x12f/0x370
     ? handle_invalid_op+0xa6/0x140
     ? handle_invalid_op+0xdf/0x140
     ? prog_array_map_poke_run+0x6b4/0x6d0
     ? prog_array_map_poke_run+0x6b4/0x6d0
     ? exc_invalid_op+0x32/0x50
     ? asm_exc_invalid_op+0x1b/0x20
     ? __wake_up_klogd+0xd5/0x110
     ? prog_array_map_poke_run+0x6b4/0x6d0
     ? bpf_prog_6781ebc2dae4bad9+0xb/0x53
     fd_array_map_delete_elem+0x152/0x250
     prog_array_map_clear_deferred+0xf6/0x210
     ? __bpf_array_map_seq_show+0xa40/0xa40
     ? kick_pool+0x164/0x350
     ? process_one_work+0x57a/0xd00
     process_one_work+0x5e4/0xd00
     worker_thread+0x9cf/0xea0
     kthread+0x2b4/0x350
     ? pr_cont_work+0x580/0x580
     ? kthread_blkcg+0xd0/0xd0
     ret_from_fork+0x4a/0x80
     ? kthread_blkcg+0xd0/0xd0
     ret_from_fork_asm+0x11/0x20
     </TASK>
    Modules linked in:
    ---[ end trace 0000000000000000 ]---

However, with my very limited BPF subsystem knowledge I was unable to
trivially fix the issue.  Hopefully some knowledgable person would be
kind enough to provide me with some pointers.

bpf_arch_text_poke() seems to be returning -EBUSY due to a negative
memcmp() result from [3].

        ret = -EBUSY;
        mutex_lock(&text_mutex);
        if (memcmp(ip, old_insn, X86_PATCH_SIZE)) {
                goto out;
        [...]

When spitting out the memory at those locations, this is the result:

    ip:        e9 06 00 00 00
    old_insn:  0f 1f 44 00 00
    nop_insn:  0f 1f 44 00 00

As you can see, the information stored in 'ip' does not match that of
the data stored in 'old_insn', causing bpf_arch_text_poke() to return
early with the error -EBUSY, suggesting that the data pointed to by
'old_insn', and by extension 'prog' should have been changed when
emit_call()ing, to the value of 'ip', but wasn't.

It's possible for me to see what is happening, but I'm afraid finding
possible causes of corruption became too time consuming on this
occasion.  Would anyone be able to chime in to provide their take on
possible causes please?

Any help would be gratefully received.

[0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
[1] https://elixir.bootlin.com/linux/latest/source/kernel/bpf/arraymap.c#L1092
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=1397180f680000
[3] https://elixir.bootlin.com/linux/latest/source/arch/x86/net/bpf_jit_comp.c#L387

-- 
Lee Jones [李琼斯]

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-11-21 16:53 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-08 15:46 [REPORT] BPF: Reproducible triggering of BUG() from userspace PoC Lee Jones
2023-11-09 21:39 ` Jiri Olsa
2023-11-21 16:53   ` Lee Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).