All of lore.kernel.org
 help / color / mirror / Atom feed
* ebpf can allow sub-word load results to be used as 64-bit pointers
@ 2024-11-30 16:56 rtm
  2024-11-30 17:49 ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 2+ messages in thread
From: rtm @ 2024-11-30 16:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	John Fastabend; +Cc: bpf

When I modify the bpf_cubic_init() function from

  https://github.com/aroodgar/bpf-tcp-congestion-control-algorithm

to read:

SEC("struct_ops/bpf_cubic_init")
void BPF_PROG(bpf_cubic_init, struct sock *sk) 
{
  asm volatile("r2 = *(u16*)(r1 + 0)");     // verifier should demand u64
  asm volatile("*(u32 *)(r2 +1504) = 0");   // 1280 in some configs
}

the verifier accepts it, but the second line crashes when it runs during
a TCP connect() because the "*(u16*)" in the load from context yields
only the low bits of the pointer.

Linux ubuntu66 6.12.0-11677-g2ba9f676d0a2 #10 SMP Sat Nov 30 11:28:09 EST 2024 x86_64 x86_64 x86_64 GNU/Linux

BUG: unable to handle page fault for address: 0000000000001020
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: Oops: 0002 [#1] SMP PTI
CPU: 6 UID: 0 PID: 1546 Comm: a.out Not tainted 6.12.0-11677-g2ba9f676d0a2 #10
Hardware name: FreeBSD BHYVE/BHYVE, BIOS 14.0 10/17/2021
RIP: 0010:bpf_prog_0e20ff5294b59096_bpf_cubic_init+0x19/0x25
Code: 00 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00
 00 0f 1f 00 55 48 89 e5 f3 0f 1e fa 48 0f b7 77 00 <c7> 86 e0 05 00 00 00 00 00
 00 c9 c3 cc cc cc cc cc cc cc cc cc cc
RSP: 0018:ffffc9000076bc58 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000a40 RDI: ffffc9000076bc88
RBP: ffffc9000076bc58 R08: ffffc9000076bc40 R09: ffffc9000076bc20
R10: ffffffff842f3200 R11: ffffffff827659c0 R12: 0000000000000004
R13: ffff888105493098 R14: 0000000000000000 R15: 00000000ffffff8d
FS:  00007fa5348d1740(0000) GS:ffff88842fb80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000001020 CR3: 000000010bbfa003 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? __die+0x1e/0x60
 ? page_fault_oops+0x157/0x450
 ? eth_header+0x25/0xb0
 ? exc_page_fault+0x66/0x140
 ? asm_exc_page_fault+0x26/0x30
 ? bpf_prog_0e20ff5294b59096_bpf_cubic_init+0x19/0x25
 ? __bpf_prog_enter+0x14/0x60
 bpf__tcp_congestion_ops_init+0x47/0xa3
 tcp_init_congestion_control+0x2a/0xe0
 tcp_init_transfer+0x2b2/0x2d0
 tcp_finish_connect+0x82/0x130
 tcp_rcv_state_process+0x352/0xf20
 tcp_v4_do_rcv+0xca/0x240
 __release_sock+0xc6/0xd0
 release_sock+0x2a/0x90
 __inet_stream_connect+0x208/0x3c0
 ? __pfx_woken_wake_function+0x10/0x10
 inet_stream_connect+0x35/0x50
 __sys_connect+0x93/0xb0
 ? ksys_write+0x67/0xe0
 __x64_sys_connect+0x13/0x20
 ? ksys_write+0x67/0xe0
 __x64_sys_connect+0x13/0x20
 do_syscall_64+0x3f/0xd0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Robert Morris
rtm@mit.edu


^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: ebpf can allow sub-word load results to be used as 64-bit pointers
  2024-11-30 16:56 ebpf can allow sub-word load results to be used as 64-bit pointers rtm
@ 2024-11-30 17:49 ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 2+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2024-11-30 17:49 UTC (permalink / raw)
  To: rtm
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau,
	John Fastabend, bpf

On Sat, 30 Nov 2024 at 18:10, <rtm@csail.mit.edu> wrote:
>
> When I modify the bpf_cubic_init() function from
>
>   https://github.com/aroodgar/bpf-tcp-congestion-control-algorithm
>
> to read:
>
> SEC("struct_ops/bpf_cubic_init")
> void BPF_PROG(bpf_cubic_init, struct sock *sk)
> {
>   asm volatile("r2 = *(u16*)(r1 + 0)");     // verifier should demand u64
>   asm volatile("*(u32 *)(r2 +1504) = 0");   // 1280 in some configs
> }
>
> the verifier accepts it, but the second line crashes when it runs during
> a TCP connect() because the "*(u16*)" in the load from context yields
> only the low bits of the pointer.

Thanks for the bug report.

It is a missing check on "size" of the access in btf_ctx_access in
kernel/bpf/btf.c.
It is probably not seen in practice as desugaring of such arguments is
done by libbpf macros etc.
This won't just be limited to struct_ops, but many other program types
(tracing, etc.) which use this code.

I have prepared a fix here: https://github.com/kkdwivedi/linux/commits/sops

The relevant diff is:

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index e7a59e6462a9..f590cb792cf3 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6458,6 +6458,11 @@ bool btf_ctx_access(int off, int size, enum
bpf_access_type type,
                        tname, off);
                return false;
        }
+       if (size != sizeof(u64)) {
+               bpf_log(log, "func '%s' size %d is not multiple of 8\n",
+                       tname, size);
+               return false;
+       }
        arg = get_ctx_arg_idx(btf, t, off);
        args = (const struct btf_param *)(t + 1);
        /* if (t == NULL) Fall back to default BPF prog with


This is not the right 100% fix though, as fields may be u32 scalars
etc. so size needs to be checked against the member's type.
Right now this will reject some valid programs. But you get the idea.
The size of access needs to be checked.
Will test out in our CI before posting to the list.

May I ask how you happened to stumble upon this?

>
> Linux ubuntu66 6.12.0-11677-g2ba9f676d0a2 #10 SMP Sat Nov 30 11:28:09 EST 2024 x86_64 x86_64 x86_64 GNU/Linux
>
> BUG: unable to handle page fault for address: 0000000000001020
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 0 P4D 0
> Oops: Oops: 0002 [#1] SMP PTI
> CPU: 6 UID: 0 PID: 1546 Comm: a.out Not tainted 6.12.0-11677-g2ba9f676d0a2 #10
> Hardware name: FreeBSD BHYVE/BHYVE, BIOS 14.0 10/17/2021
> RIP: 0010:bpf_prog_0e20ff5294b59096_bpf_cubic_init+0x19/0x25
> Code: 00 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc f3 0f 1e fa 0f 1f 44 00
>  00 0f 1f 00 55 48 89 e5 f3 0f 1e fa 48 0f b7 77 00 <c7> 86 e0 05 00 00 00 00 00
>  00 c9 c3 cc cc cc cc cc cc cc cc cc cc
> RSP: 0018:ffffc9000076bc58 EFLAGS: 00010202
> RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000a40 RDI: ffffc9000076bc88
> RBP: ffffc9000076bc58 R08: ffffc9000076bc40 R09: ffffc9000076bc20
> R10: ffffffff842f3200 R11: ffffffff827659c0 R12: 0000000000000004
> R13: ffff888105493098 R14: 0000000000000000 R15: 00000000ffffff8d
> FS:  00007fa5348d1740(0000) GS:ffff88842fb80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000001020 CR3: 000000010bbfa003 CR4: 00000000003706f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  ? __die+0x1e/0x60
>  ? page_fault_oops+0x157/0x450
>  ? eth_header+0x25/0xb0
>  ? exc_page_fault+0x66/0x140
>  ? asm_exc_page_fault+0x26/0x30
>  ? bpf_prog_0e20ff5294b59096_bpf_cubic_init+0x19/0x25
>  ? __bpf_prog_enter+0x14/0x60
>  bpf__tcp_congestion_ops_init+0x47/0xa3
>  tcp_init_congestion_control+0x2a/0xe0
>  tcp_init_transfer+0x2b2/0x2d0
>  tcp_finish_connect+0x82/0x130
>  tcp_rcv_state_process+0x352/0xf20
>  tcp_v4_do_rcv+0xca/0x240
>  __release_sock+0xc6/0xd0
>  release_sock+0x2a/0x90
>  __inet_stream_connect+0x208/0x3c0
>  ? __pfx_woken_wake_function+0x10/0x10
>  inet_stream_connect+0x35/0x50
>  __sys_connect+0x93/0xb0
>  ? ksys_write+0x67/0xe0
>  __x64_sys_connect+0x13/0x20
>  ? ksys_write+0x67/0xe0
>  __x64_sys_connect+0x13/0x20
>  do_syscall_64+0x3f/0xd0
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Robert Morris
> rtm@mit.edu
>
>

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2024-11-30 17:50 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-30 16:56 ebpf can allow sub-word load results to be used as 64-bit pointers rtm
2024-11-30 17:49 ` Kumar Kartikeya Dwivedi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.