netdev.vger.kernel.org archive mirror
* [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
@ 2025-09-09 23:26 Kuniyuki Iwashima
  2025-09-10  5:15 ` Martin KaFai Lau
  2025-09-10 14:10 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 6+ messages in thread
From: Kuniyuki Iwashima @ 2025-09-09 23:26 UTC (permalink / raw)
  To: John Fastabend, Jakub Sitnicki, Alexei Starovoitov,
	Daniel Borkmann
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, bpf,
	syzbot+4cabd1d2fa917a456db8

syzbot reported the splat below. [0]

The repro does the following:

  1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
  2. Attach the prog to a SOCKMAP
  3. Add a socket to the SOCKMAP
  4. Activate fault injection
  5. Send data less than cork_bytes

At step 5, the data is carried over to the next sendmsg() because it is
smaller than the cork_bytes specified by bpf_msg_cork_bytes().

Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold
the data, but this fails silently due to fault injection + __GFP_NOWARN.

If the allocation fails, we need to revert the sk->sk_forward_alloc
change done by sk_msg_alloc().

Let's call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate
psock->cork.

[0]:
WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
Modules linked in:
CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
FS:  00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
Call Trace:
 <IRQ>
 __sk_destruct+0x86/0x660 net/core/sock.c:2339
 rcu_do_batch kernel/rcu/tree.c:2605 [inline]
 rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
 handle_softirqs+0x286/0x870 kernel/softirq.c:579
 __do_softirq kernel/softirq.c:613 [inline]
 invoke_softirq kernel/softirq.c:453 [inline]
 __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
 irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
 sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
 </IRQ>

Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
Reported-by: syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
---
 net/ipv4/tcp_bpf.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ba581785adb4..ee6a371e65a4 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -408,8 +408,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 		if (!psock->cork) {
 			psock->cork = kzalloc(sizeof(*psock->cork),
 					      GFP_ATOMIC | __GFP_NOWARN);
-			if (!psock->cork)
+			if (!psock->cork) {
+				sk_msg_free(sk, msg);
 				return -ENOMEM;
+			}
 		}
 		memcpy(psock->cork, msg, sizeof(*msg));
 		return 0;
-- 
2.51.0.384.g4c02a37b29-goog
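
The accounting problem described in the commit message above can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not kernel code: every name prefixed with model_ is
hypothetical, "charged" stands in for the outstanding charge against
sk->sk_forward_alloc, alloc_fails stands in for fault injection, and
model_destruct_warns() stands in for the check that produced the splat.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct sock and struct sk_msg. */
struct model_sock {
	int charged;	/* bytes still charged; must be 0 at destruction */
};

struct model_msg {
	int size;	/* bytes charged to the socket for this message */
};

/* Models the charge applied when message memory is allocated. */
static void model_msg_alloc(struct model_sock *sk, struct model_msg *msg, int len)
{
	assert(len >= 0);
	msg->size = len;
	sk->charged += len;
}

/* Models the free helper: undoes the charge and returns the bytes freed. */
static int model_msg_free(struct model_sock *sk, struct model_msg *msg)
{
	int freed = msg->size;

	sk->charged -= freed;
	msg->size = 0;
	return freed;
}

/* Models the corking branch; alloc_fails simulates fault injection. */
static int model_send_verdict(struct model_sock *sk, struct model_msg *msg,
			      struct model_msg **cork, bool alloc_fails,
			      bool apply_fix)
{
	if (!*cork) {
		*cork = alloc_fails ? NULL : calloc(1, sizeof(**cork));
		if (!*cork) {
			if (apply_fix)
				model_msg_free(sk, msg); /* the added call */
			return -ENOMEM;
		}
	}
	**cork = *msg;	/* models memcpy(psock->cork, msg, sizeof(*msg)) */
	return 0;
}

/* Models the destruct-time check: a leftover charge triggers a warning. */
static bool model_destruct_warns(const struct model_sock *sk)
{
	return sk->charged != 0;
}

/* Runs the repro shape once: charge, fail the cork allocation, destroy. */
static bool model_repro_warns(bool apply_fix)
{
	struct model_sock sk = { .charged = 0 };
	struct model_msg msg = { .size = 0 };
	struct model_msg *cork = NULL;

	model_msg_alloc(&sk, &msg, 100);
	int err = model_send_verdict(&sk, &msg, &cork, true, apply_fix);
	assert(err == -ENOMEM);
	free(cork);
	return model_destruct_warns(&sk);
}
```

In this model, model_repro_warns(false) leaves 100 bytes charged and so
reports a warning, while model_repro_warns(true) returns the charge on
the error path and reports none.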



* Re: [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
  2025-09-09 23:26 [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork Kuniyuki Iwashima
@ 2025-09-10  5:15 ` Martin KaFai Lau
  2025-09-10  6:56   ` Kuniyuki Iwashima
  2025-09-10 14:10 ` patchwork-bot+netdevbpf
  1 sibling, 1 reply; 6+ messages in thread
From: Martin KaFai Lau @ 2025-09-10  5:15 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: John Fastabend, Jakub Sitnicki, Alexei Starovoitov,
	Daniel Borkmann, Kuniyuki Iwashima, netdev, bpf,
	syzbot+4cabd1d2fa917a456db8

On 9/9/25 4:26 PM, Kuniyuki Iwashima wrote:
> syzbot reported the splat below. [0]
> 
> The repro does the following:
> 
>    1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
>    2. Attach the prog to a SOCKMAP
>    3. Add a socket to the SOCKMAP
>    4. Activate fault injection
>    5. Send data less than cork_bytes
> 
> At 5., the data is carried over to the next sendmsg() as it is
> smaller than the cork_bytes specified by bpf_msg_cork_bytes().
> 
> Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold
> the data, but this fails silently due to fault injection + __GFP_NOWARN.
> 
> If the allocation fails, we need to revert the sk->sk_forward_alloc
> change done by sk_msg_alloc().
> 
> Let's call sk_msg_free() when tcp_bpf_send_verdict fails to allocate
> psock->cork.
> 
> [0]:
> WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
> Modules linked in:
> CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
> RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
> Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
> RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
> RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
> RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
> RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
> R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
> R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
> FS:  00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
> Call Trace:
>   <IRQ>
>   __sk_destruct+0x86/0x660 net/core/sock.c:2339
>   rcu_do_batch kernel/rcu/tree.c:2605 [inline]
>   rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
>   handle_softirqs+0x286/0x870 kernel/softirq.c:579
>   __do_softirq kernel/softirq.c:613 [inline]
>   invoke_softirq kernel/softirq.c:453 [inline]
>   __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
>   irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
>   instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
>   sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
>   </IRQ>
> 
> Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
> Reported-by: syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@google.com/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> ---
>   net/ipv4/tcp_bpf.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index ba581785adb4..ee6a371e65a4 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -408,8 +408,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>   		if (!psock->cork) {
>   			psock->cork = kzalloc(sizeof(*psock->cork),
>   					      GFP_ATOMIC | __GFP_NOWARN);
> -			if (!psock->cork)
> +			if (!psock->cork) {
> +				sk_msg_free(sk, msg);

Nothing has been corked yet, so shouldn't "*copied" be updated as well? For example:

				*copied -= sk_msg_free(sk, msg);


>   				return -ENOMEM;
> +			}
>   		}
>   		memcpy(psock->cork, msg, sizeof(*msg));
>   		return 0;
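
The concern about "*copied" can be made concrete with a small user-space
sketch. It rests on an assumption about the caller that this thread does
not show: that the sendmsg() path returns the copied byte count when it
is positive and the error code otherwise. The model_ names are
hypothetical and this is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>

/* Assumed caller behaviour: prefer the byte count over the error code. */
static int model_sendmsg_result(int copied, int err)
{
	return copied > 0 ? copied : err;
}

/*
 * Models the allocation-failure branch. 'freed' is the number of bytes
 * the free helper released; adjust_copied selects whether the counter
 * is corrected before returning.
 */
static int model_fail_path(int copied, int freed, int adjust_copied)
{
	assert(freed >= 0 && freed <= copied);
	if (adjust_copied)
		copied -= freed; /* mirrors "*copied -= sk_msg_free(sk, msg);" */
	return model_sendmsg_result(copied, -ENOMEM);
}
```

Under that assumption, model_fail_path(100, 100, 0) reports 100 bytes as
sent even though the data was freed, whereas model_fail_path(100, 100, 1)
surfaces -ENOMEM to the caller.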




* Re: [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
  2025-09-10  5:15 ` Martin KaFai Lau
@ 2025-09-10  6:56   ` Kuniyuki Iwashima
  2025-09-10 14:05     ` Martin KaFai Lau
  0 siblings, 1 reply; 6+ messages in thread
From: Kuniyuki Iwashima @ 2025-09-10  6:56 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: John Fastabend, Jakub Sitnicki, Alexei Starovoitov,
	Daniel Borkmann, Kuniyuki Iwashima, netdev, bpf,
	syzbot+4cabd1d2fa917a456db8

On Tue, Sep 9, 2025 at 10:15 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 9/9/25 4:26 PM, Kuniyuki Iwashima wrote:
> > syzbot reported the splat below. [0]
> >
> > The repro does the following:
> >
> >    1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
> >    2. Attach the prog to a SOCKMAP
> >    3. Add a socket to the SOCKMAP
> >    4. Activate fault injection
> >    5. Send data less than cork_bytes
> >
> > At 5., the data is carried over to the next sendmsg() as it is
> > smaller than the cork_bytes specified by bpf_msg_cork_bytes().
> >
> > Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold
> > the data, but this fails silently due to fault injection + __GFP_NOWARN.
> >
> > If the allocation fails, we need to revert the sk->sk_forward_alloc
> > change done by sk_msg_alloc().
> >
> > Let's call sk_msg_free() when tcp_bpf_send_verdict fails to allocate
> > psock->cork.
> >
> > [0]:
> > WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
> > Modules linked in:
> > CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
> > RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
> > Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
> > RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
> > RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
> > RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
> > RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
> > R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
> > R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
> > FS:  00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
> > Call Trace:
> >   <IRQ>
> >   __sk_destruct+0x86/0x660 net/core/sock.c:2339
> >   rcu_do_batch kernel/rcu/tree.c:2605 [inline]
> >   rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
> >   handle_softirqs+0x286/0x870 kernel/softirq.c:579
> >   __do_softirq kernel/softirq.c:613 [inline]
> >   invoke_softirq kernel/softirq.c:453 [inline]
> >   __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
> >   irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
> >   instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
> >   sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
> >   </IRQ>
> >
> > Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
> > Reported-by: syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com
> > Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@google.com/
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> > ---
> >   net/ipv4/tcp_bpf.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> > index ba581785adb4..ee6a371e65a4 100644
> > --- a/net/ipv4/tcp_bpf.c
> > +++ b/net/ipv4/tcp_bpf.c
> > @@ -408,8 +408,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
> >               if (!psock->cork) {
> >                       psock->cork = kzalloc(sizeof(*psock->cork),
> >                                             GFP_ATOMIC | __GFP_NOWARN);
> > -                     if (!psock->cork)
> > +                     if (!psock->cork) {
> > +                             sk_msg_free(sk, msg);
>
> Nothing has been corked yet, does it need to update the "*copied":
>
>                                 *copied -= sk_msg_free(sk, msg);

Oh, exactly. Or simply *copied = 0?


>
>
> >                               return -ENOMEM;
> > +                     }
> >               }
> >               memcpy(psock->cork, msg, sizeof(*msg));
> >               return 0;
>
>


* Re: [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
  2025-09-10  6:56   ` Kuniyuki Iwashima
@ 2025-09-10 14:05     ` Martin KaFai Lau
  2025-09-10 15:59       ` Kuniyuki Iwashima
  0 siblings, 1 reply; 6+ messages in thread
From: Martin KaFai Lau @ 2025-09-10 14:05 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: John Fastabend, Jakub Sitnicki, Alexei Starovoitov,
	Daniel Borkmann, Kuniyuki Iwashima, netdev, bpf,
	syzbot+4cabd1d2fa917a456db8

On 9/9/25 11:56 PM, Kuniyuki Iwashima wrote:
> On Tue, Sep 9, 2025 at 10:15 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 9/9/25 4:26 PM, Kuniyuki Iwashima wrote:
>>> syzbot reported the splat below. [0]
>>>
>>> The repro does the following:
>>>
>>>     1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
>>>     2. Attach the prog to a SOCKMAP
>>>     3. Add a socket to the SOCKMAP
>>>     4. Activate fault injection
>>>     5. Send data less than cork_bytes
>>>
>>> At 5., the data is carried over to the next sendmsg() as it is
>>> smaller than the cork_bytes specified by bpf_msg_cork_bytes().
>>>
>>> Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold
>>> the data, but this fails silently due to fault injection + __GFP_NOWARN.
>>>
>>> If the allocation fails, we need to revert the sk->sk_forward_alloc
>>> change done by sk_msg_alloc().
>>>
>>> Let's call sk_msg_free() when tcp_bpf_send_verdict fails to allocate
>>> psock->cork.
>>>
>>> [0]:
>>> WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
>>> Modules linked in:
>>> CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
>>> RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
>>> Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
>>> RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
>>> RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
>>> RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
>>> RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
>>> R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
>>> R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
>>> FS:  00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
>>> Call Trace:
>>>    <IRQ>
>>>    __sk_destruct+0x86/0x660 net/core/sock.c:2339
>>>    rcu_do_batch kernel/rcu/tree.c:2605 [inline]
>>>    rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
>>>    handle_softirqs+0x286/0x870 kernel/softirq.c:579
>>>    __do_softirq kernel/softirq.c:613 [inline]
>>>    invoke_softirq kernel/softirq.c:453 [inline]
>>>    __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
>>>    irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
>>>    instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
>>>    sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
>>>    </IRQ>
>>>
>>> Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
>>> Reported-by: syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com
>>> Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@google.com/
>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
>>> ---
>>>    net/ipv4/tcp_bpf.c | 4 +++-
>>>    1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>>> index ba581785adb4..ee6a371e65a4 100644
>>> --- a/net/ipv4/tcp_bpf.c
>>> +++ b/net/ipv4/tcp_bpf.c
>>> @@ -408,8 +408,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>>>                if (!psock->cork) {
>>>                        psock->cork = kzalloc(sizeof(*psock->cork),
>>>                                              GFP_ATOMIC | __GFP_NOWARN);
>>> -                     if (!psock->cork)
>>> +                     if (!psock->cork) {
>>> +                             sk_msg_free(sk, msg);
>>
>> Nothing has been corked yet, does it need to update the "*copied":
>>
>>                                  *copied -= sk_msg_free(sk, msg);
> 
> Oh exactly, or simply *copied = 0 ?

Makes sense. I made the change and also updated the commit message for this fix.
Applied. Thanks.




* Re: [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
  2025-09-09 23:26 [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork Kuniyuki Iwashima
  2025-09-10  5:15 ` Martin KaFai Lau
@ 2025-09-10 14:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-09-10 14:10 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: john.fastabend, jakub, ast, daniel, kuni1840, netdev, bpf,
	syzbot+4cabd1d2fa917a456db8

Hello:

This patch was applied to bpf/bpf.git (master)
by Martin KaFai Lau <martin.lau@kernel.org>:

On Tue,  9 Sep 2025 23:26:12 +0000 you wrote:
> syzbot reported the splat below. [0]
> 
> The repro does the following:
> 
>   1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
>   2. Attach the prog to a SOCKMAP
>   3. Add a socket to the SOCKMAP
>   4. Activate fault injection
>   5. Send data less than cork_bytes
> 
> [...]

Here is the summary with links:
  - [v1,bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
    https://git.kernel.org/bpf/bpf/c/a3967baad4d5

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
  2025-09-10 14:05     ` Martin KaFai Lau
@ 2025-09-10 15:59       ` Kuniyuki Iwashima
  0 siblings, 0 replies; 6+ messages in thread
From: Kuniyuki Iwashima @ 2025-09-10 15:59 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: John Fastabend, Jakub Sitnicki, Alexei Starovoitov,
	Daniel Borkmann, Kuniyuki Iwashima, netdev, bpf,
	syzbot+4cabd1d2fa917a456db8

On Wed, Sep 10, 2025 at 7:05 AM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 9/9/25 11:56 PM, Kuniyuki Iwashima wrote:
> > On Tue, Sep 9, 2025 at 10:15 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 9/9/25 4:26 PM, Kuniyuki Iwashima wrote:
> >>> syzbot reported the splat below. [0]
> >>>
> >>> The repro does the following:
> >>>
> >>>     1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
> >>>     2. Attach the prog to a SOCKMAP
> >>>     3. Add a socket to the SOCKMAP
> >>>     4. Activate fault injection
> >>>     5. Send data less than cork_bytes
> >>>
> >>> At 5., the data is carried over to the next sendmsg() as it is
> >>> smaller than the cork_bytes specified by bpf_msg_cork_bytes().
> >>>
> >>> Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold
> >>> the data, but this fails silently due to fault injection + __GFP_NOWARN.
> >>>
> >>> If the allocation fails, we need to revert the sk->sk_forward_alloc
> >>> change done by sk_msg_alloc().
> >>>
> >>> Let's call sk_msg_free() when tcp_bpf_send_verdict fails to allocate
> >>> psock->cork.
> >>>
> >>> [0]:
> >>> WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
> >>> Modules linked in:
> >>> CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
> >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
> >>> RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
> >>> Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
> >>> RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
> >>> RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
> >>> RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
> >>> RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
> >>> R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
> >>> R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
> >>> FS:  00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
> >>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>> CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
> >>> Call Trace:
> >>>    <IRQ>
> >>>    __sk_destruct+0x86/0x660 net/core/sock.c:2339
> >>>    rcu_do_batch kernel/rcu/tree.c:2605 [inline]
> >>>    rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
> >>>    handle_softirqs+0x286/0x870 kernel/softirq.c:579
> >>>    __do_softirq kernel/softirq.c:613 [inline]
> >>>    invoke_softirq kernel/softirq.c:453 [inline]
> >>>    __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
> >>>    irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
> >>>    instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
> >>>    sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
> >>>    </IRQ>
> >>>
> >>> Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
> >>> Reported-by: syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com
> >>> Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@google.com/
> >>> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
> >>> ---
> >>>    net/ipv4/tcp_bpf.c | 4 +++-
> >>>    1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> >>> index ba581785adb4..ee6a371e65a4 100644
> >>> --- a/net/ipv4/tcp_bpf.c
> >>> +++ b/net/ipv4/tcp_bpf.c
> >>> @@ -408,8 +408,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
> >>>                if (!psock->cork) {
> >>>                        psock->cork = kzalloc(sizeof(*psock->cork),
> >>>                                              GFP_ATOMIC | __GFP_NOWARN);
> >>> -                     if (!psock->cork)
> >>> +                     if (!psock->cork) {
> >>> +                             sk_msg_free(sk, msg);
> >>
> >> Nothing has been corked yet, does it need to update the "*copied":
> >>
> >>                                  *copied -= sk_msg_free(sk, msg);
> >
> > Oh exactly, or simply *copied = 0 ?
>
> Make sense. I made the change and updated the commit message for this fix also.
> Applied. Thanks.

Thank you Martin!
