From: Martin KaFai Lau <martin.lau@linux.dev>
To: Kuniyuki Iwashima <kuniyu@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>,
Jakub Sitnicki <jakub@cloudflare.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Kuniyuki Iwashima <kuni1840@gmail.com>,
netdev@vger.kernel.org, bpf@vger.kernel.org,
syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com
Subject: Re: [PATCH v1 bpf] tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork.
Date: Wed, 10 Sep 2025 07:05:46 -0700
Message-ID: <effcf89d-925a-4bf6-9c6c-39a9b6731409@linux.dev>
In-Reply-To: <CAAVpQUDeaiGUdxGQHSMRU3=zwJy7a0hMWXjoRkfdYPqaZLU09Q@mail.gmail.com>
On 9/9/25 11:56 PM, Kuniyuki Iwashima wrote:
> On Tue, Sep 9, 2025 at 10:15 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 9/9/25 4:26 PM, Kuniyuki Iwashima wrote:
>>> syzbot reported the splat below. [0]
>>>
>>> The repro does the following:
>>>
>>> 1. Load a sk_msg prog that calls bpf_msg_cork_bytes(msg, cork_bytes)
>>> 2. Attach the prog to a SOCKMAP
>>> 3. Add a socket to the SOCKMAP
>>> 4. Activate fault injection
>>> 5. Send data less than cork_bytes
>>>
>>> At 5., the data is carried over to the next sendmsg() as it is
>>> smaller than the cork_bytes specified by bpf_msg_cork_bytes().
>>>
>>> Then, tcp_bpf_send_verdict() tries to allocate psock->cork to hold
>>> the data, but this fails silently due to fault injection + __GFP_NOWARN.
>>>
>>> If the allocation fails, we need to revert the sk->sk_forward_alloc
>>> change done by sk_msg_alloc().
>>>
>>> Let's call sk_msg_free() when tcp_bpf_send_verdict fails to allocate
>>> psock->cork.
>>>
>>> [0]:
>>> WARNING: net/ipv4/af_inet.c:156 at inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156, CPU#1: syz-executor/5983
>>> Modules linked in:
>>> CPU: 1 UID: 0 PID: 5983 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full)
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
>>> RIP: 0010:inet_sock_destruct+0x623/0x730 net/ipv4/af_inet.c:156
>>> Code: 0f 0b 90 e9 62 fe ff ff e8 7a db b5 f7 90 0f 0b 90 e9 95 fe ff ff e8 6c db b5 f7 90 0f 0b 90 e9 bb fe ff ff e8 5e db b5 f7 90 <0f> 0b 90 e9 e1 fe ff ff 89 f9 80 e1 07 80 c1 03 38 c1 0f 8c 9f fc
>>> RSP: 0018:ffffc90000a08b48 EFLAGS: 00010246
>>> RAX: ffffffff8a09d0b2 RBX: dffffc0000000000 RCX: ffff888024a23c80
>>> RDX: 0000000000000100 RSI: 0000000000000fff RDI: 0000000000000000
>>> RBP: 0000000000000fff R08: ffff88807e07c627 R09: 1ffff1100fc0f8c4
>>> R10: dffffc0000000000 R11: ffffed100fc0f8c5 R12: ffff88807e07c380
>>> R13: dffffc0000000000 R14: ffff88807e07c60c R15: 1ffff1100fc0f872
>>> FS: 00005555604c4500(0000) GS:ffff888125af1000(0000) knlGS:0000000000000000
>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> CR2: 00005555604df5c8 CR3: 0000000032b06000 CR4: 00000000003526f0
>>> Call Trace:
>>> <IRQ>
>>> __sk_destruct+0x86/0x660 net/core/sock.c:2339
>>> rcu_do_batch kernel/rcu/tree.c:2605 [inline]
>>> rcu_core+0xca8/0x1770 kernel/rcu/tree.c:2861
>>> handle_softirqs+0x286/0x870 kernel/softirq.c:579
>>> __do_softirq kernel/softirq.c:613 [inline]
>>> invoke_softirq kernel/softirq.c:453 [inline]
>>> __irq_exit_rcu+0xca/0x1f0 kernel/softirq.c:680
>>> irq_exit_rcu+0x9/0x30 kernel/softirq.c:696
>>> instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1052 [inline]
>>> sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1052
>>> </IRQ>
>>>
>>> Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
>>> Reported-by: syzbot+4cabd1d2fa917a456db8@syzkaller.appspotmail.com
>>> Closes: https://lore.kernel.org/netdev/68c0b6b5.050a0220.3c6139.0013.GAE@google.com/
>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
>>> ---
>>> net/ipv4/tcp_bpf.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>>> index ba581785adb4..ee6a371e65a4 100644
>>> --- a/net/ipv4/tcp_bpf.c
>>> +++ b/net/ipv4/tcp_bpf.c
>>> @@ -408,8 +408,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>>>  		if (!psock->cork) {
>>>  			psock->cork = kzalloc(sizeof(*psock->cork),
>>>  					      GFP_ATOMIC | __GFP_NOWARN);
>>> -			if (!psock->cork)
>>> +			if (!psock->cork) {
>>> +				sk_msg_free(sk, msg);
>>
>> Nothing has been corked yet, does it need to update the "*copied":
>>
>> *copied -= sk_msg_free(sk, msg);
>
> Oh exactly, or simply *copied = 0 ?
Makes sense. I made the change and also updated the commit message for this fix.
Applied. Thanks.
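
For readers following the thread, here is a minimal user-space sketch of the accounting issue discussed above. It is not the kernel code: the model_* types and functions are hypothetical stand-ins, and the single "charged" counter only loosely represents the per-socket accounting that sk_msg_alloc() adjusts. It shows the "*copied -= sk_msg_free(sk, msg)" form suggested in the review. The thread does not show which of the two forms ended up in the applied patch, and with nothing corked yet both leave *copied at 0.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins, not the kernel structures. */
struct model_sock { int charged; };	/* must be 0 when the socket is torn down */
struct model_msg  { int size; };	/* bytes currently held by the message */
struct model_cork { int bytes; };	/* data carried over to the next send */

/* Charge len bytes to the socket, loosely like sk_msg_alloc(). */
static void model_msg_alloc(struct model_sock *sk, struct model_msg *msg, int len)
{
	sk->charged += len;
	msg->size += len;
}

/* Release the charge and return the number of bytes freed. */
static int model_msg_free(struct model_sock *sk, struct model_msg *msg)
{
	int freed = msg->size;

	sk->charged -= freed;
	msg->size = 0;
	return freed;
}

/* Cork path: on allocation failure, undo the charge and fix up *copied. */
static int model_send_verdict(struct model_sock *sk, struct model_msg *msg,
			      struct model_cork **cork, int *copied,
			      bool fail_alloc)
{
	if (!*cork) {
		/* fail_alloc simulates fault injection on the allocation. */
		*cork = fail_alloc ? NULL : calloc(1, sizeof(**cork));
		if (!*cork) {
			*copied -= model_msg_free(sk, msg);
			return -ENOMEM;
		}
	}
	(*cork)->bytes += msg->size;
	return 0;
}

/*
 * One send attempt with a forced allocation failure. Returns the charge
 * left on the socket, or -1 if the error code or copied count is unexpected.
 */
static int model_leftover_after_failed_alloc(int len)
{
	struct model_sock sk = { 0 };
	struct model_msg msg = { 0 };
	struct model_cork *cork = NULL;
	int copied = len;
	int err;

	model_msg_alloc(&sk, &msg, len);
	err = model_send_verdict(&sk, &msg, &cork, &copied, true);
	free(cork);
	return (err == -ENOMEM && copied == 0) ? sk.charged : -1;
}
```

Without the model_msg_free() call in the failure branch, the charge from model_msg_alloc() would remain on the socket after the error return. That is the kind of leftover the WARNING in inet_sock_destruct() reports in the splat above.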