[PATCH net 1/2] bpf, sockmap: account only unread data in tcp_eat_skb

From: Dong Chenchen <dongchenchen2@huawei.com>
To: <daniel@iogearbox.net>, <edumazet@google.com>,
	<ncardwell@google.com>, <kuniyu@google.com>,
	<john.fastabend@gmail.com>, <jakub@cloudflare.com>,
	<jiayuan.chen@linux.dev>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<horms@kernel.org>, <zhangchangzhong@huawei.com>,
	<netdev@vger.kernel.org>, <bpf@vger.kernel.org>,
	Dong Chenchen <dongchenchen2@huawei.com>,
	<syzbot+06dbd397158ec0ea4983@syzkaller.appspotmail.com>
Subject: [PATCH net 1/2] bpf, sockmap: account only unread data in tcp_eat_skb
Date: Thu, 2 Jul 2026 22:09:58 +0800
Message-ID: <20260702140959.1806754-2-dongchenchen2@huawei.com>
In-Reply-To: <20260702140959.1806754-1-dongchenchen2@huawei.com>

tcp_eat_skb() advances copied_seq by the full skb length when a sockmap
verdict drops or redirects an skb. This assumes none of the skb has
previously been consumed.

That assumption does not hold when userspace partially reads an skb
before adding the socket to a sockmap. A later packet invokes the
verdict path, which dequeues the partially consumed skb. Adding its full
length counts the consumed prefix twice and can move copied_seq beyond
rcv_nxt, causing subsequent native TCP reads to fail.

The following sequence reproduces the corruption (sequence numbers are
shown relative to the start of the stream):

  1. TCP receives a 200-byte segment; skb sits on sk_receive_queue.
  2. Userspace reads 50 bytes. // copied_seq = 50, rcv_nxt = 200
  3. Socket is inserted into a sockmap with an SK_DROP verdict.
  4. A 1-byte segment arrives and tcp_try_coalesce() merges it with the
     existing skb. // skb->len = 201, copied_seq = 50, rcv_nxt = 201
  5. The verdict path calls tcp_eat_skb(), which does:

       copied_seq += skb->len; // copied_seq = 251, rcv_nxt = 201

     This counts the 50 already-read bytes again.
  6. After removing the socket from the map, native receive triggers the
     sequence warning:

       TCP recvmsg seq # bug 2: copied AA28C633, seq AA28C601,
           rcvnxt AA28C602, fl 40
       WARNING: net/ipv4/tcp.c:2745 at tcp_recvmsg_locked+0x45e/0x9f0
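
The arithmetic in steps 1-5 can be replayed in a small userspace model.
This is only a sketch: struct rx_model, eat_skb_old() and
replay_overshoot() are hypothetical names invented for illustration, not
kernel code, and sequence numbers start at 0 rather than at a real ISN.

```c
#include <stdint.h>

/* Hypothetical model of the two receive-side counters (not kernel code). */
struct rx_model {
	uint32_t copied_seq;	/* next byte userspace will read */
	uint32_t rcv_nxt;	/* next byte expected from the peer */
};

/* Pre-patch accounting: advance copied_seq by the full skb length. */
static uint32_t eat_skb_old(struct rx_model *m, uint32_t skb_len)
{
	m->copied_seq += skb_len;
	return m->copied_seq;
}

/* Replay steps 1-5 and return how far copied_seq overshoots rcv_nxt. */
static int32_t replay_overshoot(void)
{
	struct rx_model m = { .copied_seq = 0, .rcv_nxt = 0 };
	uint32_t skb_len;

	m.rcv_nxt += 200;		/* step 1: 200-byte segment queued */
	skb_len = 200;
	m.copied_seq += 50;		/* step 2: userspace reads 50 bytes */
	m.rcv_nxt += 1;			/* step 4: 1-byte segment coalesced */
	skb_len += 1;
	eat_skb_old(&m, skb_len);	/* step 5: 50 + 201 = 251 */

	return (int32_t)(m.copied_seq - m.rcv_nxt);	/* 251 - 201 */
}
```

In this model replay_overshoot() returns 50, the size of the prefix that
was already read and is counted a second time.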

Fix tcp_eat_skb() to advance copied_seq to the skb's TCP end sequence
(TCP_SKB_CB(skb)->end_seq) and pass only the distance from the old
copied_seq to end_seq, i.e. the unread bytes, to __tcp_cleanup_rbuf().
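
The new accounting can be sketched in userspace as follows. struct
eat_result and eat_skb_fixed() are hypothetical names used only for this
illustration; the sketch assumes, as the patch does, that the distance
is computed with 32-bit unsigned arithmetic, which stays correct when
the sequence space wraps.

```c
#include <stdint.h>

/* Hypothetical result pair for the model (not a kernel type). */
struct eat_result {
	uint32_t copied_seq;	/* copied_seq after eating the skb */
	uint32_t delta;		/* bytes newly accounted as consumed */
};

/*
 * Post-patch accounting: jump copied_seq to the skb's end sequence and
 * report only the bytes between the old copied_seq and end_seq.
 */
static struct eat_result eat_skb_fixed(uint32_t copied_seq, uint32_t end_seq)
{
	struct eat_result r;

	r.delta = end_seq - copied_seq;	/* modulo 2^32, so wrap-safe */
	r.copied_seq = end_seq;
	return r;
}
```

With the values from the reproducer, eat_skb_fixed(50, 201) yields
copied_seq = 201 and delta = 151. Across a wrap,
eat_skb_fixed(0xFFFFFFF0, 0x10) yields delta = 32.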

Fixes: e5c6de5fa025 ("bpf, sockmap: Incorrectly handling copied_seq")
Reported-by: syzbot+06dbd397158ec0ea4983@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=06dbd397158ec0ea4983
Signed-off-by: Dong Chenchen <dongchenchen2@huawei.com>
---
 net/ipv4/tcp_bpf.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index cc0bd73f36b6..d640f8e06529 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -15,7 +15,7 @@
 void tcp_eat_skb(struct sock *sk, struct sk_buff *skb)
 {
 	struct tcp_sock *tcp;
-	int copied;
+	u32 end_seq, delta;
 
 	if (!skb || !skb->len || !sk_is_tcp(sk))
 		return;
@@ -24,10 +24,11 @@ void tcp_eat_skb(struct sock *sk, struct sk_buff *skb)
 		return;
 
 	tcp = tcp_sk(sk);
-	copied = tcp->copied_seq + skb->len;
-	WRITE_ONCE(tcp->copied_seq, copied);
+	end_seq = TCP_SKB_CB(skb)->end_seq;
+	delta = end_seq - tcp->copied_seq;
+	WRITE_ONCE(tcp->copied_seq, end_seq);
 	tcp_rcv_space_adjust(sk);
-	__tcp_cleanup_rbuf(sk, skb->len);
+	__tcp_cleanup_rbuf(sk, delta);
 }
 
 static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
-- 
2.43.0

Thread overview:
2026-07-02 14:09 [PATCH net 0/2] bpf, sockmap: fix copied_seq after partial TCP read Dong Chenchen
2026-07-02 14:09 ` Dong Chenchen [this message]
2026-07-02 14:09 ` [PATCH net 2/2] selftests/bpf: cover sockmap drop " Dong Chenchen
