All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Cong Wang <xiyou.wangcong@gmail.com>, netdev@vger.kernel.org
Cc: Cong Wang <cong.wang@bytedance.com>,
	Eric Dumazet <edumazet@google.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jakub Sitnicki <jakub@cloudflare.com>
Subject: RE: [Patch bpf-next v1 1/4] tcp: introduce tcp_read_skb()
Date: Tue, 12 Apr 2022 13:00:02 -0700	[thread overview]
Message-ID: <6255da425c4ad_57e1208f9@john.notmuch> (raw)
In-Reply-To: <20220410161042.183540-2-xiyou.wangcong@gmail.com>

Cong Wang wrote:
> From: Cong Wang <cong.wang@bytedance.com>
> 
> This patch inroduces tcp_read_skb() based on tcp_read_sock(),
> a preparation for the next patch which actually introduces
> a new sock ops.
> 
> TCP is special here, because it has tcp_read_sock() which is
> mainly used by splice(). tcp_read_sock() supports partial read
> and arbitrary offset, neither of them is needed for sockmap.
> 
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: Daniel Borkmann <daniel@iogearbox.net>
> Cc: Jakub Sitnicki <jakub@cloudflare.com>
> Signed-off-by: Cong Wang <cong.wang@bytedance.com>
> ---

Thanks for doing this Cong comment/question inline.

[...]

> +int tcp_read_skb(struct sock *sk, read_descriptor_t *desc,
> +		 sk_read_actor_t recv_actor)
> +{
> +	struct sk_buff *skb;
> +	struct tcp_sock *tp = tcp_sk(sk);
> +	u32 seq = tp->copied_seq;
> +	u32 offset;
> +	int copied = 0;
> +
> +	if (sk->sk_state == TCP_LISTEN)
> +		return -ENOTCONN;
> +	while ((skb = tcp_recv_skb(sk, seq, &offset, true)) != NULL) {

I'm trying to see why we might have an offset here if we always
consume the entire skb. There is a comment in tcp_recv_skb around
GRO packets, but not clear how this applies here if it does at all
to me yet. Will read a bit more I guess.

If the offset can be >0 than we also need to fix the recv_actor to
account for the extra offset in the skb. As is the bpf prog might
see duplicate data. This is a problem on the stream parser now.

Then another fallout is if offset is zero than we could just do
a skb_dequeue here and skip the tcp_recv_skb bool flag addition
and upate.

I'll continue reading after a few other things I need to get
sorted this afternoon, but maybe you have the answer on hand.

> +		if (offset < skb->len) {
> +			int used;
> +			size_t len;
> +
> +			len = skb->len - offset;
> +			used = recv_actor(desc, skb, offset, len);
> +			if (used <= 0) {
> +				if (!copied)
> +					copied = used;
> +				break;
> +			}
> +			if (WARN_ON_ONCE(used > len))
> +				used = len;
> +			seq += used;
> +			copied += used;
> +			offset += used;
> +
> +			if (offset != skb->len)
> +				continue;
> +		}
> +		if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) {
> +			kfree_skb(skb);
> +			++seq;
> +			break;
> +		}
> +		kfree_skb(skb);
> +		if (!desc->count)
> +			break;
> +		WRITE_ONCE(tp->copied_seq, seq);
> +	}
> +	WRITE_ONCE(tp->copied_seq, seq);
> +
> +	tcp_rcv_space_adjust(sk);
> +
> +	/* Clean up data we have read: This will do ACK frames. */
> +	if (copied > 0)
> +		tcp_cleanup_rbuf(sk, copied);
> +
> +	return copied;
> +}
> +EXPORT_SYMBOL(tcp_read_skb);
> +

  reply	other threads:[~2022-04-12 20:05 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-10 16:10 [Patch bpf-next v1 0/4] sockmap: some performance optimizations Cong Wang
2022-04-10 16:10 ` [Patch bpf-next v1 1/4] tcp: introduce tcp_read_skb() Cong Wang
2022-04-12 20:00   ` John Fastabend [this message]
2022-04-21 19:00     ` Cong Wang
2022-04-26  6:27       ` John Fastabend
2022-04-30 17:17         ` Cong Wang
2022-04-25 19:07   ` Jakub Kicinski
2022-04-30 17:22     ` Cong Wang
2022-05-02 16:13       ` Jakub Kicinski
2022-04-26  9:11   ` Jakub Sitnicki
2022-04-30 17:18     ` Cong Wang
2022-04-10 16:10 ` [Patch bpf-next v1 2/4] net: introduce a new proto_ops ->read_skb() Cong Wang
2022-04-10 16:10 ` [Patch bpf-next v1 3/4] skmsg: get rid of skb_clone() Cong Wang
2022-04-10 16:10 ` [Patch bpf-next v1 4/4] skmsg: get rid of unncessary memset() Cong Wang
2022-04-26  9:27 ` [Patch bpf-next v1 0/4] sockmap: some performance optimizations Jakub Sitnicki
2022-04-30 17:27   ` Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6255da425c4ad_57e1208f9@john.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=cong.wang@bytedance.com \
    --cc=daniel@iogearbox.net \
    --cc=edumazet@google.com \
    --cc=jakub@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.