All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	maze@google.com, lmb@cloudflare.com, shaun@tigera.io,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	marek@cloudflare.com, John Fastabend <john.fastabend@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	eyal.birger@gmail.com, colrack@gmail.com, brouer@redhat.com
Subject: Re: [PATCH bpf-next V8 4/8] bpf: add BPF-helper for MTU checking
Date: Mon, 14 Dec 2020 14:51:28 +0100	[thread overview]
Message-ID: <20201214145128.0046316c@carbon> (raw)
In-Reply-To: <e5d7ade3-6648-5934-ede1-956e379834a2@iogearbox.net>

On Thu, 3 Dec 2020 00:23:14 +0100
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 11/27/20 7:06 PM, Jesper Dangaard Brouer wrote:
> [...]
> > +static struct net_device *__dev_via_ifindex(struct net_device *dev_curr,
> > +					    u32 ifindex)
> > +{
> > +	struct net *netns = dev_net(dev_curr);
> > +
> > +	/* Non-redirect use-cases can use ifindex=0 and save ifindex lookup */
> > +	if (ifindex == 0)
> > +		return dev_curr;
> > +
> > +	return dev_get_by_index_rcu(netns, ifindex);
> > +}
> > +
> > +BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb,
> > +	   u32, ifindex, u32 *, mtu_len, s32, len_diff, u64, flags)
> > +{
> > +	int ret = BPF_MTU_CHK_RET_FRAG_NEEDED;
> > +	struct net_device *dev = skb->dev;
> > +	int len;
> > +	int mtu;
> > +
> > +	if (flags & ~(BPF_MTU_CHK_SEGS))  
> 
> nit: unlikely() (similar for XDP case)

ok

> > +		return -EINVAL;
> > +
> > +	dev = __dev_via_ifindex(dev, ifindex);
> > +	if (!dev)  
> 
> nit: unlikely() (ditto XDP)

ok

> > +		return -ENODEV;
> > +
> > +	mtu = READ_ONCE(dev->mtu);
> > +
> > +	/* TC len is L2, remove L2-header as dev MTU is L3 size */
> > +	len = skb->len - ETH_HLEN;  
> 
> s/ETH_HLEN/dev->hard_header_len/ ?

ok
 
> > +	len += len_diff; /* len_diff can be negative, minus result pass check */
> > +	if (len <= mtu) {
> > +		ret = BPF_MTU_CHK_RET_SUCCESS;  
> 
> Wouldn't it be more intuitive to do ...
> 
>     len_dev = READ_ONCE(dev->mtu) + dev->hard_header_len + VLAN_HLEN;
>     len_skb = skb->len + len_diff;
>     if (len_skb <= len_dev) {
>        ret = BPF_MTU_CHK_RET_SUCCESS;
>        got out;
>     }

Yes, that is more intuitive to read.


> > +		goto out;
> > +	}
> > +	/* At this point, skb->len exceed MTU, but as it include length of all
> > +	 * segments, it can still be below MTU.  The SKB can possibly get
> > +	 * re-segmented in transmit path (see validate_xmit_skb).  Thus, user
> > +	 * must choose if segs are to be MTU checked.  Last SKB "headlen" is
> > +	 * checked against MTU.
> > +	 */
> > +	if (skb_is_gso(skb)) {
> > +		ret = BPF_MTU_CHK_RET_SUCCESS;
> > +
> > +		if (flags & BPF_MTU_CHK_SEGS &&
> > +		    skb_gso_validate_network_len(skb, mtu)) {
> > +			ret = BPF_MTU_CHK_RET_SEGS_TOOBIG;
> > +			goto out;  
> 
> Maybe my lack of coffe, but looking at ip_exceeds_mtu() for example, shouldn't
> the above test be on !skb_gso_validate_network_len() instead?

Yes, you are right!

> skb_is_gso(skb) && skb_gso_validate_network_len(skb, mtu) would indicate that
> it does /not/ exceed mtu.
> 
> > +		}
> > +
> > +		len = skb_headlen(skb) - ETH_HLEN + len_diff;  
> 
> How does this work with GRO when we invoke this helper at tc ingress, e.g. when
> there is still non-linear data in skb_shinfo(skb)->frags[]?

In case of skb_is_gso() then this code will check the linear part
skb_headlen(skb) against the MTU.  I though this was an improvement
from what we have today, where skb_is_gso() packets will skip all
checks, which have caused a lot of confusion by end-users.

I will put this under the BPF_MTU_CHK_SEGS flag (in V9) as I understand
from you comment, you don't think this is correct at tc ingress.

> > +		if (len > mtu) {
> > +			ret = BPF_MTU_CHK_RET_FRAG_NEEDED;
> > +			goto out;
> > +		}
> > +	}
> > +out:
> > +	/* BPF verifier guarantees valid pointer */
> > +	*mtu_len = mtu;
> > +
> > +	return ret;
> > +}
[...]

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


  reply	other threads:[~2020-12-14 13:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-27 18:06 [PATCH bpf-next V8 0/8] bpf: New approach for BPF MTU handling Jesper Dangaard Brouer
2020-11-27 18:06 ` [PATCH bpf-next V8 1/8] bpf: Remove MTU check in __bpf_skb_max_len Jesper Dangaard Brouer
2020-11-27 18:06 ` [PATCH bpf-next V8 2/8] bpf: fix bpf_fib_lookup helper MTU check for SKB ctx Jesper Dangaard Brouer
2020-11-27 18:06 ` [PATCH bpf-next V8 3/8] bpf: bpf_fib_lookup return MTU value as output when looked up Jesper Dangaard Brouer
2020-12-02 22:23   ` Daniel Borkmann
2020-11-27 18:06 ` [PATCH bpf-next V8 4/8] bpf: add BPF-helper for MTU checking Jesper Dangaard Brouer
2020-12-02 23:23   ` Daniel Borkmann
2020-12-14 13:51     ` Jesper Dangaard Brouer [this message]
2020-11-27 18:06 ` [PATCH bpf-next V8 5/8] bpf: drop MTU check when doing TC-BPF redirect to ingress Jesper Dangaard Brouer
2020-12-02 23:43   ` Daniel Borkmann
2020-12-17 14:46     ` Jesper Dangaard Brouer
2020-12-17 16:10       ` Jesper Dangaard Brouer
2020-11-27 18:06 ` [PATCH bpf-next V8 6/8] bpf: make it possible to identify BPF redirected SKBs Jesper Dangaard Brouer
2020-12-03  0:06   ` Daniel Borkmann
2020-11-27 18:06 ` [PATCH bpf-next V8 7/8] selftests/bpf: use bpf_check_mtu in selftest test_cls_redirect Jesper Dangaard Brouer
2020-11-27 18:06 ` [PATCH bpf-next V8 8/8] bpf/selftests: tests using bpf_check_mtu BPF-helper Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201214145128.0046316c@carbon \
    --to=brouer@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=borkmann@iogearbox.net \
    --cc=bpf@vger.kernel.org \
    --cc=colrack@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=eyal.birger@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=lmb@cloudflare.com \
    --cc=lorenzo@kernel.org \
    --cc=marek@cloudflare.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=shaun@tigera.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.