Re: [PATCH bpf-next V13 4/7] bpf: add BPF-helper for MTU checking

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jesper Dangaard Brouer <brouer@redhat.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Daniel Borkmann <borkmann@iogearbox.net>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	maze@google.com, lmb@cloudflare.com, shaun@tigera.io,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	marek@cloudflare.com, Jakub Kicinski <kuba@kernel.org>,
	eyal.birger@gmail.com, colrack@gmail.com, brouer@redhat.com
Subject: Re: [PATCH bpf-next V13 4/7] bpf: add BPF-helper for MTU checking
Date: Fri, 29 Jan 2021 08:36:54 +0100	[thread overview]
Message-ID: <20210129083654.14f343fa@carbon> (raw)
In-Reply-To: <6013b06b83ae2_2683c2085d@john-XPS-13-9370.notmuch>

On Thu, 28 Jan 2021 22:51:23 -0800
John Fastabend <john.fastabend@gmail.com> wrote:

> Jesper Dangaard Brouer wrote:
> > This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs.
> > 
> > The SKB object is complex and the skb->len value (accessible from
> > BPF-prog) also include the length of any extra GRO/GSO segments, but
> > without taking into account that these GRO/GSO segments get added
> > transport (L4) and network (L3) headers before being transmitted. Thus,
> > this BPF-helper is created such that the BPF-programmer don't need to
> > handle these details in the BPF-prog.
> > 
> > The API is designed to help the BPF-programmer, that want to do packet
> > context size changes, which involves other helpers. These other helpers
> > usually does a delta size adjustment. This helper also support a delta
> > size (len_diff), which allow BPF-programmer to reuse arguments needed by
> > these other helpers, and perform the MTU check prior to doing any actual
> > size adjustment of the packet context.
> > 
> > It is on purpose, that we allow the len adjustment to become a negative
> > result, that will pass the MTU check. This might seem weird, but it's not
> > this helpers responsibility to "catch" wrong len_diff adjustments. Other
> > helpers will take care of these checks, if BPF-programmer chooses to do
> > actual size adjustment.

The nitpick below about len adjust can become negative, is on purpose
and why is described in above.

> > 
> > V13:
> >  - Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff.
> > 
> > V12:
> >  - Simplify segment check that calls skb_gso_validate_network_len.
> >  - Helpers should return long
> > 
> > V9:
> > - Use dev->hard_header_len (instead of ETH_HLEN)
> > - Annotate with unlikely req from Daniel
> > - Fix logic error using skb_gso_validate_network_len from Daniel
> > 
> > V6:
> > - Took John's advice and dropped BPF_MTU_CHK_RELAX
> > - Returned MTU is kept at L3-level (like fib_lookup)
> > 
> > V4: Lot of changes
> >  - ifindex 0 now use current netdev for MTU lookup
> >  - rename helper from bpf_mtu_check to bpf_check_mtu
> >  - fix bug for GSO pkt length (as skb->len is total len)
> >  - remove __bpf_len_adj_positive, simply allow negative len adj

Notice V4 comment about "allow negative len adj"

> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > ---
> >  include/uapi/linux/bpf.h       |   67 ++++++++++++++++++++++++
> >  net/core/filter.c              |  114 ++++++++++++++++++++++++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h |   67 ++++++++++++++++++++++++
> >  3 files changed, 248 insertions(+)
> > 
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 05bfc8c843dc..f17381a337ec 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3839,6 +3839,61 @@ union bpf_attr {  
> 
> [...]
> 
> > +
> > +BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb,
> > +	   u32, ifindex, u32 *, mtu_len, s32, len_diff, u64, flags)  
> 
> Maybe worth mentioning in description we expect len_diff < skb->len,
> at least I expect that otherwise result may be undefined.
> 
> > +{
> > +	int ret = BPF_MTU_CHK_RET_FRAG_NEEDED;
> > +	struct net_device *dev = skb->dev;
> > +	int skb_len, dev_len;
> > +	int mtu;  
> 
> Perhaps getting a bit nit-picky here but shouldn't skb_len, dev_len
> and mtu all be 'unsigned int'
> 
> Then all the types will align. I guess MTUs are small so it
> doesn't really matter, but is easier to read IMO.

We need signed types, this is a deliberate choice made based on
discussion in V4.

> > +
> > +	if (unlikely(flags & ~(BPF_MTU_CHK_SEGS)))
> > +		return -EINVAL;
> > +
> > +	if (unlikely(flags & BPF_MTU_CHK_SEGS && len_diff))
> > +		return -EINVAL;
> > +
> > +	dev = __dev_via_ifindex(dev, ifindex);
> > +	if (unlikely(!dev))
> > +		return -ENODEV;
> > +
> > +	mtu = READ_ONCE(dev->mtu);
> > +
> > +	dev_len = mtu + dev->hard_header_len;
> > +	skb_len = skb->len + len_diff; /* minus result pass check */
> > +	if (skb_len <= dev_len) {  
> 
> If skb_len is unsigned it will be >> dev_len when skb->len < len_diff. I
> think its a good idea to throw an error if skb_len calculation goes
> negative?

No, as comment says /* minus result pass check */.
And explained in patch desc.

> > +		ret = BPF_MTU_CHK_RET_SUCCESS;
> > +		goto out;
> > +	}
> > +	/* At this point, skb->len exceed MTU, but as it include length of all
> > +	 * segments, it can still be below MTU.  The SKB can possibly get
> > +	 * re-segmented in transmit path (see validate_xmit_skb).  Thus, user
> > +	 * must choose if segs are to be MTU checked.
> > +	 */
> > +	if (skb_is_gso(skb)) {
> > +		ret = BPF_MTU_CHK_RET_SUCCESS;
> > +
> > +		if (flags & BPF_MTU_CHK_SEGS &&
> > +		    !skb_gso_validate_network_len(skb, mtu))
> > +			ret = BPF_MTU_CHK_RET_SEGS_TOOBIG;
> > +	}
> > +out:
> > +	/* BPF verifier guarantees valid pointer */
> > +	*mtu_len = mtu;
> > +
> > +	return ret;
> > +}
> > +
> > +BPF_CALL_5(bpf_xdp_check_mtu, struct xdp_buff *, xdp,
> > +	   u32, ifindex, u32 *, mtu_len, s32, len_diff, u64, flags)
> > +{
> > +	struct net_device *dev = xdp->rxq->dev;
> > +	int xdp_len = xdp->data_end - xdp->data;
> > +	int ret = BPF_MTU_CHK_RET_SUCCESS;
> > +	int mtu, dev_len;  
> 
> Same comment about types.
> 
> > +
> > +	/* XDP variant doesn't support multi-buffer segment check (yet) */
> > +	if (unlikely(flags))
> > +		return -EINVAL;
> > +
> > +	dev = __dev_via_ifindex(dev, ifindex);
> > +	if (unlikely(!dev))
> > +		return -ENODEV;
> > +
> > +	mtu = READ_ONCE(dev->mtu);
> > +
> > +	/* Add L2-header as dev MTU is L3 size */
> > +	dev_len = mtu + dev->hard_header_len;
> > +
> > +	xdp_len += len_diff; /* minus result pass check */
> > +	if (xdp_len > dev_len)
> > +		ret = BPF_MTU_CHK_RET_FRAG_NEEDED;
> > +
> > +	/* BPF verifier guarantees valid pointer */
> > +	*mtu_len = mtu;
> > +
> > +	return ret;
> > +}  
> 
> Otherwise LGTM.

Thanks

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

next prev parent reply	other threads:[~2021-01-29  7:39 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25 17:09 [PATCH bpf-next V13 0/7] bpf: New approach for BPF MTU handling Jesper Dangaard Brouer
2021-01-25 17:09 ` [PATCH bpf-next V13 1/7] bpf: Remove MTU check in __bpf_skb_max_len Jesper Dangaard Brouer
2021-01-25 17:09 ` [PATCH bpf-next V13 2/7] bpf: fix bpf_fib_lookup helper MTU check for SKB ctx Jesper Dangaard Brouer
2021-01-29  6:00   ` John Fastabend
2021-01-25 17:09 ` [PATCH bpf-next V13 3/7] bpf: bpf_fib_lookup return MTU value as output when looked up Jesper Dangaard Brouer
2021-01-29  6:06   ` John Fastabend
2021-01-25 17:09 ` [PATCH bpf-next V13 4/7] bpf: add BPF-helper for MTU checking Jesper Dangaard Brouer
2021-01-29  6:51   ` John Fastabend
2021-01-29  7:36     ` Jesper Dangaard Brouer [this message]
     [not found]       ` <60142eae7cd59_11fd208f1@john-XPS-13-9370.notmuch>
2021-01-30  0:08         ` Daniel Borkmann
2021-01-30 13:51           ` Jesper Dangaard Brouer
2021-01-25 17:09 ` [PATCH bpf-next V13 5/7] bpf: drop MTU check when doing TC-BPF redirect to ingress Jesper Dangaard Brouer
2021-01-29  6:54   ` John Fastabend
2021-01-25 17:09 ` [PATCH bpf-next V13 6/7] selftests/bpf: use bpf_check_mtu in selftest test_cls_redirect Jesper Dangaard Brouer
2021-01-29  6:55   ` John Fastabend
2021-01-25 17:09 ` [PATCH bpf-next V13 7/7] selftests/bpf: tests using bpf_check_mtu BPF-helper Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210129083654.14f343fa@carbon \
    --to=brouer@redhat.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=borkmann@iogearbox.net \
    --cc=bpf@vger.kernel.org \
    --cc=colrack@gmail.com \
    --cc=eyal.birger@gmail.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=lmb@cloudflare.com \
    --cc=lorenzo@kernel.org \
    --cc=marek@cloudflare.com \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=shaun@tigera.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.