From: Jesper Dangaard Brouer
Subject: Re: [RFC net-next PATCH 2/2] bpf: disallow XDP data_meta to overlap with xdp_frame area
Date: Wed, 18 Apr 2018 19:53:22 +0200
Message-ID: <20180418195322.381b041c@redhat.com>
References: <152405338404.30730.9846848505925123326.stgit@firesoul> <152405341684.30730.9208803786283211244.stgit@firesoul> <51af68d8-3c02-b828-12b5-f2dc73406a65@iogearbox.net>
Cc: Daniel Borkmann, Alexei Starovoitov, netdev@vger.kernel.org, brouer@redhat.com
To: Daniel Borkmann
In-Reply-To: <51af68d8-3c02-b828-12b5-f2dc73406a65@iogearbox.net>

On Wed, 18 Apr 2018 18:21:21 +0200 Daniel Borkmann wrote:

> On 04/18/2018 02:10 PM, Jesper Dangaard Brouer wrote:
> > If combining xdp_adjust_head and xdp_adjust_meta, then it is possible
> > to make data_meta overlap with the area used by xdp_frame. Another
> > invocation of xdp_adjust_head can then clear that area, due to the
> > clearing of the xdp_frame area.
> >
> > The easiest solution I found was to simply not allow
> > xdp_buff->data_meta to overlap with the area used by xdp_frame.
>
> Thanks Jesper! Trying to answer both emails in one. :) More below.
>
> > Fixes: 6dfb970d3dbd ("xdp: avoid leaking info stored in frame data on page reuse")
> > Signed-off-by: Jesper Dangaard Brouer
> > ---
> >  net/core/filter.c |   11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 15e9b5477360..e3623e741181 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -2701,6 +2701,11 @@ BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
> >  		     data > xdp->data_end - ETH_HLEN))
> >  		return -EINVAL;
> >
> > +	/* Disallow data_meta to use xdp_frame area */
> > +	if (metalen > 0 &&
> > +	    unlikely((data - metalen) < xdp_frame_end))
> > +		return -EINVAL;
> > +
> >  	/* Avoid info leak, when reusing area prev used by xdp_frame */
> >  	if (data < xdp_frame_end) {
>
> Effectively, when metalen > 0, then data_meta < data pointer, so the above
> test on the new data_meta might be better, but it feels like a bit of a
> workaround to handle moving the data pointer while disallowing moving the
> data_meta pointer, whereas both could be handled if we wanted to go that
> path.
>
> >  		unsigned long clearlen = xdp_frame_end - data;
> > @@ -2734,6 +2739,7 @@ static const struct bpf_func_proto bpf_xdp_adjust_head_proto = {
> >
> >  BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
> >  {
> > +	void *xdp_frame_end = xdp->data_hard_start + sizeof(struct xdp_frame);
> >  	void *meta = xdp->data_meta + offset;
> >  	unsigned long metalen = xdp->data - meta;
> >
> > @@ -2742,6 +2748,11 @@ BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
> >  	if (unlikely(meta < xdp->data_hard_start ||
> >  		     meta > xdp->data))
> >  		return -EINVAL;
> > +
> > +	/* Disallow data_meta to use xdp_frame area */
> > +	if (unlikely(meta < xdp_frame_end))
> > +		return -EINVAL;
>
> (Ditto.)
>
> >  	if (unlikely((metalen & (sizeof(__u32) - 1)) ||
> >  		     (metalen > 32)))
> >  		return -EACCES;
>
> The other, perhaps less invasive/complex option would be to just disallow
> using the xdp_frame area in general. My original concern was that not all
> drivers use 256 bytes of headroom, e.g. afaik the i40e and ixgbe have
> around 192 bytes of headroom available, but that should actually still be
> plenty of space for encap + meta data, and potentially with meta data use
> I would expect that at best custom decap would be happening when pushing
> the packet up the stack. So we might as well disallow going into that
> region and not worry about it. Thus, reverting e9e9545e10d3 ("xdp: avoid
> leaking info stored in frame data on page reuse") and adding something
> like the below (uncompiled) should just do it:
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 3bb0cb9..ad98ddd 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2692,8 +2692,9 @@ static unsigned long xdp_get_metalen(const struct xdp_buff *xdp)
>
>  BPF_CALL_2(bpf_xdp_adjust_head, struct xdp_buff *, xdp, int, offset)
>  {
> +	void *frame_end = xdp->data_hard_start + sizeof(struct xdp_frame);
>  	unsigned long metalen = xdp_get_metalen(xdp);
> -	void *data_start = xdp->data_hard_start + metalen;
> +	void *data_start = frame_end + metalen;
>  	void *data = xdp->data + offset;
>
>  	if (unlikely(data < data_start ||
> @@ -2719,13 +2720,13 @@ static const struct bpf_func_proto bpf_xdp_adjust_head_proto = {
>
>  BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
>  {
> +	void *frame_end = xdp->data_hard_start + sizeof(struct xdp_frame);
>  	void *meta = xdp->data_meta + offset;
>  	unsigned long metalen = xdp->data - meta;
>
>  	if (xdp_data_meta_unsupported(xdp))
>  		return -ENOTSUPP;
> -	if (unlikely(meta < xdp->data_hard_start ||
> -		     meta > xdp->data))
> +	if (unlikely(meta < frame_end || meta > xdp->data))
>  		return -EINVAL;
>  	if (unlikely((metalen & (sizeof(__u32) - 1)) ||
>  		     (metalen > 32)))

Okay, so you say: just disallow using the xdp_frame area in general. It
would be simpler. The advantage is that we don't run into strange
situations where the user/bpf_prog increased the headroom too much, such
that convert_to_xdp_frame() fails and thus the XDP_REDIRECT action fails.
(That will be confusing for users to debug/troubleshoot.)

> On top of that, we could even store a bool in struct xdp_rxq_info on
> whether the driver is actually able to participate, i.e. has XDP_REDIRECT
> support, and if not do something like:
>
>   void *frame_end = xdp->data_hard_start +
>                     (xdp->rxq->has_redir ? sizeof(struct xdp_frame) : 0);
>
> But the latter is merely a small optimization. Eventually we want all
> native XDP drivers to support it. Thoughts?

I would _really_ like to see all drivers support XDP_REDIRECT, but to be
realistic, this is happening way too slowly...

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
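
As background for the failure mode mentioned above: XDP_REDIRECT bails out
when convert_to_xdp_frame() finds that the headroom needed for storing
struct xdp_frame is no longer free. The snippet below is a minimal,
standalone C sketch of that kind of check, not the in-tree implementation;
the simplified struct layouts and the names xdp_buff_sketch,
xdp_frame_sketch and convert_to_xdp_frame_sketch() are illustrative
assumptions only.

/*
 * Sketch only (not kernel code): why XDP_REDIRECT fails once a bpf_prog
 * has pulled xdp->data (or data_meta) into the headroom that is reused
 * for storing the frame state at xdp->data_hard_start.
 * Assumes data_hard_start <= data_meta <= data.
 */
struct xdp_buff_sketch {
	void *data;
	void *data_end;
	void *data_meta;
	void *data_hard_start;
};

struct xdp_frame_sketch {
	void *data;
	unsigned int len;
	unsigned int headroom;
	unsigned int metasize;
};

struct xdp_frame_sketch *
convert_to_xdp_frame_sketch(struct xdp_buff_sketch *xdp)
{
	unsigned long headroom = (char *)xdp->data - (char *)xdp->data_hard_start;
	unsigned long metalen  = (char *)xdp->data - (char *)xdp->data_meta;

	/* The frame state is placed at the very top of the buffer, so the
	 * region [data_hard_start, data_hard_start + sizeof(*frame)) must
	 * not overlap the meta data or packet data; otherwise the
	 * conversion, and with it the redirect, fails.
	 */
	if (headroom < metalen + sizeof(struct xdp_frame_sketch))
		return (struct xdp_frame_sketch *)0;

	return (struct xdp_frame_sketch *)xdp->data_hard_start;
}

With the "disallow using the xdp_frame area in general" approach from the
quoted diff, bpf_xdp_adjust_head()/bpf_xdp_adjust_meta() would already
refuse to move the pointers below frame_end, so this conversion could no
longer fail for that reason.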