BPF List
From: Yonghong Song <yhs@fb.com>
To: Zvi Effron <zeffron@riotgames.com>, <bpf@vger.kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
	Martin KaFai Lau <kafai@fb.com>, Cody Haas <chaas@riotgames.com>,
	Lisa Watanabe <lwatanabe@riotgames.com>
Subject: Re: [PATCH bpf-next v4 1/3] bpf: support input xdp_md context in BPF_PROG_TEST_RUN
Date: Sat, 5 Jun 2021 20:17:00 -0700
Message-ID: <f3c5a8d9-6d23-dde6-e9a3-178d9f572f29@fb.com>
In-Reply-To: <20210604220235.6758-2-zeffron@riotgames.com>



On 6/4/21 3:02 PM, Zvi Effron wrote:
> Support passing a xdp_md via ctx_in/ctx_out in bpf_attr for
> BPF_PROG_TEST_RUN.
> 
> The intended use case is to pass some XDP meta data to the test runs of
> XDP programs that are used as tail calls.
> 
> For programs that use bpf_prog_test_run_xdp, support xdp_md input and
> output. Unlike with an actual xdp_md during a non-test run, data_meta must
> be 0 because it must point to the start of the provided user data. From
> the initial xdp_md, use data and data_end to adjust the pointers in the
> generated xdp_buff. All other non-zero fields are prohibited (with
> EINVAL). If the user has set ctx_out/ctx_size_out, copy the (potentially
> different) xdp_md back to the userspace.
> 
> We require all fields of input xdp_md except the ones we explicitly
> support to be set to zero. The expectation is that in the future we might
> add support for more fields and we want to fail explicitly if the user
> runs the program on the kernel where we don't yet support them.
> 
> Co-developed-by: Cody Haas <chaas@riotgames.com>
> Signed-off-by: Cody Haas <chaas@riotgames.com>
> Co-developed-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Lisa Watanabe <lwatanabe@riotgames.com>
> Signed-off-by: Zvi Effron <zeffron@riotgames.com>
> ---
>   include/uapi/linux/bpf.h |  3 --
>   net/bpf/test_run.c       | 77 ++++++++++++++++++++++++++++++++++++----
>   2 files changed, 70 insertions(+), 10 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 2c1ba70abbf1..a9dcf3d8c85a 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -324,9 +324,6 @@ union bpf_iter_link_info {
>    *		**BPF_PROG_TYPE_SK_LOOKUP**
>    *			*data_in* and *data_out* must be NULL.
>    *
> - *		**BPF_PROG_TYPE_XDP**
> - *			*ctx_in* and *ctx_out* must be NULL.
> - *
>    *		**BPF_PROG_TYPE_RAW_TRACEPOINT**,
>    *		**BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE**
>    *
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index aa47af349ba8..698618f2b27e 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -687,6 +687,38 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>   	return ret;
>   }
>   
> +static int xdp_convert_md_to_buff(struct xdp_buff *xdp, struct xdp_md *xdp_md)

Should the order of parameters be switched to (xdp_md, xdp)?
That would match the convention of xdp_convert_buff_to_md() below,
which takes the source first and the destination second.

> +{
> +	void *data;
> +
> +	if (!xdp_md)
> +		return 0;
> +
> +	if (xdp_md->egress_ifindex != 0)
> +		return -EINVAL;
> +
> +	if (xdp_md->data > xdp_md->data_end)
> +		return -EINVAL;
> +
> +	xdp->data = xdp->data_meta + xdp_md->data;
> +
> +	if (xdp_md->ingress_ifindex != 0 || xdp_md->rx_queue_index != 0)
> +		return -EINVAL;

It would be better to do all of the error checking before assigning
xdp->data. Also, it looks like the xdp_md error checking is split between
this function and bpf_prog_test_run_xdp(). If it is hard to put all of the
error checking in bpf_prog_test_run_xdp(), could you at least move the
"xdp_md->data > xdp_md->data_end" check there, so that this function only
checks the *_ifindex fields and rx_queue_index?
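
For illustration, here is a minimal user-space sketch of the
"validate everything, then mutate" ordering. The struct and function
names (md_sketch, buff_sketch, convert_md_to_buff_sketch) are
hypothetical stand-ins, not the kernel definitions, and the parameter
order assumes the (xdp_md, xdp) suggestion above is taken.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified stand-ins for struct xdp_md and struct xdp_buff (assumption:
 * only the fields touched by the conversion are modelled here). */
struct md_sketch {
	uint32_t data;
	uint32_t data_end;
	uint32_t data_meta;
	uint32_t ingress_ifindex;
	uint32_t rx_queue_index;
	uint32_t egress_ifindex;
};

struct buff_sketch {
	unsigned char *data;
	unsigned char *data_end;
	unsigned char *data_meta;
};

/* Source first, destination second. */
static int convert_md_to_buff_sketch(const struct md_sketch *md,
				     struct buff_sketch *buff)
{
	if (!md)
		return 0;

	/* All validation happens first, so a failure leaves *buff untouched. */
	if (md->egress_ifindex != 0)
		return -EINVAL;
	if (md->ingress_ifindex != 0 || md->rx_queue_index != 0)
		return -EINVAL;
	if (md->data > md->data_end)
		return -EINVAL;

	/* Only now is the destination modified. */
	buff->data = buff->data_meta + md->data;
	return 0;
}
```

With this ordering, a caller that gets -EINVAL back can rely on the
buffer pointers being exactly as it left them.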


> +
> +	return 0;
> +}
> +
> +static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)
> +{
> +	if (!xdp_md)
> +		return;
> +
> +	/* xdp_md->data_meta must always point to the start of the out buffer */
> +	xdp_md->data_meta = 0;
> +	xdp_md->data = xdp->data - xdp->data_meta;
> +	xdp_md->data_end = xdp->data_end - xdp->data_meta;
> +}
> +
>   int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>   			  union bpf_attr __user *uattr)
>   {
> @@ -696,36 +728,68 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
>   	u32 repeat = kattr->test.repeat;
>   	struct netdev_rx_queue *rxqueue;
>   	struct xdp_buff xdp = {};
> +	struct xdp_md *ctx;

Let us try to maintain reverse Christmas tree ordering for the
declarations?
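
To make the ordering concrete: reverse Christmas tree means local
declarations run from the longest line to the shortest. The sketch below
is a small user-space checker, not kernel code. The helper name
is_reverse_christmas_tree and both string arrays are hypothetical, and
the arrays cover only a subset of the declarations in this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when no declaration line is longer than the one before it. */
static bool is_reverse_christmas_tree(const char *const *decls, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (strlen(decls[i]) > strlen(decls[i - 1]))
			return false;
	return true;
}

/* As posted: "struct xdp_md *ctx;" sits above a longer line. */
static const char *const as_posted[] = {
	"struct netdev_rx_queue *rxqueue;",
	"struct xdp_buff xdp = {};",
	"struct xdp_md *ctx;",
	"u32 retval, duration;",
	"u32 max_data_sz;",
	"void *data;",
	"int ret;",
};

/* One possible reordering that keeps line lengths non-increasing. */
static const char *const reordered[] = {
	"struct netdev_rx_queue *rxqueue;",
	"struct xdp_buff xdp = {};",
	"u32 retval, duration;",
	"struct xdp_md *ctx;",
	"u32 max_data_sz;",
	"void *data;",
	"int ret;",
};
```

Moving the new ctx declaration below "u32 retval, duration;" is enough
to restore the ordering in this subset.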

>   	u32 retval, duration;
>   	u32 max_data_sz;
>   	void *data;
>   	int ret;
>   
> -	if (kattr->test.ctx_in || kattr->test.ctx_out)
> -		return -EINVAL;
> +	ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
> +	if (IS_ERR(ctx))
> +		return PTR_ERR(ctx);
> +
> +	/* There can't be user provided data before the metadata */
> +	if (ctx) {
> +		if (ctx->data_meta)
> +			return -EINVAL;
> +		if (ctx->data_end != size)
> +			return -EINVAL;
> +		if (unlikely((ctx->data & (sizeof(__u32) - 1)) ||
> +			     ctx->data > 32))

Why 32? Should it be sizeof(struct xdp_md)?

> +			return -EINVAL;

As I mentioned in my earlier comment, it would be good to do some or
all of the input parameter validation here.

> +		/* Metadata is allocated from the headroom */
> +		headroom -= ctx->data;

sizeof(struct xdp_md) should be smaller than the headroom
(XDP_PACKET_HEADROOM), so we don't need a check here, but a comment
would be helpful so that people reading the code don't need to
double-check.
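
As a rough sketch of how that reasoning could be recorded, the
user-space fragment below states the bound at compile time. The SKETCH_*
macros and headroom_after_meta are hypothetical names. The value 256 is
assumed to match XDP_PACKET_HEADROOM, and 32 is the bound the patch
applies to ctx->data. In the kernel a comment or BUILD_BUG_ON() would
likely serve the same purpose.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: 256 mirrors XDP_PACKET_HEADROOM, 32 mirrors the
 * upper bound the patch places on ctx->data. */
#define SKETCH_XDP_PACKET_HEADROOM 256u
#define SKETCH_MAX_META_LEN 32u

/* Documents why "headroom -= ctx->data" cannot underflow. */
static_assert(SKETCH_MAX_META_LEN < SKETCH_XDP_PACKET_HEADROOM,
	      "metadata must fit inside the headroom");

/* Returns the headroom left after carving meta_len bytes out of it, or 0
 * when meta_len is not 4-byte aligned or exceeds the bound. */
static uint32_t headroom_after_meta(uint32_t meta_len)
{
	if ((meta_len & (sizeof(uint32_t) - 1)) ||
	    meta_len > SKETCH_MAX_META_LEN)
		return 0;
	return SKETCH_XDP_PACKET_HEADROOM - meta_len;
}
```

Because the bound is checked before the subtraction, the result is
always at least 224 for accepted inputs under these assumed values.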

> +	}
>   
>   	/* XDP have extra tailroom as (most) drivers use full page */
>   	max_data_sz = 4096 - headroom - tailroom;
>   
>   	data = bpf_test_init(kattr, max_data_sz, headroom, tailroom);
> -	if (IS_ERR(data))
> +	if (IS_ERR(data)) {
> +		kfree(ctx);
>   		return PTR_ERR(data);
> +	}
>   
>   	rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0);
>   	xdp_init_buff(&xdp, headroom + max_data_sz + tailroom,
>   		      &rxqueue->xdp_rxq);
>   	xdp_prepare_buff(&xdp, data, headroom, size, true);
>   
> +	ret = xdp_convert_md_to_buff(&xdp, ctx);
> +	if (ret) {
> +		kfree(data);
> +		kfree(ctx);
> +		return ret;
> +	}
> +
>   	bpf_prog_change_xdp(NULL, prog);
>   	ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
>   	if (ret)
>   		goto out;
> -	if (xdp.data != data + headroom || xdp.data_end != xdp.data + size)
> -		size = xdp.data_end - xdp.data;
> -	ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
> +
> +	if (xdp.data_meta != data + headroom || xdp.data_end != xdp.data_meta + size)
> +		size = xdp.data_end - xdp.data_meta;
> +
> +	xdp_convert_buff_to_md(&xdp, ctx);
> +
> +	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, size, retval, duration);
> +	if (!ret)
> +		ret = bpf_ctx_finish(kattr, uattr, ctx,
> +				     sizeof(struct xdp_md));
>   out:
>   	bpf_prog_change_xdp(prog, NULL);
>   	kfree(data);
> +	kfree(ctx);
>   	return ret;
>   }
>   
> @@ -809,7 +873,6 @@ int bpf_prog_test_run_flow_dissector(struct bpf_prog *prog,
>   	if (!ret)
>   		ret = bpf_ctx_finish(kattr, uattr, user_ctx,
>   				     sizeof(struct bpf_flow_keys));
> -
>   out:
>   	kfree(user_ctx);
>   	kfree(data);
> 
