netdev.vger.kernel.org archive mirror
From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Brenden Blanco <bblanco@plumgrid.com>
Cc: davem@davemloft.net, netdev@vger.kernel.org,
	Martin KaFai Lau <kafai@fb.com>, Ari Saha <as754m@att.com>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Or Gerlitz <gerlitz.or@gmail.com>,
	john.fastabend@gmail.com, hannes@stressinduktion.org,
	Thomas Graf <tgraf@suug.ch>, Tom Herbert <tom@herbertland.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	brouer@redhat.com
Subject: Re: [PATCH v6 01/12] bpf: add XDP prog type for early driver filter
Date: Sat, 9 Jul 2016 10:14:03 +0200
Message-ID: <20160709101403.1ed7d021@redhat.com>
In-Reply-To: <1467944124-14891-2-git-send-email-bblanco@plumgrid.com>

On Thu,  7 Jul 2016 19:15:13 -0700
Brenden Blanco <bblanco@plumgrid.com> wrote:

> Add a new bpf prog type that is intended to run in early stages of the
> packet rx path. Only minimal packet metadata will be available, hence a
> new context type, struct xdp_md, is exposed to userspace. So far only
> expose the packet start and end pointers, and only in read mode.
> 
> An XDP program must return one of the well known enum values, all other
> return codes are reserved for future use. Unfortunately, this
> restriction is hard to enforce at verification time, so take the
> approach of warning at runtime when such programs are encountered. The
> driver can choose to implement unknown return codes however it wants,
> but must invoke the warning helper with the action value.

I believe we should define stronger semantics for unknown/future
return codes than the ones stated above:
 "driver can choose to implement unknown return codes however it wants"

The mlx4 driver implementation in:
 [PATCH v6 04/12] net/mlx4_en: add support for fast rx drop bpf program

On Thu,  7 Jul 2016 19:15:16 -0700 Brenden Blanco <bblanco@plumgrid.com> wrote:

> +		/* A bpf program gets first chance to drop the packet. It may
> +		 * read bytes but not past the end of the frag.
> +		 */
> +		if (prog) {
> +			struct xdp_buff xdp;
> +			dma_addr_t dma;
> +			u32 act;
> +
> +			dma = be64_to_cpu(rx_desc->data[0].addr);
> +			dma_sync_single_for_cpu(priv->ddev, dma,
> +						priv->frag_info[0].frag_size,
> +						DMA_FROM_DEVICE);
> +
> +			xdp.data = page_address(frags[0].page) +
> +							frags[0].page_offset;
> +			xdp.data_end = xdp.data + length;
> +
> +			act = bpf_prog_run_xdp(prog, &xdp);
> +			switch (act) {
> +			case XDP_PASS:
> +				break;
> +			default:
> +				bpf_warn_invalid_xdp_action(act);
> +			case XDP_DROP:
> +				goto next;
> +			}
> +		}

Thus, the mlx4 choice is to drop packets for unknown/future return codes.

I think this is the wrong choice.  I think the choice should be
XDP_PASS, to pass the packet up the stack.

I find "XDP_DROP" problematic because it happens so early in the driver
that we lose all possibility of debugging which packets get dropped.  We
get a single kernel log warning, but we cannot inspect the packets any
longer.  By defaulting to XDP_PASS, all the normal stack tools (e.g.
tcpdump) are available.
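
As a rough illustration of the alternative, the switch could be
arranged so that unknown codes warn and then fall through to XDP_PASS.
This is a user-space sketch only, not the mlx4 code: enum rx_verdict,
rx_handle_xdp_action() and warn_invalid_xdp_action() are hypothetical
names, and the last one is a simplified stand-in for
bpf_warn_invalid_xdp_action().

```c
#include <stdio.h>

/* Mirrors enum xdp_action from the quoted patch. */
enum xdp_action {
	XDP_DROP,
	XDP_PASS,
};

/* What the driver does with the frame (hypothetical names). */
enum rx_verdict {
	RX_DROP,		/* recycle the buffer, stack never sees it */
	RX_PASS_TO_STACK,	/* continue with normal delivery */
};

/* Set once the first unknown action code has been reported. */
static int warned;

/* User-space stand-in for bpf_warn_invalid_xdp_action(): warn once. */
static void warn_invalid_xdp_action(unsigned int act)
{
	if (!warned) {
		fprintf(stderr, "XDP program returned unknown value %u\n", act);
		warned = 1;
	}
}

/* Proposed policy: unknown return codes warn, then behave like XDP_PASS. */
static enum rx_verdict rx_handle_xdp_action(unsigned int act)
{
	switch (act) {
	case XDP_DROP:
		return RX_DROP;
	default:
		warn_invalid_xdp_action(act);
		/* fall through */
	case XDP_PASS:
		return RX_PASS_TO_STACK;
	}
}
```

With this ordering only an explicit XDP_DROP discards the packet; any
other value still reaches the stack, where tcpdump can see it.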


I can also imagine that defaulting to XDP_PASS can be an important
feature in the future.

In the future we will likely have features where XDP can "offload"
packet delivery from the normal stack (e.g. delivery into a VM).  On a
running production system you can then load your XDP program.  If the
driver is too old and defaults to XDP_DROP, then you lose your service;
if it instead defaults to XDP_PASS, your service survives, falling
back to normal delivery.

(For the VM delivery use-case, there will likely be a need for having a
fallback delivery method in place when the XDP program is not active,
in order to support VM migration.)
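
To make the old-driver scenario concrete, here is a small user-space
model comparing the two defaults.  Everything in it is an assumption
for illustration: XDP_FUTURE_ACTION is a made-up code that an old
driver does not know, and old_driver_delivers() and delivered() are
hypothetical helpers, not kernel functions.

```c
/* Mirrors enum xdp_action from the quoted patch. */
enum xdp_action { XDP_DROP, XDP_PASS };

/* Hypothetical action code unknown to the old driver; value is illustrative. */
#define XDP_FUTURE_ACTION 2u

/* The two candidate behaviours for unknown return codes. */
enum unknown_policy { UNKNOWN_DROPS, UNKNOWN_PASSES };

/* Returns 1 if the packet reaches the normal stack, 0 if it is dropped. */
static int old_driver_delivers(unsigned int act, enum unknown_policy policy)
{
	switch (act) {
	case XDP_PASS:
		return 1;
	case XDP_DROP:
		return 0;
	default:
		/* The old driver does not know this code: apply the policy. */
		return policy == UNKNOWN_PASSES;
	}
}

/* Count how many of n packets reach the stack when all get the same action. */
static unsigned int delivered(unsigned int n, unsigned int act,
			      enum unknown_policy policy)
{
	unsigned int i, count = 0;

	for (i = 0; i < n; i++)
		count += old_driver_delivers(act, policy);
	return count;
}
```

In this model, delivered(100, XDP_FUTURE_ACTION, UNKNOWN_DROPS) is 0
(the service is lost), while the same call with UNKNOWN_PASSES is 100
(the service falls back to normal delivery).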



> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c14ca1c..5b47ac3 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
[...]
>  
> +/* User return codes for XDP prog type.
> + * A valid XDP program must return one of these defined values. All other
> + * return codes are reserved for future use. Unknown return codes will result
> + * in driver-dependent behavior.
> + */
> +enum xdp_action {
> +	XDP_DROP,
> +	XDP_PASS,
> +};
> +
[...]
>  #endif /* _UAPI__LINUX_BPF_H__ */
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index e206c21..a8d67d0 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
[...]
> +void bpf_warn_invalid_xdp_action(int act)
> +{
> +	WARN_ONCE(1, "\n"
> +		     "*****************************************************\n"
> +		     "**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n"
> +		     "**                                               **\n"
> +		     "** XDP program returned unknown value %-10u **\n"
> +		     "**                                               **\n"
> +		     "** XDP programs must return a well-known return  **\n"
> +		     "** value. Invalid return values will result in   **\n"
> +		     "** undefined packet actions.                     **\n"
> +		     "**                                               **\n"
> +		     "**   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **\n"
> +		     "*****************************************************\n",
> +		  act);
> +}
> +EXPORT_SYMBOL_GPL(bpf_warn_invalid_xdp_action);
> +


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

