* [RFC PATCH v2 0/5] Add driver bpf hook for early packet drop
From: Brenden Blanco @ 2016-04-08 4:47 UTC
To: davem
Cc: Brenden Blanco, netdev, tom, alexei.starovoitov, ogerlitz, daniel,
brouer, eric.dumazet, ecree, john.fastabend, tgraf, johannes,
eranlinuxmellanox, lorenzo
This patch set introduces new infrastructure for programmatically
processing packets in the earliest stages of rx, as part of an effort
others are calling Express Data Path (XDP) [1]. It starts by
introducing a new bpf program type for early packet filtering, before
an skb has even been allocated.

The hope is to enable line-rate filtering; this initial implementation
provides drop/allow actions only.
Patch 1 introduces the new prog type and helpers for validating the bpf
program. A new userspace struct is defined containing only a len field,
with others to follow in the future.
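As a rough sketch (not code from the patch), the uapi additions could
look like the following; bpf_phys_dev_md and the bpf_phys_dev_ prefix
are named in this letter and the v2 notes below, while the individual
verdict names are illustrative guesses:

    /* metadata passed to the phys_dev bpf program */
    struct bpf_phys_dev_md {
            __u32 len;      /* length of the frame's packet data */
    };

    /* hypothetical verdict codes returned from the hook */
    enum bpf_phys_dev_action {
            BPF_PHYS_DEV_DROP,      /* drop the frame; no skb is built */
            BPF_PHYS_DEV_OK,        /* continue up the normal rx path */
    };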
Patch 2 adds a new ndo to pass the bpf program to supporting drivers.
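A minimal sketch of the two ndo additions; the signatures follow the
description here and the v2 notes (prog pointer in, bool out), but
ndo_bpf_set is an assumed name:

    /* in struct net_device_ops */
    int  (*ndo_bpf_set)(struct net_device *dev, struct bpf_prog *prog);
    bool (*ndo_bpf_get)(struct net_device *dev);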
Patch 3 exposes a new rtnl option to userspace.
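In rough terms, the rtnl handler could reduce to something like the
following; IFLA_BPF_FD and BPF_PROG_TYPE_PHYS_DEV are assumed names,
but the fd -> prog -> type check -> ndo flow is what the series
describes:

    if (tb[IFLA_BPF_FD]) {
            struct bpf_prog *prog;

            prog = bpf_prog_get(nla_get_s32(tb[IFLA_BPF_FD]));
            if (IS_ERR(prog))
                    return PTR_ERR(prog);

            /* per the v2 notes, check the type just before the ndo */
            if (prog->type != BPF_PROG_TYPE_PHYS_DEV) {
                    bpf_prog_put(prog);
                    return -EINVAL;
            }

            err = dev->netdev_ops->ndo_bpf_set(dev, prog);
    }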
Patch 4 enables support in the mlx4 driver. No skb allocation is
required; instead, a static percpu skb is kept in the driver and
minimally initialized for each rx frag.
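Per the v2 notes below, the driver also enforces that the prog only
ever sees a single frag. That constraint might reduce to two small
checks, sketched here with an assumed field name (priv->prog) and
assumed error codes; FRAG_SZ0 and priv->num_frags are named in the
notes:

    /* when attaching a prog: rx frames must fit in one frag */
    if (priv->num_frags > 1)
            return -EOPNOTSUPP;

    /* when raising the mtu while a prog is attached */
    if (priv->prog && new_mtu > FRAG_SZ0)
            return -EINVAL;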
Patch 5 adds a sample drop-and-count program. With a single core, it
achieved a ~20 Mpps drop rate on a 40G mlx4. This includes packet data
access, a bpf array lookup, and an increment.
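A plausible shape for the sample's kernel side, as a sketch only: the
context type and drop verdict reuse the assumed names from above, the
section name and map layout are illustrative, and load_byte() is the
helper that samples/bpf/bpf_helpers.h already provides:

    #include <uapi/linux/bpf.h>
    #include "bpf_helpers.h"

    /* drop counters, bucketed by ip protocol */
    struct bpf_map_def SEC("maps") dropcnt = {
            .type = BPF_MAP_TYPE_ARRAY,
            .key_size = sizeof(__u32),
            .value_size = sizeof(long),
            .max_entries = 256,
    };

    SEC("phys_dev")
    int drop_and_count(struct bpf_phys_dev_md *ctx)
    {
            /* ip protocol byte: 14-byte ethernet header + offset 9 */
            __u32 key = load_byte(ctx, 14 + 9);
            long *value;

            value = bpf_map_lookup_elem(&dropcnt, &key);
            if (value)
                    *value += 1;

            return BPF_PHYS_DEV_DROP;       /* drop everything */
    }

    char _license[] SEC("license") = "GPL";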
Interestingly, accessing packet data from the program did not have a
noticeable impact on performance. Even so, future enhancements to
prefetching, batching, and page allocation should improve performance
on this path further.
[1] https://github.com/iovisor/bpf-docs/blob/master/Express_Data_Path.pdf
v2:
 1/5: Drop xdp from types, instead consistently use bpf_phys_dev_.
      Introduce enum for return values from phys_dev hook.
 2/5: Move prog->type check to just before invoking ndo.
      Change ndo to take a bpf_prog * instead of fd.
      Add ndo_bpf_get rather than keeping a bool in the netdev struct.
 3/5: Use ndo_bpf_get to fetch bool.
 4/5: Enforce that only one frag is ever given to the bpf prog, by
      disallowing the mtu from growing beyond FRAG_SZ0 while a bpf prog
      is attached, and conversely by refusing to set a bpf prog when
      priv->num_frags > 1.
      Rename pseudo_skb to bpf_phys_dev_md.
      Implement ndo_bpf_get.
      Add dma sync just before invoking prog.
      Check for explicit bpf return code rather than nonzero.
      Remove increment of rx_dropped.
 5/5: Use explicit bpf return code in example.
      Update commit log with higher pps numbers.
Brenden Blanco (5):
  bpf: add PHYS_DEV prog type for early driver filter
  net: add ndo to set bpf prog in adapter rx
  rtnl: add option for setting link bpf prog
  mlx4: add support for fast rx drop bpf program
  Add sample for adding simple drop program to link
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  65 +++++++++++
 drivers/net/ethernet/mellanox/mlx4/en_rx.c     |  25 +++-
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |   6 +
 include/linux/netdevice.h                      |  13 +++
 include/uapi/linux/bpf.h                       |  14 +++
 include/uapi/linux/if_link.h                   |   1 +
 kernel/bpf/verifier.c                          |   1 +
 net/core/dev.c                                 |  38 ++++++
 net/core/filter.c                              |  68 +++++++++++
 net/core/rtnetlink.c                           |  12 ++
 samples/bpf/Makefile                           |   4 +
 samples/bpf/bpf_load.c                         |   8 ++
 samples/bpf/netdrvx1_kern.c                    |  26 +++++
 samples/bpf/netdrvx1_user.c                    | 155 +++++++++++++++++++++++++
 14 files changed, 432 insertions(+), 4 deletions(-)
 create mode 100644 samples/bpf/netdrvx1_kern.c
 create mode 100644 samples/bpf/netdrvx1_user.c
--
2.8.0
* Re: [RFC PATCH v2 0/5] Add driver bpf hook for early packet drop
From: Jamal Hadi Salim @ 2016-04-09 14:37 UTC
To: Brenden Blanco, davem
Cc: netdev, tom, alexei.starovoitov, ogerlitz, daniel, brouer,
eric.dumazet, ecree, john.fastabend, tgraf, johannes,
eranlinuxmellanox, lorenzo
On 16-04-08 12:47 AM, Brenden Blanco wrote:
> This patch set introduces new infrastructure for programmatically
> processing packets in the earliest stages of rx, as part of an effort
> others are calling Express Data Path (XDP) [1]. It starts by
> introducing a new bpf program type for early packet filtering, before
> an skb has even been allocated.
>
> The hope is to enable line-rate filtering; this initial implementation
> provides drop/allow actions only.
>
> Patch 1 introduces the new prog type and helpers for validating the bpf
> program. A new userspace struct is defined containing only a len field,
> with others to follow in the future.
> Patch 2 adds a new ndo to pass the bpf program to supporting drivers.
> Patch 3 exposes a new rtnl option to userspace.
> Patch 4 enables support in the mlx4 driver. No skb allocation is
> required; instead, a static percpu skb is kept in the driver and
> minimally initialized for each rx frag.
> Patch 5 adds a sample drop-and-count program. With a single core, it
> achieved a ~20 Mpps drop rate on a 40G mlx4. This includes packet data
> access, a bpf array lookup, and an increment.
Hrm. That doesn't sound very high (less than 50%?).
Is the driver the main overhead?
For comparison, I'd be curious what you'd get if you just dropped
everything without bpf, and alternatively with tc + bpf running the
same program, on the one cpu.
The numbers we had for the NUC with tc on a single core were a bit
higher than 20 Mpps, but there was no driver overhead - so I expected
to see much higher numbers if you did it at the driver...
Note that back in the day Alexey (not Alexei ;->) had a built-in
driver-level forwarder; however, the advantage there derived from
packets being DMAed from the ingress to the egress port after some
simple lookup.
cheers,
jamal