From: David Miller <davem@davemloft.net>
To: tom@herbertland.com
Cc: alexei.starovoitov@gmail.com, bblanco@plumgrid.com,
netdev@vger.kernel.org, jhs@mojatatu.com,
saeedm@dev.mellanox.co.il, kafai@fb.com, brouer@redhat.com,
as754m@att.com, gerlitz.or@gmail.com, john.fastabend@gmail.com,
hannes@stressinduktion.org, tgraf@suug.ch, daniel@iogearbox.net,
ttoukan.linux@gmail.com, haoxuany@fb.com
Subject: Re: [PATCH v10 12/12] bpf: add sample for xdp forwarding and rewrite
Date: Wed, 03 Aug 2016 11:29:47 -0700 (PDT)
Message-ID: <20160803.112947.1365083919840672357.davem@davemloft.net>
In-Reply-To: <CALx6S36ARsfC8XxPHYsbD+T6Dx3SOgxdxv2CDJWuVZGf+H1oAw@mail.gmail.com>
From: Tom Herbert <tom@herbertland.com>
Date: Wed, 3 Aug 2016 10:29:58 -0700
> On Wed, Aug 3, 2016 at 10:11 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>> On Wed, Aug 03, 2016 at 10:01:54AM -0700, Tom Herbert wrote:
>>> On Tue, Jul 19, 2016 at 12:16 PM, Brenden Blanco <bblanco@plumgrid.com> wrote:
>>> > Add a sample that rewrites and forwards packets out on the same
>>> > interface. Observed single core forwarding performance of ~10Mpps.
>>> >
>>> > Since the mlx4 driver under test recycles every single packet page, the
>>> > perf output shows almost exclusively just the ring management and bpf
>>> > program work. Slowdowns are likely occurring due to cache misses.
>>> >
>>> > Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
>>> > ---
>>> > samples/bpf/Makefile | 5 +++
>>> > samples/bpf/xdp2_kern.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++
>>> > 2 files changed, 119 insertions(+)
>>> > create mode 100644 samples/bpf/xdp2_kern.c
>>> >
>>> > diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
>>> > index 0e4ab3a..d2d2b35 100644
>>> > --- a/samples/bpf/Makefile
>>> > +++ b/samples/bpf/Makefile
>>> > @@ -22,6 +22,7 @@ hostprogs-y += map_perf_test
>>> > hostprogs-y += test_overhead
>>> > hostprogs-y += test_cgrp2_array_pin
>>> > hostprogs-y += xdp1
>>> > +hostprogs-y += xdp2
>>> >
>>> > test_verifier-objs := test_verifier.o libbpf.o
>>> > test_maps-objs := test_maps.o libbpf.o
>>> > @@ -44,6 +45,8 @@ map_perf_test-objs := bpf_load.o libbpf.o map_perf_test_user.o
>>> > test_overhead-objs := bpf_load.o libbpf.o test_overhead_user.o
>>> > test_cgrp2_array_pin-objs := libbpf.o test_cgrp2_array_pin.o
>>> > xdp1-objs := bpf_load.o libbpf.o xdp1_user.o
>>> > +# reuse xdp1 source intentionally
>>> > +xdp2-objs := bpf_load.o libbpf.o xdp1_user.o
>>> >
>>> > # Tell kbuild to always build the programs
>>> > always := $(hostprogs-y)
>>> > @@ -67,6 +70,7 @@ always += test_overhead_kprobe_kern.o
>>> > always += parse_varlen.o parse_simple.o parse_ldabs.o
>>> > always += test_cgrp2_tc_kern.o
>>> > always += xdp1_kern.o
>>> > +always += xdp2_kern.o
>>> >
>>> > HOSTCFLAGS += -I$(objtree)/usr/include
>>> >
>>> > @@ -88,6 +92,7 @@ HOSTLOADLIBES_spintest += -lelf
>>> > HOSTLOADLIBES_map_perf_test += -lelf -lrt
>>> > HOSTLOADLIBES_test_overhead += -lelf -lrt
>>> > HOSTLOADLIBES_xdp1 += -lelf
>>> > +HOSTLOADLIBES_xdp2 += -lelf
>>> >
>>> > # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
>>> > # make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
>>> > diff --git a/samples/bpf/xdp2_kern.c b/samples/bpf/xdp2_kern.c
>>> > new file mode 100644
>>> > index 0000000..38fe7e1
>>> > --- /dev/null
>>> > +++ b/samples/bpf/xdp2_kern.c
>>> > @@ -0,0 +1,114 @@
>>> > +/* Copyright (c) 2016 PLUMgrid
>>> > + *
>>> > + * This program is free software; you can redistribute it and/or
>>> > + * modify it under the terms of version 2 of the GNU General Public
>>> > + * License as published by the Free Software Foundation.
>>> > + */
>>> > +#define KBUILD_MODNAME "foo"
>>> > +#include <uapi/linux/bpf.h>
>>> > +#include <linux/in.h>
>>> > +#include <linux/if_ether.h>
>>> > +#include <linux/if_packet.h>
>>> > +#include <linux/if_vlan.h>
>>> > +#include <linux/ip.h>
>>> > +#include <linux/ipv6.h>
>>> > +#include "bpf_helpers.h"
>>> > +
>>> > +struct bpf_map_def SEC("maps") dropcnt = {
>>> > + .type = BPF_MAP_TYPE_PERCPU_ARRAY,
>>> > + .key_size = sizeof(u32),
>>> > + .value_size = sizeof(long),
>>> > + .max_entries = 256,
>>> > +};
>>> > +
>>> > +static void swap_src_dst_mac(void *data)
>>> > +{
>>> > + unsigned short *p = data;
>>> > + unsigned short dst[3];
>>> > +
>>> > + dst[0] = p[0];
>>> > + dst[1] = p[1];
>>> > + dst[2] = p[2];
>>> > + p[0] = p[3];
>>> > + p[1] = p[4];
>>> > + p[2] = p[5];
>>> > + p[3] = dst[0];
>>> > + p[4] = dst[1];
>>> > + p[5] = dst[2];
>>> > +}
>>> > +
>>> > +static int parse_ipv4(void *data, u64 nh_off, void *data_end)
>>> > +{
>>> > + struct iphdr *iph = data + nh_off;
>>> > +
>>> > + if (iph + 1 > data_end)
>>> > + return 0;
>>> > + return iph->protocol;
>>> > +}
>>> > +
>>> > +static int parse_ipv6(void *data, u64 nh_off, void *data_end)
>>> > +{
>>> > + struct ipv6hdr *ip6h = data + nh_off;
>>> > +
>>> > + if (ip6h + 1 > data_end)
>>> > + return 0;
>>> > + return ip6h->nexthdr;
>>> > +}
>>> > +
>>> > +SEC("xdp1")
>>> > +int xdp_prog1(struct xdp_md *ctx)
>>> > +{
>>> > + void *data_end = (void *)(long)ctx->data_end;
>>> > + void *data = (void *)(long)ctx->data;
>>>
>>> Brenden,
>>>
>>> It seems that the cast to long here is done because data_end and data
>>> are u32s in xdp_md. So the effect is that we are upcasting a
>>> thirty-two bit integer into a sixty-four bit pointer (in fact without the
>>> cast we see compiler warnings). I don't understand how this can be
>>> correct. Can you shed some light on this?
>>
>> please see:
>> http://lists.iovisor.org/pipermail/iovisor-dev/2016-August/000355.html
>>
> That doesn't explain it.
Yes, it does explain it. Think more about the word "meta" and what the
code generator might be doing.
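
To make the "meta" hint concrete, here is a small userspace sketch of the
idea as I understand it: the program is compiled against a structure whose
u32 fields only name offsets, and each context load is rewritten at load
time into a pointer-width load from the real structure, so the register
already holds a full pointer before the (void *)(long) cast is applied.
The names meta_ctx, real_ctx, convert_ctx_access(), load_ctx_ptr() and
run_demo() are hypothetical stand-ins for illustration, not the kernel's
own code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What the program is compiled against: the fields only name offsets. */
struct meta_ctx {
	uint32_t data;
	uint32_t data_end;
};

/* Hypothetical stand-in for the structure really passed to the program. */
struct real_ctx {
	void *data;
	void *data_end;
};

/* Map an offset in the meta structure to an offset in the real one. */
size_t convert_ctx_access(size_t meta_off)
{
	switch (meta_off) {
	case offsetof(struct meta_ctx, data):
		return offsetof(struct real_ctx, data);
	case offsetof(struct meta_ctx, data_end):
		return offsetof(struct real_ctx, data_end);
	default:
		return (size_t)-1;	/* unknown field: reject */
	}
}

/* Emulate the rewritten load: a pointer-width read from the real ctx. */
void *load_ctx_ptr(const struct real_ctx *ctx, size_t meta_off)
{
	size_t real_off = convert_ctx_access(meta_off);
	void *p;

	assert(real_off != (size_t)-1);
	memcpy(&p, (const char *)ctx + real_off, sizeof(p));
	return p;
}

/* Returns 0 if both loads yield the full, untruncated pointers. */
int run_demo(void)
{
	unsigned char pkt[64] = { 0 };
	struct real_ctx ctx = { .data = pkt, .data_end = pkt + sizeof(pkt) };
	void *data = load_ctx_ptr(&ctx, offsetof(struct meta_ctx, data));
	void *data_end = load_ctx_ptr(&ctx, offsetof(struct meta_ctx, data_end));

	if (data != (void *)pkt || data_end != (void *)(pkt + sizeof(pkt)))
		return 1;
	return ((unsigned char *)data_end - (unsigned char *)data) == 64 ? 0 : 1;
}
```

Calling run_demo() from a main() should return 0. The point of the sketch
is that no 32-bit value is ever widened; in the sample program the cast
only keeps the C compiler quiet about the declared field type.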