netdev.vger.kernel.org archive mirror
From: David Miller <davem@davemloft.net>
To: tom@herbertland.com
Cc: alexei.starovoitov@gmail.com, bblanco@plumgrid.com,
	netdev@vger.kernel.org, jhs@mojatatu.com,
	saeedm@dev.mellanox.co.il, kafai@fb.com, brouer@redhat.com,
	as754m@att.com, gerlitz.or@gmail.com, john.fastabend@gmail.com,
	hannes@stressinduktion.org, tgraf@suug.ch, daniel@iogearbox.net,
	ttoukan.linux@gmail.com, haoxuany@fb.com
Subject: Re: [PATCH v10 12/12] bpf: add sample for xdp forwarding and rewrite
Date: Wed, 03 Aug 2016 11:29:47 -0700 (PDT)
Message-ID: <20160803.112947.1365083919840672357.davem@davemloft.net>
In-Reply-To: <CALx6S36ARsfC8XxPHYsbD+T6Dx3SOgxdxv2CDJWuVZGf+H1oAw@mail.gmail.com>

From: Tom Herbert <tom@herbertland.com>
Date: Wed, 3 Aug 2016 10:29:58 -0700

> On Wed, Aug 3, 2016 at 10:11 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>> On Wed, Aug 03, 2016 at 10:01:54AM -0700, Tom Herbert wrote:
>>> On Tue, Jul 19, 2016 at 12:16 PM, Brenden Blanco <bblanco@plumgrid.com> wrote:
>>> > Add a sample that rewrites and forwards packets out on the same
>>> > interface. Observed single core forwarding performance of ~10Mpps.
>>> >
>>> > Since the mlx4 driver under test recycles every single packet page, the
>>> > perf output shows almost exclusively just the ring management and bpf
>>> > program work. Slowdowns are likely occurring due to cache misses.
>>> >
>>> > Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
>>> > ---
>>> >  samples/bpf/Makefile    |   5 +++
>>> >  samples/bpf/xdp2_kern.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++
>>> >  2 files changed, 119 insertions(+)
>>> >  create mode 100644 samples/bpf/xdp2_kern.c
>>> >
>>> > diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
>>> > index 0e4ab3a..d2d2b35 100644
>>> > --- a/samples/bpf/Makefile
>>> > +++ b/samples/bpf/Makefile
>>> > @@ -22,6 +22,7 @@ hostprogs-y += map_perf_test
>>> >  hostprogs-y += test_overhead
>>> >  hostprogs-y += test_cgrp2_array_pin
>>> >  hostprogs-y += xdp1
>>> > +hostprogs-y += xdp2
>>> >
>>> >  test_verifier-objs := test_verifier.o libbpf.o
>>> >  test_maps-objs := test_maps.o libbpf.o
>>> > @@ -44,6 +45,8 @@ map_perf_test-objs := bpf_load.o libbpf.o map_perf_test_user.o
>>> >  test_overhead-objs := bpf_load.o libbpf.o test_overhead_user.o
>>> >  test_cgrp2_array_pin-objs := libbpf.o test_cgrp2_array_pin.o
>>> >  xdp1-objs := bpf_load.o libbpf.o xdp1_user.o
>>> > +# reuse xdp1 source intentionally
>>> > +xdp2-objs := bpf_load.o libbpf.o xdp1_user.o
>>> >
>>> >  # Tell kbuild to always build the programs
>>> >  always := $(hostprogs-y)
>>> > @@ -67,6 +70,7 @@ always += test_overhead_kprobe_kern.o
>>> >  always += parse_varlen.o parse_simple.o parse_ldabs.o
>>> >  always += test_cgrp2_tc_kern.o
>>> >  always += xdp1_kern.o
>>> > +always += xdp2_kern.o
>>> >
>>> >  HOSTCFLAGS += -I$(objtree)/usr/include
>>> >
>>> > @@ -88,6 +92,7 @@ HOSTLOADLIBES_spintest += -lelf
>>> >  HOSTLOADLIBES_map_perf_test += -lelf -lrt
>>> >  HOSTLOADLIBES_test_overhead += -lelf -lrt
>>> >  HOSTLOADLIBES_xdp1 += -lelf
>>> > +HOSTLOADLIBES_xdp2 += -lelf
>>> >
>>> >  # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
>>> >  #  make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
>>> > diff --git a/samples/bpf/xdp2_kern.c b/samples/bpf/xdp2_kern.c
>>> > new file mode 100644
>>> > index 0000000..38fe7e1
>>> > --- /dev/null
>>> > +++ b/samples/bpf/xdp2_kern.c
>>> > @@ -0,0 +1,114 @@
>>> > +/* Copyright (c) 2016 PLUMgrid
>>> > + *
>>> > + * This program is free software; you can redistribute it and/or
>>> > + * modify it under the terms of version 2 of the GNU General Public
>>> > + * License as published by the Free Software Foundation.
>>> > + */
>>> > +#define KBUILD_MODNAME "foo"
>>> > +#include <uapi/linux/bpf.h>
>>> > +#include <linux/in.h>
>>> > +#include <linux/if_ether.h>
>>> > +#include <linux/if_packet.h>
>>> > +#include <linux/if_vlan.h>
>>> > +#include <linux/ip.h>
>>> > +#include <linux/ipv6.h>
>>> > +#include "bpf_helpers.h"
>>> > +
>>> > +struct bpf_map_def SEC("maps") dropcnt = {
>>> > +       .type = BPF_MAP_TYPE_PERCPU_ARRAY,
>>> > +       .key_size = sizeof(u32),
>>> > +       .value_size = sizeof(long),
>>> > +       .max_entries = 256,
>>> > +};
>>> > +
>>> > +static void swap_src_dst_mac(void *data)
>>> > +{
>>> > +       unsigned short *p = data;
>>> > +       unsigned short dst[3];
>>> > +
>>> > +       dst[0] = p[0];
>>> > +       dst[1] = p[1];
>>> > +       dst[2] = p[2];
>>> > +       p[0] = p[3];
>>> > +       p[1] = p[4];
>>> > +       p[2] = p[5];
>>> > +       p[3] = dst[0];
>>> > +       p[4] = dst[1];
>>> > +       p[5] = dst[2];
>>> > +}
>>> > +
>>> > +static int parse_ipv4(void *data, u64 nh_off, void *data_end)
>>> > +{
>>> > +       struct iphdr *iph = data + nh_off;
>>> > +
>>> > +       if (iph + 1 > data_end)
>>> > +               return 0;
>>> > +       return iph->protocol;
>>> > +}
>>> > +
>>> > +static int parse_ipv6(void *data, u64 nh_off, void *data_end)
>>> > +{
>>> > +       struct ipv6hdr *ip6h = data + nh_off;
>>> > +
>>> > +       if (ip6h + 1 > data_end)
>>> > +               return 0;
>>> > +       return ip6h->nexthdr;
>>> > +}
>>> > +
>>> > +SEC("xdp1")
>>> > +int xdp_prog1(struct xdp_md *ctx)
>>> > +{
>>> > +       void *data_end = (void *)(long)ctx->data_end;
>>> > +       void *data = (void *)(long)ctx->data;
>>>
>>> Brenden,
>>>
>>> It seems that the cast to long here is done because data_end and data
>>> are u32s in xdp_md. So the effect is that we are upcasting a
>>> thirty-two bit integer into a sixty-four bit pointer (in fact, without
>>> the cast we see compiler warnings). I don't understand how this can be
>>> correct. Can you shed some light on this?
>>
>> please see:
>> http://lists.iovisor.org/pipermail/iovisor-dev/2016-August/000355.html
>>
> That doesn't explain it.

Yes, it does explain it. Think more about the word "meta" and what the
code generator might be doing.
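
One way to read the hint about "meta", as a rough userspace sketch rather
than the kernel's actual code: assume the u32 fields in xdp_md are only
placeholders that name a field, and that each load of such a field is
rewritten at program load time into a pointer-sized load from the real
buffer structure. Under that assumption the (void *)(long) cast never
widens a truncated value; it only keeps the compiler quiet about the
declared u32 type. All names below (md_sketch, buff_sketch,
rewritten_ctx_load, header_fits) are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Program-visible context: 32-bit "meta" fields, shaped like xdp_md. */
struct md_sketch {
	uint32_t data;
	uint32_t data_end;
};

/* Hypothetical stand-in for the kernel-side buffer holding real pointers. */
struct buff_sketch {
	void *data;
	void *data_end;
};

/*
 * Stand-in for the assumed load-time rewrite: a read of the md_sketch
 * field at offset md_off becomes a full pointer-sized read of the
 * matching buff_sketch field. The u32 width is never actually used.
 */
static void *rewritten_ctx_load(const struct buff_sketch *buff, size_t md_off)
{
	switch (md_off) {
	case offsetof(struct md_sketch, data):
		return buff->data;
	case offsetof(struct md_sketch, data_end):
		return buff->data_end;
	default:
		return NULL; /* not a known context field */
	}
}

/* Bounds check in the style of the sample: 1 if hdr_len bytes fit, else 0. */
static int header_fits(const struct buff_sketch *buff, size_t hdr_len)
{
	char *data = rewritten_ctx_load(buff, offsetof(struct md_sketch, data));
	char *data_end = rewritten_ctx_load(buff,
					    offsetof(struct md_sketch, data_end));

	assert(data && data_end && data <= data_end);
	return (size_t)(data_end - data) >= hdr_len;
}
```

With a 64-byte buffer, header_fits() accepts a 14-byte Ethernet header and
rejects a 65-byte request, and the pointers it compares are the full-width
ones from buff_sketch, not values reconstructed from 32 bits.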

