From: Jakub Kicinski <kubakici@wp.pl>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: netdev@vger.kernel.org, "Michael S. Tsirkin" <mst@redhat.com>,
Jason Wang <jasowang@redhat.com>,
mchan@broadcom.com, John Fastabend <john.fastabend@gmail.com>,
peter.waskiewicz.jr@intel.com,
Daniel Borkmann <borkmann@iogearbox.net>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Andy Gospodarek <andy@greyhouse.net>
Subject: Re: [net-next V2 PATCH 1/5] bpf: introduce new bpf cpu map type BPF_MAP_TYPE_CPUMAP
Date: Fri, 29 Sep 2017 11:41:54 -0700
Message-ID: <20170929114154.4b5d5918@cakuba>
In-Reply-To: <150670285218.23765.2480801081343646072.stgit@firesoul>
On Fri, 29 Sep 2017 18:34:12 +0200, Jesper Dangaard Brouer wrote:
> The 'cpumap' is primary used as a backend map for XDP BPF helper
> call bpf_redirect_map() and XDP_REDIRECT action, like 'devmap'.
>
> This patch implement the main part of the map. It is not connected to
> the XDP redirect system yet, and no SKB allocation are done yet.
>
> The main concern in this patch is to ensure the datapath can run
> without any locking. This adds complexity to the setup and tear-down
> procedure, which assumptions are extra carefully documented in the
> code comments.
>
> V2: make sure array isn't larger than num possible CPUs
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
A few trivial nitpicks, hope you don't mind :)
> @@ -0,0 +1,555 @@
> +/* bpf/cpumap.c
> + *
> + * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc.
> + * Released under terms in GPL version 2. See COPYING.
> + */
> +
> +/* The 'cpumap' is primary used as a backend map for XDP BPF helper
> + * call bpf_redirect_map() and XDP_REDIRECT action, like 'devmap'.
> + *
> + * Unlike devmap which redirect XDP frames out another NIC device,
> + * this map type redirect raw XDP frames to another CPU. The remote
> + * CPU will do SKB-allocation and call the normal network stack.
> + *
> + * This is a scalability and isolation mechanism, that allow
> + * separating the early driver network XDP layer, from the rest of the
> + * netstack, and assigning dedicated CPUs for this stage. This
> + * basically allows for 10G wirespeed pre-filtering via bpf.
> + */
> +#include <linux/bpf.h>
> +#include <linux/filter.h>
> +#include <linux/ptr_ring.h>
> +
> +#include <linux/sched.h>
> +#include <linux/workqueue.h>
> +#include <linux/kthread.h>
> +
> +/*
> + * General idea: XDP packets getting XDP redirected to another CPU,
> + * will maximum be stored/queued for one driver ->poll() call. It is
> + * guaranteed that setting flush bit and flush operation happen on
> + * same CPU. Thus, cpu_map_flush operation can deduct via this_cpu_ptr()
> + * which queue in bpf_cpu_map_entry contains packets.
> + */
> +
> +#define CPU_MAP_BULK_SIZE 8 /* 8 == one cacheline on 64-bit archs */
> +struct xdp_bulk_queue {
> + void *q[CPU_MAP_BULK_SIZE];
> + unsigned int count;
> +};
Out of curiosity - would it make sense to make sure the entire struct
fits into a cache line? The comment seems to indicate that the array is
sized to fit a cache line, but then there is also the count member...
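To put numbers on the question, here is a minimal userspace sketch. It assumes 64-byte cache lines and an LP64 ABI (8-byte pointers, 4-byte unsigned int). CACHELINE_BYTES, DEMO_BULK_SIZE, demo_bulk_queue_compact and demo_cachelines() are names made up for illustration and are not part of the patch. Under those assumptions the array alone fills one cache line, count lands at offset 64, and the struct is padded to 72 bytes:

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES   64	/* assumption: 64-byte cache lines */
#define CPU_MAP_BULK_SIZE 8

/* Layout as posted in the patch. */
struct xdp_bulk_queue {
	void *q[CPU_MAP_BULK_SIZE];
	unsigned int count;
};

/* Hypothetical alternative (not from the patch): give up one slot so
 * that the array and the counter can share a single cache line.
 */
#define DEMO_BULK_SIZE 7
struct demo_bulk_queue_compact {
	void *q[DEMO_BULK_SIZE];
	unsigned int count;
};

/* Number of cache lines an object of the given size spans, assuming
 * the object starts on a cache-line boundary.
 */
static inline size_t demo_cachelines(size_t bytes)
{
	return (bytes + CACHELINE_BYTES - 1) / CACHELINE_BYTES;
}
```

Whether trading a slot for the single-line layout is worth it depends on how the per-CPU allocation is aligned and on measurements, so treat the compact variant only as one possible option.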
> +/*
> + * After xchg pointer to bpf_cpu_map_entry, use the call_rcu() to
...
There is a mix of networking and non-networking style comments in this
file. Is this intentional?
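For reference, the two multi-line comment styles in question look roughly like this. The DEMO_* macros are placeholders invented only to give each comment something to attach to:

```c
/* Networking style: the text starts on the same line as the comment
 * opener, as in the file header comment of the quoted patch.
 */
#define DEMO_NET_STYLE 1

/*
 * Style used in most of the rest of the kernel: the opener sits on a
 * line of its own, as in the "General idea" comment of the quoted patch.
 */
#define DEMO_GENERIC_STYLE 2
```

Either form is readable; the point is only to pick one and use it consistently within the file.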
> +const struct bpf_map_ops cpu_map_ops = {
> + .map_alloc = cpu_map_alloc,
> + .map_free = cpu_map_free,
> + .map_delete_elem = cpu_map_delete_elem,
> + .map_update_elem = cpu_map_update_elem,
> + .map_lookup_elem = cpu_map_lookup_elem,
> + .map_get_next_key = cpu_map_get_next_key,
> +};
> +
> +
Extra new line.
> +/* Runs under RCU-read-side, plus in softirq under NAPI protection.
> + * Thus, safe percpu variable access.
> + */
> +static int bq_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_pkt *xdp_pkt)
> +{
> + struct xdp_bulk_queue *bq = this_cpu_ptr(rcpu->bulkq);
> +
> + if (unlikely(bq->count == CPU_MAP_BULK_SIZE)) {
> + bq_flush_to_queue(rcpu, bq);
> + }
Curly brackets not needed.
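A userspace sketch of the brace-less form follows. The demo_* names are stand-ins invented for illustration, unlikely() is redefined on top of the GCC/Clang builtin, and the append after the check is an assumption about how the function continues, since the quoted hunk stops before that point:

```c
#include <assert.h>
#include <stddef.h>

#define CPU_MAP_BULK_SIZE 8
#define unlikely(x) __builtin_expect(!!(x), 0)	/* GCC/Clang builtin */

struct demo_bulk_queue {
	void *q[CPU_MAP_BULK_SIZE];
	unsigned int count;
	unsigned int flushes;	/* demo-only: counts flush calls */
};

/* Stand-in for bq_flush_to_queue(): drops the batch, counts the flush. */
static void demo_flush(struct demo_bulk_queue *bq)
{
	bq->count = 0;
	bq->flushes++;
}

static void demo_enqueue(struct demo_bulk_queue *bq, void *pkt)
{
	/* Single-statement body, so kernel coding style omits the braces. */
	if (unlikely(bq->count == CPU_MAP_BULK_SIZE))
		demo_flush(bq);

	/* Assumed continuation: store the packet in the next free slot. */
	bq->q[bq->count++] = pkt;
}
```

With this shape, the ninth enqueue on a full queue triggers exactly one flush and then lands in slot 0.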