bpf.vger.kernel.org archive mirror
From: Song Liu <liu.song.a23@gmail.com>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Networking <netdev@vger.kernel.org>,
	"Daniel Borkmann" <borkmann@iogearbox.net>,
	"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	bpf <bpf@vger.kernel.org>,
	"Toke Høiland-Jørgensen" <toke@toke.dk>
Subject: Re: [PATCH bpf-next 5/5] bpf: cpumap memory prefetchw optimizations for struct page
Date: Wed, 10 Apr 2019 16:35:45 -0700	[thread overview]
Message-ID: <CAPhsuW7Gbrsh0WKPWB1GPue_gKFnuT-byoSYYx3-qXfcvpDPYA@mail.gmail.com> (raw)
In-Reply-To: <155489663807.20826.15883373865371166146.stgit@firesoul>

On Wed, Apr 10, 2019 at 6:02 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> A lot of the performance gain comes from this patch.
>
> While analysing the performance overhead, it was found that the largest CPU
> stalls were caused when touching the struct page area. It is first read
> (with READ_ONCE) from build_skb_around() via page_is_pfmemalloc(), and then
> written to when freed by the page_frag_free() call.
>
> Measurements show that the prefetchw (write) variant of the operation is
> needed to achieve the performance gain. We believe this optimization is
> twofold: first, the W-variant saves one step in the cache-coherency
> protocol; second, it avoids the non-temporal prefetch HW optimizations and
> brings the data into all cache levels. It might be worth investigating
> whether a prefetch into L2 would have the same benefit.
>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

Acked-by: Song Liu <songliubraving@fb.com>


> ---
>  kernel/bpf/cpumap.c |   12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index b82a11556ad5..4758482ab5b9 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -281,6 +281,18 @@ static int cpu_map_kthread_run(void *data)
>                  * consume side valid as no-resize allowed of queue.
>                  */
>                 n = ptr_ring_consume_batched(rcpu->queue, frames, CPUMAP_BATCH);
> +
> +               for (i = 0; i < n; i++) {
> +                       void *f = frames[i];
> +                       struct page *page = virt_to_page(f);
> +
> +                       /* Bring struct page memory area to curr CPU. Read by
> +                        * build_skb_around via page_is_pfmemalloc(), and when
> +                        * freed written by page_frag_free call.
> +                        */
> +                       prefetchw(page);
> +               }
> +
>                 m = kmem_cache_alloc_bulk(skbuff_head_cache, gfp, n, skbs);
>                 if (unlikely(m == 0)) {
>                         for (i = 0; i < n; i++)
>

  reply	other threads:[~2019-04-10 23:35 UTC|newest]

Thread overview: 21+ messages
2019-04-10 11:43 [PATCH bpf-next 0/5] Bulk optimization for XDP cpumap redirect Jesper Dangaard Brouer
2019-04-10 11:43 ` [PATCH bpf-next 1/5] bpf: cpumap use ptr_ring_consume_batched Jesper Dangaard Brouer
2019-04-10 23:24   ` Song Liu
2019-04-11 11:23     ` Jesper Dangaard Brouer
2019-04-11 17:38       ` Song Liu
2019-04-10 11:43 ` [PATCH bpf-next 2/5] bpf: cpumap use netif_receive_skb_list Jesper Dangaard Brouer
2019-04-10 18:56   ` Edward Cree
2019-04-10 11:43 ` [PATCH bpf-next 3/5] net: core: introduce build_skb_around Jesper Dangaard Brouer
2019-04-10 23:34   ` Song Liu
2019-04-11 15:39     ` Jesper Dangaard Brouer
2019-04-11 17:43       ` Song Liu
2019-04-11  5:33   ` Ilias Apalodimas
2019-04-11 11:17     ` Jesper Dangaard Brouer
2019-04-10 11:43 ` [PATCH bpf-next 4/5] bpf: cpumap do bulk allocation of SKBs Jesper Dangaard Brouer
2019-04-10 23:30   ` Song Liu
2019-04-10 11:43 ` [PATCH bpf-next 5/5] bpf: cpumap memory prefetchw optimizations for struct page Jesper Dangaard Brouer
2019-04-10 23:35   ` Song Liu [this message]
2019-04-11  5:47   ` Ilias Apalodimas
2019-04-10 23:36 ` [PATCH bpf-next 0/5] Bulk optimization for XDP cpumap redirect Song Liu
2019-04-11 13:18   ` Jesper Dangaard Brouer
2019-04-11 17:45     ` Song Liu
