From: Jesper Dangaard Brouer <brouer@redhat.com>
To: "xdp-newbies@vger.kernel.org" <xdp-newbies@vger.kernel.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Cc: brouer@redhat.com, "Christoph Hellwig" <hch@lst.de>,
"David Woodhouse" <dwmw2@infradead.org>,
"William Tu" <u9012063@gmail.com>,
"Björn Töpel" <bjorn.topel@intel.com>,
"Karlsson, Magnus" <magnus.karlsson@intel.com>,
"Alexander Duyck" <alexander.duyck@gmail.com>,
"Arnaldo Carvalho de Melo" <acme@redhat.com>
Subject: XDP performance regression due to CONFIG_RETPOLINE Spectre V2
Date: Thu, 12 Apr 2018 15:50:29 +0200
Message-ID: <20180412155029.0324fe58@redhat.com>
Heads-up XDP performance nerds!
I got an unpleasant surprise when I updated my GCC compiler (to support
the option -mindirect-branch=thunk-extern). My XDP redirect
performance numbers were cut in half, from approx 13Mpps to 6Mpps
(single CPU core). I've identified the issue, which is caused by
kernel CONFIG_RETPOLINE, which only takes effect when the GCC compiler
supports it. This is the mitigation for Spectre variant 2 (CVE-2017-5715),
related to indirect (function call) branches.
XDP_REDIRECT itself only has two primary (per packet) indirect
function calls, ndo_xdp_xmit and invoking bpf_prog, plus any
map_lookup_elem calls in the bpf_prog. I implemented a PoC of bulking for
ndo_xdp_xmit, which helped, but not enough. The real root cause is all
the DMA API calls, which use function pointers extensively.
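
To make the "function pointers" point concrete, here is a minimal
user-space sketch (not kernel code). All toy_* names are hypothetical
stand-ins, not the kernel's structs or API. It only counts how many
calls per packet go through a function pointer; in a retpoline build,
each such call would be routed through an indirect thunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-device DMA ops table. */
struct toy_dma_ops {
	uint64_t (*map_page)(uint64_t phys, size_t offset);
	void (*sync_for_cpu)(uint64_t dma_addr, size_t len);
};

/* Counts calls made through a function pointer. */
unsigned long toy_indirect_calls;

/* Identity mapping, as assumed here for a direct-mapped device. */
uint64_t toy_map_page(uint64_t phys, size_t offset)
{
	return phys + offset;
}

/* Sync that does nothing, as when no bounce buffer is in use. */
void toy_sync_nop(uint64_t dma_addr, size_t len)
{
	(void)dma_addr;
	(void)len;
}

const struct toy_dma_ops toy_ops = {
	.map_page = toy_map_page,
	.sync_for_cpu = toy_sync_nop,
};

/* Toy per-packet RX step: two indirect calls for every packet. */
uint64_t toy_rx_one(const struct toy_dma_ops *ops, uint64_t phys,
		    size_t offset, size_t len)
{
	uint64_t dma;

	assert(ops && ops->map_page && ops->sync_for_cpu);
	dma = ops->map_page(phys, offset);	/* indirect call #1 */
	toy_indirect_calls++;
	ops->sync_for_cpu(dma, len);		/* indirect call #2 */
	toy_indirect_calls++;
	return dma;
}
```

Calling toy_rx_one(&toy_ops, ...) once per packet makes the cost model
visible: even when both callbacks are trivial, the call overhead itself
is paid per packet.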
Mitigation plan
---------------
Implement support for keeping the DMA mapping through the XDP return
call, to remove RX map/unmap calls. Implement bulking for XDP
ndo_xdp_xmit and the XDP return frame API. Bulking allows DMA
bulking via scatter-gather DMA calls; XDP TX needs it for DMA
map+unmap. The driver's per-packet RX DMA-sync (to CPU) calls are harder
to mitigate (via the bulk technique). Ask the DMA maintainer for a common-case
direct call for the swiotlb DMA sync call ;-)
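
The bulking idea can be sketched in user-space C as below. This is an
illustration under assumptions, not the proposed kernel API: the toy_*
names, the batch size of 16, and the array-based xmit hook are all
hypothetical. The point is only that one indirect call is paid per
batch instead of per frame.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BULK_MAX 16	/* hypothetical batch size */

struct toy_frame {
	int len;
};

/* Hypothetical driver hook that takes a whole array of frames. */
typedef int (*toy_xmit_fn)(struct toy_frame **frames, int n);

struct toy_bulk_queue {
	struct toy_frame *q[TOY_BULK_MAX];
	int count;
	toy_xmit_fn xmit;
	unsigned long indirect_calls;	/* calls made through xmit */
};

/* Hand all queued frames to the driver with a single indirect call. */
void toy_bulk_flush(struct toy_bulk_queue *bq)
{
	if (bq->count == 0)
		return;
	bq->xmit(bq->q, bq->count);
	bq->indirect_calls++;
	bq->count = 0;
}

/* Queue one frame; flush automatically when the batch is full. */
void toy_bulk_enqueue(struct toy_bulk_queue *bq, struct toy_frame *f)
{
	assert(bq->count < TOY_BULK_MAX);
	bq->q[bq->count++] = f;
	if (bq->count == TOY_BULK_MAX)
		toy_bulk_flush(bq);
}

/* Dummy driver hook: pretends every frame was sent. */
int toy_xmit_count_only(struct toy_frame **frames, int n)
{
	(void)frames;
	return n;
}
```

Enqueuing 40 frames and then flushing once results in three calls
through the function pointer (16 + 16 + 8 frames) rather than 40.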
Root-cause verification
-----------------------
I have verified that indirect DMA calls are the root cause, by
removing the DMA sync calls from the code (as they do nothing for
swiotlb), and manually inlining the DMA map calls (basically calling
phys_to_dma(dev, page_to_phys(page)) + offset). For my ixgbe test,
performance "returned" to 11Mpps.
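
The manually inlined mapping has roughly the following shape, shown
here as a user-space sketch. The toy_* helpers only mimic the kernel
functions named above and are hypothetical; the 4 KiB page size and the
per-device dma_offset field are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumes 4 KiB pages */

struct toy_page {
	uint64_t pfn;		/* page frame number */
};

struct toy_device {
	uint64_t dma_offset;	/* hypothetical; 0 means identity mapping */
};

/* Mimics page_to_phys(): physical address of the start of the page. */
static inline uint64_t toy_page_to_phys(const struct toy_page *page)
{
	return page->pfn << TOY_PAGE_SHIFT;
}

/* Mimics phys_to_dma() for a direct-mapped device. */
static inline uint64_t toy_phys_to_dma(const struct toy_device *dev,
					uint64_t phys)
{
	return phys + dev->dma_offset;
}

/* Direct map: plain arithmetic the compiler can inline, with no
 * function pointer involved. */
static inline uint64_t toy_dma_map_direct(const struct toy_device *dev,
					   const struct toy_page *page,
					   size_t offset)
{
	assert(offset < ((size_t)1 << TOY_PAGE_SHIFT));
	return toy_phys_to_dma(dev, toy_page_to_phys(page)) + offset;
}
```

Because every helper is a static inline doing simple arithmetic, the
whole mapping can compile down to a shift and an add, which is why
bypassing the ops table avoids the retpoline cost in this case.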
Perf reports
------------
It is not easy to diagnose via the perf event tool. I'm coordinating with
ACME to make it easier to pinpoint the hotspots. Look out for symbols:
__x86_indirect_thunk_r10, __indirect_thunk_start, __x86_indirect_thunk_rdx
etc. Be aware that they might not be super high in perf top, but they
stop CPU speculation. Thus, instead use perf-stat and look at the
negative effect on 'insn per cycle'.
If you want to understand retpoline at the ASM level, read this:
https://support.google.com/faqs/answer/7625886
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer