From: "Björn Töpel" <bjorn.topel@intel.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "Jesper Dangaard Brouer" <brouer@redhat.com>,
    "Björn Töpel" <bjorn.topel@gmail.com>,
    "Eric Dumazet" <eric.dumazet@gmail.com>,
    ast@kernel.org, daniel@iogearbox.net, netdev@vger.kernel.org,
    bpf@vger.kernel.org, magnus.karlsson@intel.com,
    davem@davemloft.net, john.fastabend@gmail.com,
    intel-wired-lan@lists.osuosl.org
Subject: Re: [PATCH bpf-next 0/6] xsk: exit NAPI loop when AF_XDP Rx ring is full
Date: Mon, 7 Sep 2020 15:37:40 +0200
Message-ID: <1d2e781e-b26d-4cf0-0178-25b8835dbe26@intel.com>
In-Reply-To: <20200904165837.16d8ecfd@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On 2020-09-05 01:58, Jakub Kicinski wrote:
> On Fri, 4 Sep 2020 16:32:56 +0200 Björn Töpel wrote:
>> On 2020-09-04 16:27, Jesper Dangaard Brouer wrote:
>>> On Fri, 4 Sep 2020 15:53:25 +0200
>>> Björn Töpel <bjorn.topel@gmail.com> wrote:
>>>
>>>> On my machine the "one core scenario Rx drop" performance went from
>>>> ~65Kpps to 21Mpps. In other words, from "not usable" to
>>>> "usable". YMMV.
>>>
>>> We have observed this kind of dropping off an edge before with softirq
>>> (when userspace process runs on same RX-CPU), but I thought that Eric
>>> Dumazet solved it in 4cd13c21b207 ("softirq: Let ksoftirqd do its
>>> job").
>>>
>>> I wonder what makes AF_XDP different or if the problem have come back?
>>>
>>
>> I would say this is not the same issue. The problem is that the softirq
>> is busy dropping packets since the AF_XDP Rx is full. So, the cycles
>> *are* split 50/50, which is not what we want in this case. :-)
>>
>> This issue is more of a "Intel AF_XDP ZC drivers does stupid work", than
>> fairness. If the Rx ring is full, then there is really no use to let the
>> NAPI loop continue.
>>
>> Would you agree, or am I rambling? :-P
>
> I wonder if ksoftirqd never kicks in because we are able to discard
> the entire ring before we run out of softirq "slice".
>

This is exactly what's happening, so we're entering a "busy poll like"
behavior: syscall, softirq/NAPI on return from the syscall, userland.

>
> I've been pondering the exact problem you're solving with Maciej
> recently. The efficiency of AF_XDP on one core with the NAPI processing.
>
> Your solution (even though it admittedly helps, and is quite simple)
> still has the application potentially not able to process packets
> until the queue fills up. This will be bad for latency.
>
> Why don't we move closer to application polling? Never re-arm the NAPI
> after RX, let the application ask for packets, re-arm if 0 polled.
> You'd get max batching, min latency.
>
> Who's the rambling one now? :-D
>

:-D No, these are all very good ideas! We actually experimented with
this in the busy-poll series a while back -- NAPI busy-polling does
exactly "application polling".

However, I wonder whether busy-polling would perform better than the
scenario above (i.e. when ksoftirqd never kicks in)? That is, executing
the NAPI poll *explicitly* in the syscall versus implicitly from the
softirq.

Hmm, thinking out loud here. A simple(r) patch enabling busy poll:
export the napi_id to the AF_XDP socket (xdp->rxq->napi_id to
sk->sk_napi_id), and do the sk_busy_poll_loop() in sendmsg.

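For reference, the "re-arm if 0 polled" idea quoted above can be
modelled in a few lines of userspace C. This is only a toy sketch under
my own assumptions: struct napi_model, irq_fire() and app_poll() are
made-up names for illustration and are not kernel APIs, and the real
NAPI state machine is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; every name here is hypothetical, not a kernel API. */
struct napi_model {
	bool irq_armed;          /* may the device interrupt us? */
	unsigned int hw_pending; /* packets waiting in the hardware queue */
};

/*
 * Interrupt path: fires only while armed, and leaves the interrupt
 * disarmed afterwards so that further Rx work is application driven.
 */
static bool irq_fire(struct napi_model *n)
{
	if (!n->irq_armed)
		return false;
	n->irq_armed = false;
	return true;
}

/*
 * Application-driven poll (think: called from a syscall). Takes up to
 * 'budget' packets and re-arms the interrupt only if nothing was found.
 */
static unsigned int app_poll(struct napi_model *n, unsigned int budget)
{
	unsigned int done;

	assert(budget > 0);
	done = n->hw_pending < budget ? n->hw_pending : budget;
	n->hw_pending -= done;
	n->irq_armed = (done == 0);
	return done;
}
```

In this model, as long as app_poll() keeps returning packets the
interrupt stays off and the application decides when Rx work happens;
only an empty poll hands control back to the interrupt.
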
Or did you have something completely different in mind?

As for this patch set, I think it would make sense to pull it in, since
it makes the single-core scenario *much* better and is pretty simple.
Then do the application polling as a separate, potential improvement
series.

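To make the "exit the NAPI loop when the Rx ring is full" behavior
concrete, here is a toy userspace model. It is a sketch under stated
assumptions, not the driver code from this series: RING_SIZE,
NAPI_BUDGET, struct rx_ring, struct poll_stats and napi_poll_model()
are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not driver code; all names are hypothetical. */
#define RING_SIZE   8u  /* capacity of the modelled AF_XDP Rx ring */
#define NAPI_BUDGET 64u /* descriptors one poll call may process */

struct rx_ring {
	unsigned int used; /* entries the application has not consumed */
};

struct poll_stats {
	unsigned int delivered; /* packets placed on the Rx ring */
	unsigned int dropped;   /* packets discarded, ring was full */
	unsigned int work;      /* descriptors touched in this poll */
};

/* Returns true if the packet fit into the ring. */
static bool ring_produce(struct rx_ring *r)
{
	assert(r->used <= RING_SIZE);
	if (r->used == RING_SIZE)
		return false;
	r->used++;
	return true;
}

/*
 * One poll call over 'pending' hardware descriptors. With
 * exit_when_full set, the loop stops as soon as the ring is full and
 * leaves the remaining packets in the hardware queue. Without it, the
 * loop spends the rest of its budget dropping packets.
 */
static struct poll_stats napi_poll_model(struct rx_ring *r,
					 unsigned int pending,
					 bool exit_when_full)
{
	struct poll_stats s = { 0, 0, 0 };

	while (s.work < NAPI_BUDGET && pending > 0) {
		if (exit_when_full && r->used == RING_SIZE)
			break;
		s.work++;
		pending--;
		if (ring_produce(r))
			s.delivered++;
		else
			s.dropped++;
	}
	return s;
}
```

With an empty ring and 100 pending packets, the drop-everything variant
touches 64 descriptors to deliver 8 packets and drops the other 56,
while the early-exit variant touches 8 descriptors and drops nothing.
Those saved cycles are what the application gets back in the one-core
case.
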
Thoughts? Thanks a lot for the feedback!

Björn