From: Paolo Abeni <pabeni@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jiri Pirko <jiri@mellanox.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Alexei Starovoitov <ast@plumgrid.com>,
Alexander Duyck <aduyck@mirantis.com>,
Tom Herbert <tom@herbertland.com>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>, Rik van Riel <riel@redhat.com>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] net: threadable napi poll loop
Date: Tue, 10 May 2016 22:22:50 +0200
Message-ID: <1462911770.5333.11.camel@redhat.com>
In-Reply-To: <1462896539.23934.93.camel@edumazet-glaptop3.roam.corp.google.com>
On Tue, 2016-05-10 at 09:08 -0700, Eric Dumazet wrote:
> On Tue, 2016-05-10 at 18:03 +0200, Paolo Abeni wrote:
>
> > If a single core host is under network flood, i.e. ksoftirqd is
> > scheduled and it eventually (after processing ~640 packets) will let the
> > user space process run. The latter will execute a syscall to receive a
> > packet, which will have to disable/enable bh at least once and that will
> > cause the processing of another ~640 packets. To receive a single packet
> > in user space, the kernel has to process more than one thousand packets.
>
> Looks you found the bug then. Have you tried to fix it ?
The core functionality is implemented in ~100 lines of code; is that
the kind of bloat that concerns you?
That could probably be improved by removing some code duplication, i.e.
by sharing code between napi_thread_wait() and irq_wait_for_interrupt(),
and possibly between napi_threaded_poll() and net_rx_action().
If the additional test inside napi_schedule() is really a concern, it can
be guarded with a static_key.
The ksoftirqd and the local_bh_enable() designs are the root of the
problem; they need to be touched to solve it.
We actually experimented with several different options.
Limiting the amount of work performed by local_bh_enable() somewhat
mitigates the issue, but it adds yet another kernel parameter that is
difficult to tune.
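
As a rough illustration of the arithmetic behind that trade-off, here is a
toy Python model (not kernel code). The ~640 figure is the one quoted earlier
in this thread; the function name, the bh_cap parameter and the value 64 are
made up for the example and do not correspond to any existing kernel tunable.

```python
# Toy model, not kernel code: every name here is hypothetical.
KSOFTIRQD_BATCH = 640  # approximate packets per softirq run, as quoted in the thread


def packets_per_user_receive(bh_toggles=1, bh_cap=None, batch=KSOFTIRQD_BATCH):
    """Packets the kernel processes for each packet delivered to user space.

    One ksoftirqd run completes before the user process gets the CPU; each
    bh disable/enable pair inside the receive syscall then triggers another
    run, optionally limited to bh_cap packets.
    """
    per_toggle = batch if bh_cap is None else min(batch, bh_cap)
    return batch + bh_toggles * per_toggle


if __name__ == "__main__":
    print(packets_per_user_receive())           # no cap on local_bh_enable() work
    print(packets_per_user_receive(bh_cap=64))  # hypothetical cap of 64 packets
```

In this model the cap lowers the cost of one user-space receive from 1280 to
704 packets, but the value that works well depends on the workload, which is
the tuning problem mentioned above.
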
Running the softirq loop exclusively inside ksoftirqd would solve the
issue, but this is a very invasive approach, affecting all other
subsystems.
The above can be restricted to net_rx_action only (i.e. always running
net_rx_action in ksoftirqd context). The related patch isn't
really much simpler than this one and would add at least the same number
of additional tests in the fast path.
Running the napi loop in a thread that can be migrated gives an additional
benefit in the hypervisor/VM scenario, which can't be achieved
with the other approaches.
Would you consider the threaded irq alternative more viable?
Cheers,
Paolo