Re: [RFC PATCH net-next 1/2] net: Use SMP threads for backlog NAPI.

From: Ferenc Fejes <primalgamer@gmail.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Paolo Abeni <pabeni@redhat.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Wander Lairson Costa <wander@redhat.com>
Subject: Re: [RFC PATCH net-next 1/2] net: Use SMP threads for backlog NAPI.
Date: Thu, 21 Sep 2023 12:41:33 +0200
Message-ID: <2eb9af65d098bb54ed54178d7269e7197d6de5a0.camel@gmail.com>
In-Reply-To: <20230920155754.KzYGXMWh@linutronix.de>

Hi!

On Wed, 2023-09-20 at 17:57 +0200, Sebastian Andrzej Siewior wrote:
> On 2023-08-23 15:35:41 [+0200], Paolo Abeni wrote:
> > On Mon, 2023-08-14 at 11:35 +0200, Sebastian Andrzej Siewior wrote:
> > > @@ -4781,7 +4733,7 @@ static int enqueue_to_backlog(struct sk_buff *skb, int cpu,
> > >  		 * We can use non atomic operation since we own the queue lock
> > >  		 */
> > >  		if (!__test_and_set_bit(NAPI_STATE_SCHED, &sd->backlog.state))
> > > -			napi_schedule_rps(sd);
> > > +			__napi_schedule_irqoff(&sd->backlog);
> > >  		goto enqueue;
> > >  	}
> > >  	reason = SKB_DROP_REASON_CPU_BACKLOG;
> > 
> > I *think* that the above could be quite dangerous when cpu ==
> > smp_processor_id() - that is, with plain veth usage.
> > 
> > Currently, each packet runs into the rx path just after
> > enqueue_to_backlog()/tx completes.
> > 
> > With this patch there will be a burst effect, where the backlog thread
> > will run only after several packets have been enqueued, whenever the
> > process scheduler decides - note that the current CPU is already
> > hosting a running process, the tx thread.
> > > 
> > The above can cause packet drops (due to limited buffering) or very
> > high latency (due to long bursts), even in non-overload situations,
> > and both are quite hard to debug.
> > > 
> > I think the above needs to be an opt-in, but I guess that even RT
> > deployments doing some packet forwarding will not be happy with this
> > on.
> 
> I've been looking at this again and have been thinking about what you
> said here. I think part of the problem is that we lack a
> policy/mechanism for when a DoS is happening and for what to do then.
> 
> Before commit d15121be74856 ("Revert "softirq: Let ksoftirqd do its
> job""), when a lot of network packets were processed, the processing
> was moved to ksoftirqd and continued based on how the scheduler
> scheduled the SCHED_OTHER ksoftirqd task. This avoided lock-ups of the
> system, which could do something else in between. Any interrupt would
> not continue the outstanding softirq backlog but wait for ksoftirqd.
> So it basically avoided the networking overload. It throttled the
> throughput if needed.
> 
> This isn't the case after that commit. Now, the CPU can be stuck
> processing networking packets if the packets come in fast enough. Even
> if ksoftirqd is woken up, the next interrupt (say the timer) will
> continue with at least one round.
> By using NAPI threads it is possible to give control back to the
> scheduler, which can throttle the NAPI processing in favour of other
> threads that ask for CPU. As you pointed out, waking the thread does
> not guarantee that it will immediately do the NAPI work. It can be
> delayed based on the current load on the system.
> 
> This could be influenced by assigning the NAPI thread a SCHED_FIFO
> priority. Based on the priority it could be ensured that the thread
> starts right away or "later" if something else is more important.
> However, this opens the DoS window again: The scheduler will keep the
> NAPI thread on the CPU as long as it asks for it, with no throttling.
> 
> If we could somehow define a DoS condition once we are overwhelmed
> with packets, then we could act on it and throttle it. This in turn
> would allow a SCHED_FIFO priority without the fear of a lockup if the
> system is flooded with packets.

Can this be avoided if we reuse gro_flush_timeout as the maximum time
the NAPI thread can be scheduled?
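
To make the idea a bit more concrete, here is a minimal userspace sketch
of a time-budgeted poll loop. It is not kernel code and not taken from
the patch: poll_with_time_budget(), struct poll_result and the fake
clock are hypothetical names made up for illustration, and budget_ns
only stands in for a value such as gro_flush_timeout (which is
expressed in nanoseconds). The loop handles packets until either the
queue is empty or the budget has elapsed, and reports whether it had to
stop early so the caller could yield the CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock callback, so the loop can be driven by a fake clock. */
typedef uint64_t (*clock_ns_fn)(void *ctx);

struct poll_result {
	size_t processed;	/* packets handled in this run */
	bool throttled;		/* true if the budget ran out with work left */
};

/*
 * Handle up to `pending` packets, but stop once `budget_ns` nanoseconds
 * have elapsed since entry. A caller would yield the CPU when
 * `throttled` is set and come back for the rest later.
 */
static struct poll_result poll_with_time_budget(size_t pending,
						uint64_t budget_ns,
						clock_ns_fn now, void *ctx)
{
	struct poll_result res = { 0, false };
	uint64_t start;

	assert(now != NULL);
	start = now(ctx);

	while (res.processed < pending) {
		if (now(ctx) - start >= budget_ns) {
			res.throttled = true;
			break;
		}
		res.processed++;	/* stand-in for handling one packet */
	}
	return res;
}

/* Fake clock for demonstration: advances by a fixed step on every read. */
struct fake_clock {
	uint64_t now_ns;
	uint64_t step_ns;
};

static uint64_t fake_clock_read(void *ctx)
{
	struct fake_clock *c = ctx;
	uint64_t t = c->now_ns;

	c->now_ns += c->step_ns;
	return t;
}
```

With a fake clock that advances 10 ns per read and a 50 ns budget, a
queue of 100 packets is cut off after 4 packets with throttled set,
while a queue of 3 packets drains completely. Whether a fixed time
budget is the right DoS condition is exactly the open question here.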

> 
> > Cheers,
> > 
> > Paolo
> 
> Sebastian

Ferenc

Thread overview: 26+ messages
2023-08-14  9:35 [RFC PATCH net-next 0/2] net: Use SMP threads for backlog NAPI Sebastian Andrzej Siewior
2023-08-14  9:35 ` [RFC PATCH net-next 1/2] " Sebastian Andrzej Siewior
2023-08-21  8:32   ` kernel test robot
2023-08-23 13:35   ` Paolo Abeni
2023-09-20 15:57     ` Sebastian Andrzej Siewior
2023-09-21 10:41       ` Ferenc Fejes [this message]
2023-09-22  7:26         ` Sebastian Andrzej Siewior
2023-09-22  9:38       ` Paolo Abeni
2023-08-14  9:35 ` [RFC PATCH 2/2] softirq: Drop the warning from do_softirq_post_smp_call_flush() Sebastian Andrzej Siewior
2023-08-15 12:08   ` Jesper Dangaard Brouer
2023-08-15 22:31     ` Yan Zhai
2023-08-16 14:48     ` Jesper Dangaard Brouer
2023-08-16 15:15       ` Yan Zhai
2023-08-16 21:02         ` Jesper Dangaard Brouer
2023-08-18 15:49           ` Yan Zhai
2023-08-16 15:22       ` Sebastian Andrzej Siewior
2023-08-14 18:24 ` [RFC PATCH net-next 0/2] net: Use SMP threads for backlog NAPI Jakub Kicinski
2023-08-17 13:16   ` Sebastian Andrzej Siewior
2023-08-17 15:30     ` Jakub Kicinski
2023-08-18  9:03       ` Sebastian Andrzej Siewior
2023-08-18 14:43     ` Yan Zhai
2023-08-18 14:57       ` Sebastian Andrzej Siewior
2023-08-18 16:21         ` Jakub Kicinski
2023-08-18 16:40           ` Eric Dumazet
2023-08-23  6:57           ` Sebastian Andrzej Siewior
2023-08-18 16:56         ` Yan Zhai
