6.1-rt: NOHZ tick-stop error: local softirq work is pending


* 6.1-rt: NOHZ tick-stop error: local softirq work is pending
@ 2025-02-21  9:32 Bezdeka, Florian
  2025-02-24 11:55 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Bezdeka, Florian @ 2025-02-21  9:32 UTC
  To: linux-rt-users@vger.kernel.org
  Cc: Ziegler, Andreas, Kiszka, Jan, MOESSBAUER, Felix

Hi all,

when stressing a 6.1-rt based system with network load we can
immediately see the following in the system log:

[  165.260690] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
[  165.264689] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
[  165.268687] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
...

or (from a different system)

[  227.230611] NOHZ tick-stop error: local softirq work is pending, handler #08!!!
[  227.231894] NOHZ tick-stop error: local softirq work is pending, handler #88!!!
[  227.232218] NOHZ tick-stop error: local softirq work is pending, handler #88!!!
...

handler #80 means SCHED_SOFTIRQ,
handler #08 means NET_RX_SOFTIRQ
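
The handler value appears to be the per-CPU pending-softirq bitmask
printed in hex, so each set bit names one softirq vector. A minimal
Python sketch of that decoding, assuming the vector order from
include/linux/interrupt.h (HI=0 ... RCU=9); decode_pending() is a
hypothetical helper, not a kernel API:

```python
# Softirq vector names in index order, as assumed from
# include/linux/interrupt.h. Verify against your own kernel tree.
SOFTIRQ_NAMES = [
    "HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
    "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU",
]


def decode_pending(mask: int) -> list[str]:
    """Return the names of all softirqs whose bit is set in mask."""
    return [
        f"{name}_SOFTIRQ"
        for bit, name in enumerate(SOFTIRQ_NAMES)
        if mask & (1 << bit)
    ]


# The values are taken from the log lines above and parsed as hex.
for raw in ("80", "08", "88"):
    print(f"handler #{raw}: {', '.join(decode_pending(int(raw, 16)))}")
```

Under that assumption, #88 from the second system is NET_RX_SOFTIRQ and
SCHED_SOFTIRQ pending at the same time.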

It seems that

96c1fa04f089 ("tick/rcu: Fix false positive "softirq work is pending" messages")

tried to fix this issue, but for some reason it does not work.

Is that something that is really allowed to happen on RT (which means
that one of the conditions for the warning is still wrong) or a real
problem? We did not notice any negative impact on the system so far.

Input welcome...

The warning is raised by report_idle_softirq() in
kernel/time/tick-sched.c.

Best regards,
Florian

-- 
Siemens AG, Foundational Technologies
Linux Expert Center


* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-21  9:32 6.1-rt: NOHZ tick-stop error: local softirq work is pending Bezdeka, Florian
@ 2025-02-24 11:55 ` Sebastian Andrzej Siewior
  2025-02-25 15:16   ` Florian Bezdeka
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-02-24 11:55 UTC
  To: Bezdeka, Florian
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas, Kiszka, Jan,
	MOESSBAUER, Felix

On 2025-02-21 09:32:41 [+0000], Bezdeka, Florian wrote:
> Hi all,
Hi,

> when stressing a 6.1-rt based system with network load we can
> immediately see the following in the system log:
> 
> [  165.260690] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> [  165.264689] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> [  165.268687] NOHZ tick-stop error: local softirq work is pending, handler #80!!!

which version is this? I think this is an imported issue. Is v6.1.119-rt45
also affected?

> It seems that
> 
> 96c1fa04f089 ("tick/rcu: Fix false positive "softirq work is pending" messages")
> 
> tried to fix this issue, but for some reason it does not work.
> 
> Is that something that is really allowed to happen on RT (which means
> that one of the conditions for the warning is still wrong) or a real
> problem? We did not notice any negative impact on the system so far.
> 
> Input welcome...

The thing is that this may happen on PREEMPT_RT. Usually because
softirqs can't be run as the CPU is blocked on locks and the lock-owner
is either preempted on another CPU or blocked on something else.
The thing is that a NO_HZ CPU should not go idle if there are softirqs
pending as in "there is work to do, no nap for you". But as I explained
earlier, on PREEMPT_RT it might happen that the work can't be handled
and if no task can be run, the CPU goes to sleep.

That means if you replace NO_HZ with PERIODIC, the warning goes away. If
you start a CPU hog (one per CPU), the warning goes away as well.

> Best regards,
> Florian

Sebastian


* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-24 11:55 ` Sebastian Andrzej Siewior
@ 2025-02-25 15:16   ` Florian Bezdeka
  2025-02-26  9:17     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-02-25 15:16 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas, Kiszka, Jan,
	MOESSBAUER, Felix

On Mon, 2025-02-24 at 12:55 +0100, Sebastian Andrzej Siewior wrote:
> On 2025-02-21 09:32:41 [+0000], Bezdeka, Florian wrote:
> > Hi all,
> Hi,
> 
> > when stressing a 6.1-rt based system with network load we can
> > immediately see the following in the system log:
> > 
> > [  165.260690] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> > [  165.264689] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> > [  165.268687] NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> 
> which version is this? I think this is an imported issue. Is v6.1.119-rt45
> also affected?

This is a typo, right? You mean it is an important issue, no?

We can see that on

- v6.1.90-rt (Debian -rt kernel)
- v6.1.120-rt (Debian -rt kernel)
- v6.1.119-rt45 (So yes, this is also affected)
- v6.1.120-rt47

> 
> > It seems that
> > 
> > 96c1fa04f089 ("tick/rcu: Fix false positive "softirq work is pending" messages")
> > 
> > tried to fix this issue, but for some reason it does not work.
> > 
> > Is that something that is really allowed to happen on RT (which means
> > that one of the conditions for the warning is still wrong) or a real
> > problem? We did not notice any negative impact on the system so far.
> > 
> > Input welcome...
> 
> The thing is that this may happen on PREEMPT_RT. Usually because
> softirqs can't be run as the CPU is blocked on locks and the lock-owner
> is either preempted on another CPU or blocked on something else.
> The thing is that a NO_HZ CPU should not go idle if there are softirqs
> pending as in "there is work to do, no nap for you". But as I explained
> earlier, on PREEMPT_RT it might happen that the work can't be handled
> and if no task can be run, the CPU goes to sleep.
> 
> That means if you replace NO_HZ with PERIODIC, the warning goes away. If
> you start a CPU hog (one per CPU), the warning goes away as well.

With PERIODIC you mean CONFIG_HZ_PERIODIC, right?

We have CONFIG_NO_HZ_FULL=y set but do not set the nohz_full= cmdline
parameter, so that we should get CONFIG_NO_HZ_IDLE behavior at the end.
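
Whether any CPU actually runs in full-dynticks mode can be checked at
runtime. A hedged Python sketch, assuming the kernel exposes
/sys/devices/system/cpu/nohz_full (present with CONFIG_NO_HZ_FULL=y) and
that an unset list reads as empty or "(null)"; parse_cpu_list() and
nohz_full_cpus() are hypothetical helper names:

```python
from pathlib import Path

# Assumed sysfs location of the nohz_full CPU list.
NOHZ_FULL = Path("/sys/devices/system/cpu/nohz_full")


def parse_cpu_list(text: str) -> set[int]:
    """Parse a kernel CPU list such as "1-3,5" into a set of CPU numbers."""
    text = text.strip()
    if not text or text == "(null)":
        return set()
    cpus: set[int] = set()
    for part in text.split(","):
        first, _, last = part.partition("-")
        # A single CPU ("5") has no range end, so reuse the start.
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def nohz_full_cpus(path: Path = NOHZ_FULL) -> set[int]:
    """Return the nohz_full CPUs, or an empty set if the file is absent."""
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return set()


if __name__ == "__main__":
    cpus = nohz_full_cpus()
    print("nohz_full CPUs:", sorted(cpus) if cpus else "none")
```

An empty result would be consistent with the setup described here, where
all CPUs fall back to idle-only tick stopping.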

I realized today that the warning is somehow related to our RT tuning.
Enabling NAPI threading makes the warning go away, even if NAPI threads
are tuned the same way as ksoftirqd.

I will have to look into that in more depth.

Thanks for your input Sebastian.

> 
> > Best regards,
> > Florian
> 
> Sebastian



* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-25 15:16   ` Florian Bezdeka
@ 2025-02-26  9:17     ` Sebastian Andrzej Siewior
  2025-02-26 17:41       ` MOESSBAUER, Felix
  2025-02-26 22:23       ` Florian Bezdeka
  0 siblings, 2 replies; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-02-26  9:17 UTC
  To: Florian Bezdeka
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas, Kiszka, Jan,
	MOESSBAUER, Felix

On 2025-02-25 16:16:25 [+0100], Florian Bezdeka wrote:
> > which version is this? I think this is an imported issue. Is v6.1.119-rt45
> > also affected?
> 
> This is a typo, right? You mean it is an important issue, no?

No, the version is correct. And I meant "imported" as in we got it from
the stable queue.

> We can see that on
> 
> - v6.1.90-rt (Debian -rt kernel)
> - v6.1.120-rt (Debian -rt kernel)
> - v6.1.119-rt45 (So yes, this is also affected)
> - v6.1.120-rt47

But if this is visible on v6.1.90-rt then it is not originating from
what I assumed.

> With PERIODIC you mean CONFIG_HZ_PERIODIC, right?
correct.

> We have CONFIG_NO_HZ_FULL=y set but do not set the nohz_full= cmdline
> parameter, so that we should get CONFIG_NO_HZ_IDLE behavior at the end.
> 
> I realized today that the warning is somehow related to our RT tuning.
> Enabling NAPI threading makes the warning go away, even if NAPI threads
> are tuned the same way as ksoftirqd.
NAPI threads? You have RPS enabled by any chance?
Would commit
    dad6b97702639 ("net: Allow to use SMP threads for backlog NAPI.")
    80d2eefcb4c84 ("net: Use backlog-NAPI to clean up the defer_list.")

help?

> I will have to look into that in more depth.
> 
> Thanks for your input Sebastian.

You are welcome.

Sebastian


* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-26  9:17     ` Sebastian Andrzej Siewior
@ 2025-02-26 17:41       ` MOESSBAUER, Felix
  2025-02-26 22:23       ` Florian Bezdeka
  1 sibling, 0 replies; 9+ messages in thread
From: MOESSBAUER, Felix @ 2025-02-26 17:41 UTC
  To: Bezdeka, Florian, bigeasy@linutronix.de
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas, Kiszka, Jan

On Wed, 2025-02-26 at 10:17 +0100, Sebastian Andrzej Siewior wrote:
> On 2025-02-25 16:16:25 [+0100], Florian Bezdeka wrote:
> > > which version is this? I think this is an imported issue. Is v6.1.119-rt45
> > > also affected?
> > 
> > This is a typo, right? You mean it is an important issue, no?
> 
> No, the version is correct. And I meant "imported" as in we got it from
> the stable queue.
> 
> > We can see that on
> > 
> > - v6.1.90-rt (Debian -rt kernel)
> > - v6.1.120-rt (Debian -rt kernel)
> > - v6.1.119-rt45 (So yes, this is also affected)
> > - v6.1.120-rt47
> 
> But if this is visible on v6.1.90-rt then it is not originating from
> what I assumed.
> 
> > With PERIODIC you mean CONFIG_HZ_PERIODIC, right?
> correct.
> 
> > We have CONFIG_NO_HZ_FULL=y set but do not set the nohz_full= cmdline
> > parameter, so that we should get CONFIG_NO_HZ_IDLE behavior at the end.
> > 
> > I realized today that the warning is somehow related to our RT tuning.
> > Enabling NAPI threading makes the warning go away, even if NAPI threads
> > are tuned the same way as ksoftirqd.
> NAPI threads? You have RPS enabled by any chance?
> Would commit
>     dad6b97702639 ("net: Allow to use SMP threads for backlog NAPI.")
>     80d2eefcb4c84 ("net: Use backlog-NAPI to clean up the defer_list.")

Hi,

I tried a backport of the two patches to 6.1.120-rt47, but for that a
lot of infrastructure needs to be backported as well. In a minimal
setting, I was able to reduce that to the following patches:

80d2eefcb4c84 net: Use backlog-NAPI to clean up the defer_list.
be12a1fe298e8 net: skbuff: add skb_append_pagefrags and use it
dad6b97702639 net: Allow to use SMP threads for backlog NAPI.
87eff2ec57b6d net: optimize napi_threaded_poll() vs RPS/RFS
8fcb76b934daf net: napi_schedule_rps() cleanup
a1aaee7f8f79d net: make napi_threaded_poll() aware of sd->defer_list

This, however, requires CONFIG_PAGE_POOL=n and CONFIG_DEVMEM=n, as the
page_pool_create_percpu parts added in 2b0cfa6e49566 ("net: add generic
percpu page_pool allocator") are not easy to backport.

With these settings we were not able to run the test workload that
reproduces the warning, so I simply can't tell whether it still
reproduces.

Best regards,
Felix

> 
> help?
> 
> > I will have to look into that in more depth.
> > 
> > Thanks for your input Sebastian.
> 
> You are welcome.
> 
> Sebastian

-- 
Siemens AG
Linux Expert Center
Friedrich-Ludwig-Bauer-Str. 3
85748 Garching, Germany


* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-26  9:17     ` Sebastian Andrzej Siewior
  2025-02-26 17:41       ` MOESSBAUER, Felix
@ 2025-02-26 22:23       ` Florian Bezdeka
  2025-02-27 13:43         ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-02-26 22:23 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas, Kiszka, Jan,
	MOESSBAUER, Felix

On Wed, 2025-02-26 at 10:17 +0100, Sebastian Andrzej Siewior wrote:
> On 2025-02-25 16:16:25 [+0100], Florian Bezdeka wrote:
> > > which version is this? I think this is an imported issue. Is v6.1.119-rt45
> > > also affected?
> > 
> > This is a typo, right? You mean it is an important issue, no?
> 
> No, the version is correct. And I meant "imported" as in we got it from
> the stable queue.
> 
> > We can see that on
> > 
> > - v6.1.90-rt (Debian -rt kernel)
> > - v6.1.120-rt (Debian -rt kernel)
> > - v6.1.119-rt45 (So yes, this is also affected)
> > - v6.1.120-rt47
> 
> But if this is visible on v6.1.90-rt then it is not originating from
> what I assumed.
> 
> > With PERIODIC you mean CONFIG_HZ_PERIODIC, right?
> correct.
> 
> > We have CONFIG_NO_HZ_FULL=y set but do not set the nohz_full= cmdline
> > parameter, so that we should get CONFIG_NO_HZ_IDLE behavior at the end.
> > 
> > I realized today that the warning is somehow related to our RT tuning.
> > Enabling NAPI threading makes the warning go away, even if NAPI threads
> > are tuned the same way as ksoftirqd.
> NAPI threads? You have RPS enabled by any chance?
> Would commit
>     dad6b97702639 ("net: Allow to use SMP threads for backlog NAPI.")
>     80d2eefcb4c84 ("net: Use backlog-NAPI to clean up the defer_list.")

With NAPI threads I meant the kernel threads driving the NAPI poll
which can be activated by "echo 1 > /sys/class/net/$device/threaded"
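
The same switch can be scripted. A small Python sketch, assuming the
per-device sysfs attribute /sys/class/net/<device>/threaded named above;
the helper names and the device name "eth0" are hypothetical, and writing
the attribute needs root:

```python
from pathlib import Path

# Assumed sysfs root for network devices.
SYSFS_NET = Path("/sys/class/net")


def threaded_path(device: str, sysfs: Path = SYSFS_NET) -> Path:
    """Return the path of the 'threaded' attribute for a device."""
    return sysfs / device / "threaded"


def napi_threaded(device: str, sysfs: Path = SYSFS_NET) -> bool:
    """Report whether threaded NAPI is currently enabled for a device."""
    return threaded_path(device, sysfs).read_text().strip() != "0"


def set_napi_threaded(device: str, enable: bool, sysfs: Path = SYSFS_NET) -> None:
    """Enable or disable threaded NAPI, like 'echo 1 > .../threaded'."""
    threaded_path(device, sysfs).write_text("1" if enable else "0")


if __name__ == "__main__":
    dev = "eth0"  # hypothetical device name
    try:
        print(f"{dev}: threaded NAPI enabled = {napi_threaded(dev)}")
    except OSError as exc:
        print(f"{dev}: cannot read attribute: {exc}")
```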

RPS is enabled in the build configuration but not configured.

As Felix already reported, backporting the mentioned commits to 6.1
would require some manual work. Simply picking all the dependencies
seems too much for now. Will have a look again...

> 
> help?
> 
> > I will have to look into that in more depth.
> > 
> > Thanks for your input Sebastian.
> 
> You are welcome.
> 
> Sebastian



* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-26 22:23       ` Florian Bezdeka
@ 2025-02-27 13:43         ` Sebastian Andrzej Siewior
  2025-02-28 16:28           ` Florian Bezdeka
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-02-27 13:43 UTC
  To: Florian Bezdeka
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas, Kiszka, Jan,
	MOESSBAUER, Felix

On 2025-02-26 23:23:30 [+0100], Florian Bezdeka wrote:
> > NAPI threads? You have RPS enabled by any chance?
> > Would commit
> >     dad6b97702639 ("net: Allow to use SMP threads for backlog NAPI.")
> >     80d2eefcb4c84 ("net: Use backlog-NAPI to clean up the defer_list.")
> 
> With NAPI threads I meant the kernel threads driving the NAPI poll
> which can be activated by "echo 1 > /sys/class/net/$device/threaded"
> 
> RPS is enabled in the build configuration but not configured.

If RPS is disabled then this might be something else.

> As Felix already reported, backporting the mentioned commits to 6.1
> would require some manual work. Simply picking all the dependencies
> seems too much for now. Will have a look again...

Can you reproduce this on a later (more recent) kernel? What are the
steps to reproduce this?

Sebastian


* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-27 13:43         ` Sebastian Andrzej Siewior
@ 2025-02-28 16:28           ` Florian Bezdeka
  2025-02-28 16:41             ` Jan Kiszka
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Bezdeka @ 2025-02-28 16:28 UTC
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas, Kiszka, Jan,
	MOESSBAUER, Felix

On Thu, 2025-02-27 at 14:43 +0100, Sebastian Andrzej Siewior wrote:
> On 2025-02-26 23:23:30 [+0100], Florian Bezdeka wrote:
> > > NAPI threads? You have RPS enabled by any chance?
> > > Would commit
> > >     dad6b97702639 ("net: Allow to use SMP threads for backlog NAPI.")
> > >     80d2eefcb4c84 ("net: Use backlog-NAPI to clean up the defer_list.")
> > 
> > With NAPI threads I meant the kernel threads driving the NAPI poll
> > which can be activated by "echo 1 > /sys/class/net/$device/threaded"
> > 
> > RPS is enabled in the build configuration but not configured.
> 
> If RPS is disabled then this might be something else.
> 
> > As Felix already reported, backporting the mentioned commits to 6.1
> > would require some manual work. Simply picking all the dependencies
> > seems too much for now. Will have a look again...
> 
> Can you reproduce this on a later (more recent) kernel? What are the
> steps to reproduce this?

I haven't tried that yet, but after some tracing I have a much better
idea what's going on:

Our RT tuning sets ksoftirqd threads to FIFO, which seems not the best
idea. I have to ask some colleagues why they thought this is necessary.
As those threads are driving the NAPI poll in my case they might run to
infinity and with that into the RT throttling - forcing CPUs to idle
with softirqs pending.
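
RT throttling is controlled by sched_rt_runtime_us and
sched_rt_period_us under /proc/sys/kernel. A hedged Python sketch of the
arithmetic, with hypothetical helper names: with the usual defaults of
950000 and 1000000, a FIFO thread that never sleeps would be throttled
for the remaining 50 ms of every second, which is the window in which
the CPU can go idle with softirqs still pending.

```python
from pathlib import Path

# Location of the RT throttling sysctls.
PROC_KERNEL = Path("/proc/sys/kernel")


def throttled_us(runtime_us: int, period_us: int) -> int:
    """Microseconds per period in which RT tasks are throttled."""
    if runtime_us < 0:  # -1 disables RT throttling entirely
        return 0
    return max(period_us - runtime_us, 0)


def read_rt_limits(proc: Path = PROC_KERNEL) -> tuple[int, int]:
    """Read (sched_rt_runtime_us, sched_rt_period_us) from procfs."""
    runtime = int((proc / "sched_rt_runtime_us").read_text())
    period = int((proc / "sched_rt_period_us").read_text())
    return runtime, period


if __name__ == "__main__":
    try:
        runtime, period = read_rt_limits()
        gap = throttled_us(runtime, period)
        print(f"RT tasks may be throttled {gap} us per {period} us period")
    except OSError as exc:
        print(f"cannot read RT throttling sysctls: {exc}")
```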

Does that make sense to you as well?

I can force the system into the same situation with threaded NAPI
enabled and assigning FIFO to the NAPI threads.
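
To see which of these threads currently run with an RT policy, their
scheduling class can be listed from /proc. A rough Python sketch,
assuming the kernel threads are named "ksoftirqd/<cpu>" and
"napi/<dev>-<id>"; both helper names are hypothetical:

```python
import os
from collections.abc import Iterator
from pathlib import Path

# Human-readable names for the policies of interest (Linux only).
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}


def is_softirq_or_napi_thread(comm: str) -> bool:
    """Match ksoftirqd and threaded-NAPI kernel threads by name."""
    return comm.startswith(("ksoftirqd/", "napi/"))


def rx_thread_policies() -> Iterator[tuple[str, str]]:
    """Yield (thread name, policy name) for matching kernel threads."""
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue
        try:
            comm = (entry / "comm").read_text().strip()
            if is_softirq_or_napi_thread(comm):
                policy = os.sched_getscheduler(int(entry.name))
                yield comm, POLICY_NAMES.get(policy, str(policy))
        except OSError:
            continue  # the task exited or is not accessible


if __name__ == "__main__":
    for comm, policy in sorted(rx_thread_policies()):
        print(f"{comm:20s} {policy}")
```

Any SCHED_FIFO entry in that list would be a candidate for the
throttling scenario described above.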

FIFO for threads driving network RX traffic doesn't seem wise. A DDoS
should be quite easy...

Florian

> 
> Sebastian


* Re: 6.1-rt: NOHZ tick-stop error: local softirq work is pending
  2025-02-28 16:28           ` Florian Bezdeka
@ 2025-02-28 16:41             ` Jan Kiszka
  0 siblings, 0 replies; 9+ messages in thread
From: Jan Kiszka @ 2025-02-28 16:41 UTC
  To: Florian Bezdeka, Sebastian Andrzej Siewior
  Cc: linux-rt-users@vger.kernel.org, Ziegler, Andreas,
	MOESSBAUER, Felix

On 28.02.25 17:28, Florian Bezdeka wrote:
> On Thu, 2025-02-27 at 14:43 +0100, Sebastian Andrzej Siewior wrote:
>> On 2025-02-26 23:23:30 [+0100], Florian Bezdeka wrote:
>>>> NAPI threads? You have RPS enabled by any chance?
>>>> Would commit
>>>>     dad6b97702639 ("net: Allow to use SMP threads for backlog NAPI.")
>>>>     80d2eefcb4c84 ("net: Use backlog-NAPI to clean up the defer_list.")
>>>
>>> With NAPI threads I meant the kernel threads driving the NAPI poll
>>> which can be activated by "echo 1 > /sys/class/net/$device/threaded"
>>>
>>> RPS is enabled in the build configuration but not configured.
>>
>> If RPS is disabled then this might be something else.
>>
>>> As Felix already reported, backporting the mentioned commits to 6.1
>>> would require some manual work. Simply picking all the dependencies
>>> seems too much for now. Will have a look again...
>>
>> Can you reproduce this on a later (more recent) kernel? What are the
>> steps to reproduce this?
> 
> I haven't tried that yet, but after some tracing I have a much better
> idea what's going on:
> 
> Our RT tuning sets ksoftirqd threads to FIFO, which seems not the best
> idea. I have to ask some colleagues why they thought this is necessary.

There are some remaining AF_PACKET workloads with RT needs, and NAPI
threading was probably not available yet when this was first configured.
That was at least one reason I heard.

Jan

> As those threads are driving the NAPI poll in my case they might run to
> infinity and with that into the RT throttling - forcing CPUs to idle
> with softirqs pending.
> 
> Does that make sense to you as well?
> 
> I can force the system into the same situation with threaded NAPI
> enabled and assigning FIFO to the NAPI threads.
> 
> FIFO for threads driving network RX traffic doesn't seem wise. A DDoS
> should be quite easy...
> 
> Florian
> 
>>
>> Sebastian
> 

-- 
Siemens AG, Foundational Technologies
Linux Expert Center

