* [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
@ 2023-08-18 20:07 paul.gortmaker
  2023-08-20 17:23 ` Wen Yang
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: paul.gortmaker @ 2023-08-18 20:07 UTC (permalink / raw)
  To: LKML, linux-rt-users
  Cc: Paul Gortmaker, Wen Yang, Thomas Gleixner, Peter Zijlstra,
	Paul E . McKenney, Frederic Weisbecker

From: Paul Gortmaker <paul.gortmaker@windriver.com>

In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
the new function report_idle_softirq() was created by breaking code out
of the existing can_stop_idle_tick() for kernels v5.18 and newer.

In doing so, the code essentially went from a single conditional:

	if (a && b && c)
		warn();

to three separate early-return conditionals:

	if (!a)
		return;
	if (!b)
		return;
	if (!c)
		return;
	warn();

However, it seems one of the conditionals didn't get a "!" removed.
Compare the instance of local_bh_blocked() in the old code:

-               if (ratelimit < 10 && !local_bh_blocked() &&
-                   (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
-                       pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n",
-                               (unsigned int) local_softirq_pending());
-                       ratelimit++;
-               }

...to the usage in the new (5.18+) code:

+       /* On RT, softirqs handling may be waiting on some lock */
+       if (!local_bh_blocked())
+               return false;

It seems apparent that the "!" should be removed from the new code.
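
To make the inversion concrete, here is a minimal userspace sketch (not kernel
code). The three kernel predicates are modelled as plain booleans, and every
function name below is hypothetical. It only models the local_bh_blocked()
guard; the ratelimit guard is assumed to be correct here. The helper counts how
many of the eight input combinations disagree with the old single conditional.

```c
#include <stdbool.h>

/* Old form: warn only when all three conditions hold at once. */
static bool should_warn_old(bool under_limit, bool bh_blocked, bool pending)
{
	return under_limit && !bh_blocked && pending;
}

/* Split form as merged in 5.18+: the bh_blocked guard kept its "!". */
static bool should_warn_buggy(bool under_limit, bool bh_blocked, bool pending)
{
	if (!pending)
		return false;
	if (!under_limit)
		return false;
	if (!bh_blocked)	/* wrong: bails out when BH is NOT blocked */
		return false;
	return true;
}

/* Split form with the "!" removed, as this patch proposes. */
static bool should_warn_fixed(bool under_limit, bool bh_blocked, bool pending)
{
	if (!pending)
		return false;
	if (!under_limit)
		return false;
	if (bh_blocked)		/* softirq handling is waiting on a lock */
		return false;
	return true;
}

/* Count input combinations where a candidate disagrees with the old form. */
static int count_mismatches(bool (*candidate)(bool, bool, bool))
{
	int mismatches = 0;

	for (int i = 0; i < 8; i++) {
		bool a = i & 1, b = i & 2, c = i & 4;

		if (candidate(a, b, c) != should_warn_old(a, b, c))
			mismatches++;
	}
	return mismatches;
}
```

Under these assumptions, count_mismatches(should_warn_fixed) is 0, while the
buggy variant disagrees in exactly the two cases where the other two conditions
hold: it warns when BH is blocked and stays silent when it is not.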

This issue lay dormant until another fixup for the same commit was added
in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
This commit realized the ratelimit was essentially set to zero instead
of ten, and hence *no* softirq pending messages would ever be issued.
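
As a rough illustration of why an effectively-zero ratelimit hid the problem,
the sketch below assumes the ratelimit guard was inverted in the same style
(bailing out while the counter is below the limit). That exact form is an
assumption, and the function names and WARN_LIMIT are hypothetical; this is a
userspace model, not the kernel code.

```c
#define WARN_LIMIT 10

/* Assumed broken guard: returns early while the counter is under the limit. */
static int warnings_with_inverted_guard(int attempts)
{
	int ratelimit = 0, printed = 0;

	for (int i = 0; i < attempts; i++) {
		if (ratelimit < WARN_LIMIT)
			continue;	/* always taken: counter never advances */
		printed++;
		ratelimit++;
	}
	return printed;
}

/* Corrected guard: stops only once WARN_LIMIT warnings have been printed. */
static int warnings_with_fixed_guard(int attempts)
{
	int ratelimit = 0, printed = 0;

	for (int i = 0; i < attempts; i++) {
		if (ratelimit >= WARN_LIMIT)
			continue;
		printed++;
		ratelimit++;
	}
	return printed;
}
```

In this model the inverted guard never prints anything, whereas the corrected
guard prints at most ten warnings, which is consistent with the ten messages
seen at boot once the ratelimit fix exposed the local_bh_blocked() inversion.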

Once this commit was backported via linux-stable, both the v6.1 and v6.4
preempt-rt kernels started printing out 10 instances of this at boot:

  NOHZ tick-stop error: local softirq work is pending, handler #80!!!

To double-check my understanding, I confirmed that v5.18-rt does print
the pending-80 messages with a cherry-pick of the ratelimit fix, and
then confirmed that no pending-softirq messages are printed with a
revert of mainline's 0345691b24c0 on a v5.18-rt baseline.

Finally, I confirmed that this patch fixes the issue on v6.1-rt and
v6.4-rt, and that it does not break anything on a defconfig build of
today's mainline master.

Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Cc: Wen Yang <wenyang.linux@foxmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 2b865cb77feb..b52e1861b913 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1050,7 +1050,7 @@ static bool report_idle_softirq(void)
 		return false;
 
 	/* On RT, softirqs handling may be waiting on some lock */
-	if (!local_bh_blocked())
+	if (local_bh_blocked())
 		return false;
 
 	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
-- 
2.40.0



* Re: [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
  2023-08-18 20:07 [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT paul.gortmaker
@ 2023-08-20 17:23 ` Wen Yang
  2023-08-21 22:03   ` Paul E. McKenney
  2023-08-24 16:00 ` Ahmad Fatoum
  2023-08-30 10:30 ` [tip: timers/urgent] tick/rcu: Fix false positive "softirq work is pending" messages tip-bot2 for Paul Gortmaker
  2 siblings, 1 reply; 8+ messages in thread
From: Wen Yang @ 2023-08-20 17:23 UTC (permalink / raw)
  To: paul.gortmaker, LKML, linux-rt-users
  Cc: Thomas Gleixner, Peter Zijlstra, Paul E . McKenney,
	Frederic Weisbecker


On 2023/8/19 04:07, paul.gortmaker@windriver.com wrote:
> From: Paul Gortmaker <paul.gortmaker@windriver.com>
>
> In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> the new function report_idle_softirq() was created by breaking code out
> of the existing can_stop_idle_tick() for kernels v5.18 and newer.
>
> In doing so, the code essentially went from a one conditional:
>
> 	if (a && b && c)
> 		warn();
>
> to a three conditional:
>
> 	if (!a)
> 		return;
> 	if (!b)
> 		return;
> 	if (!c)
> 		return;
> 	warn();
>
> However, it seems one of the conditionals didn't get a "!" removed.
> Compare the instance of local_bh_blocked() in the old code:
>
> -               if (ratelimit < 10 && !local_bh_blocked() &&
> -                   (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
> -                       pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n",
> -                               (unsigned int) local_softirq_pending());
> -                       ratelimit++;
> -               }
>
> ...to the usage in the new (5.18+) code:
>
> +       /* On RT, softirqs handling may be waiting on some lock */
> +       if (!local_bh_blocked())
> +               return false;
>
> It seems apparent that the "!" should be removed from the new code.
>
> This issue lay dormant until another fixup for the same commit was added
> in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
> This commit realized the ratelimit was essentially set to zero instead
> of ten, and hence *no* softirq pending messages would ever be issued.
>
> Once this commit was backported via linux-stable, both the v6.1 and v6.4
> preempt-rt kernels started printing out 10 instances of this at boot:
>
>    NOHZ tick-stop error: local softirq work is pending, handler #80!!!
>
> Just to double check my understanding of things, I confirmed that the
> v5.18-rt did print the pending-80 messages with a cherry pick of the
> ratelimit fix, and then confirmed no pending softirq messages were
> printed with a revert of mainline's 034569 on a v5.18-rt baseline.
>
> Finally I confirmed it fixed the issue on v6.1-rt and v6.4-rt, and
> also didn't break anything on a defconfig of mainline master of today.
>
> Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> Cc: Wen Yang <wenyang.linux@foxmail.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
>
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 2b865cb77feb..b52e1861b913 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1050,7 +1050,7 @@ static bool report_idle_softirq(void)
>   		return false;
>   
>   	/* On RT, softirqs handling may be waiting on some lock */
> -	if (!local_bh_blocked())
> +	if (local_bh_blocked())
>   		return false;
>   
>   	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",


Good catch!

Reviewed-by: Wen Yang <wenyang.linux@foxmail.com>

--
Thanks,
Wen



* Re: [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
  2023-08-20 17:23 ` Wen Yang
@ 2023-08-21 22:03   ` Paul E. McKenney
  2023-08-28 15:03     ` Frederic Weisbecker
  0 siblings, 1 reply; 8+ messages in thread
From: Paul E. McKenney @ 2023-08-21 22:03 UTC (permalink / raw)
  To: Wen Yang
  Cc: paul.gortmaker, LKML, linux-rt-users, Thomas Gleixner,
	Peter Zijlstra, Frederic Weisbecker

On Mon, Aug 21, 2023 at 01:23:15AM +0800, Wen Yang wrote:
> 
> On 2023/8/19 04:07, paul.gortmaker@windriver.com wrote:
> > From: Paul Gortmaker <paul.gortmaker@windriver.com>
> > 
> > In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> > the new function report_idle_softirq() was created by breaking code out
> > of the existing can_stop_idle_tick() for kernels v5.18 and newer.
> > 
> > In doing so, the code essentially went from a one conditional:
> > 
> > 	if (a && b && c)
> > 		warn();
> > 
> > to a three conditional:
> > 
> > 	if (!a)
> > 		return;
> > 	if (!b)
> > 		return;
> > 	if (!c)
> > 		return;
> > 	warn();
> > 
> > However, it seems one of the conditionals didn't get a "!" removed.
> > Compare the instance of local_bh_blocked() in the old code:
> > 
> > -               if (ratelimit < 10 && !local_bh_blocked() &&
> > -                   (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
> > -                       pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n",
> > -                               (unsigned int) local_softirq_pending());
> > -                       ratelimit++;
> > -               }
> > 
> > ...to the usage in the new (5.18+) code:
> > 
> > +       /* On RT, softirqs handling may be waiting on some lock */
> > +       if (!local_bh_blocked())
> > +               return false;
> > 
> > It seems apparent that the "!" should be removed from the new code.
> > 
> > This issue lay dormant until another fixup for the same commit was added
> > in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
> > This commit realized the ratelimit was essentially set to zero instead
> > of ten, and hence *no* softirq pending messages would ever be issued.
> > 
> > Once this commit was backported via linux-stable, both the v6.1 and v6.4
> > preempt-rt kernels started printing out 10 instances of this at boot:
> > 
> >    NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> > 
> > Just to double check my understanding of things, I confirmed that the
> > v5.18-rt did print the pending-80 messages with a cherry pick of the
> > ratelimit fix, and then confirmed no pending softirq messages were
> > printed with a revert of mainline's 034569 on a v5.18-rt baseline.
> > 
> > Finally I confirmed it fixed the issue on v6.1-rt and v6.4-rt, and
> > also didn't break anything on a defconfig of mainline master of today.
> > 
> > Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> > Cc: Wen Yang <wenyang.linux@foxmail.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> > 
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index 2b865cb77feb..b52e1861b913 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -1050,7 +1050,7 @@ static bool report_idle_softirq(void)
> >   		return false;
> >   	/* On RT, softirqs handling may be waiting on some lock */
> > -	if (!local_bh_blocked())
> > +	if (local_bh_blocked())
> >   		return false;
> >   	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
> 
> Good catch!
> 
> Reviewed-by: Wen Yang <wenyang.linux@foxmail.com>

Frederic would normally take this, but he appears to be out.  So I am
(probably only temporarily) queueing this in -rcu for more testing
coverage.

							Thanx, Paul


* Re: [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
  2023-08-18 20:07 [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT paul.gortmaker
  2023-08-20 17:23 ` Wen Yang
@ 2023-08-24 16:00 ` Ahmad Fatoum
  2023-08-30 10:30 ` [tip: timers/urgent] tick/rcu: Fix false positive "softirq work is pending" messages tip-bot2 for Paul Gortmaker
  2 siblings, 0 replies; 8+ messages in thread
From: Ahmad Fatoum @ 2023-08-24 16:00 UTC (permalink / raw)
  To: paul.gortmaker, LKML, linux-rt-users
  Cc: Wen Yang, Thomas Gleixner, Peter Zijlstra, Paul E . McKenney,
	Frederic Weisbecker

On 18.08.23 22:07, paul.gortmaker@windriver.com wrote:
> From: Paul Gortmaker <paul.gortmaker@windriver.com>
> 
> In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> the new function report_idle_softirq() was created by breaking code out
> of the existing can_stop_idle_tick() for kernels v5.18 and newer.
> 
> In doing so, the code essentially went from a one conditional:
> 
> 	if (a && b && c)
> 		warn();
> 
> to a three conditional:
> 
> 	if (!a)
> 		return;
> 	if (!b)
> 		return;
> 	if (!c)
> 		return;
> 	warn();
> 
> However, it seems one of the conditionals didn't get a "!" removed.
> Compare the instance of local_bh_blocked() in the old code:
> 
> -               if (ratelimit < 10 && !local_bh_blocked() &&
> -                   (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
> -                       pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n",
> -                               (unsigned int) local_softirq_pending());
> -                       ratelimit++;
> -               }
> 
> ...to the usage in the new (5.18+) code:
> 
> +       /* On RT, softirqs handling may be waiting on some lock */
> +       if (!local_bh_blocked())
> +               return false;
> 
> It seems apparent that the "!" should be removed from the new code.
> 
> This issue lay dormant until another fixup for the same commit was added
> in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
> This commit realized the ratelimit was essentially set to zero instead
> of ten, and hence *no* softirq pending messages would ever be issued.
> 
> Once this commit was backported via linux-stable, both the v6.1 and v6.4
> preempt-rt kernels started printing out 10 instances of this at boot:
> 
>   NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> 
> Just to double check my understanding of things, I confirmed that the
> v5.18-rt did print the pending-80 messages with a cherry pick of the
> ratelimit fix, and then confirmed no pending softirq messages were
> printed with a revert of mainline's 034569 on a v5.18-rt baseline.
> 
> Finally I confirmed it fixed the issue on v6.1-rt and v6.4-rt, and
> also didn't break anything on a defconfig of mainline master of today.
> 
> Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> Cc: Wen Yang <wenyang.linux@foxmail.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>

Thanks,
Ahmad

> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 2b865cb77feb..b52e1861b913 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -1050,7 +1050,7 @@ static bool report_idle_softirq(void)
>  		return false;
>  
>  	/* On RT, softirqs handling may be waiting on some lock */
> -	if (!local_bh_blocked())
> +	if (local_bh_blocked())
>  		return false;
>  
>  	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |



* Re: [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
  2023-08-21 22:03   ` Paul E. McKenney
@ 2023-08-28 15:03     ` Frederic Weisbecker
  2023-08-31 13:32       ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 8+ messages in thread
From: Frederic Weisbecker @ 2023-08-28 15:03 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Wen Yang, paul.gortmaker, LKML, linux-rt-users, Thomas Gleixner,
	Peter Zijlstra

Le Mon, Aug 21, 2023 at 03:03:10PM -0700, Paul E. McKenney a écrit :
> On Mon, Aug 21, 2023 at 01:23:15AM +0800, Wen Yang wrote:
> > 
> > On 2023/8/19 04:07, paul.gortmaker@windriver.com wrote:
> > > From: Paul Gortmaker <paul.gortmaker@windriver.com>
> > > 
> > > In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> > > the new function report_idle_softirq() was created by breaking code out
> > > of the existing can_stop_idle_tick() for kernels v5.18 and newer.
> > > 
> > > In doing so, the code essentially went from a one conditional:
> > > 
> > > 	if (a && b && c)
> > > 		warn();
> > > 
> > > to a three conditional:
> > > 
> > > 	if (!a)
> > > 		return;
> > > 	if (!b)
> > > 		return;
> > > 	if (!c)
> > > 		return;
> > > 	warn();
> > > 
> > > However, it seems one of the conditionals didn't get a "!" removed.
> > > Compare the instance of local_bh_blocked() in the old code:
> > > 
> > > -               if (ratelimit < 10 && !local_bh_blocked() &&
> > > -                   (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
> > > -                       pr_warn("NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #%02x!!!\n",
> > > -                               (unsigned int) local_softirq_pending());
> > > -                       ratelimit++;
> > > -               }
> > > 
> > > ...to the usage in the new (5.18+) code:
> > > 
> > > +       /* On RT, softirqs handling may be waiting on some lock */
> > > +       if (!local_bh_blocked())
> > > +               return false;
> > > 
> > > It seems apparent that the "!" should be removed from the new code.
> > > 
> > > This issue lay dormant until another fixup for the same commit was added
> > > in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
> > > This commit realized the ratelimit was essentially set to zero instead
> > > of ten, and hence *no* softirq pending messages would ever be issued.
> > > 
> > > Once this commit was backported via linux-stable, both the v6.1 and v6.4
> > > preempt-rt kernels started printing out 10 instances of this at boot:
> > > 
> > >    NOHZ tick-stop error: local softirq work is pending, handler #80!!!
> > > 
> > > Just to double check my understanding of things, I confirmed that the
> > > v5.18-rt did print the pending-80 messages with a cherry pick of the
> > > ratelimit fix, and then confirmed no pending softirq messages were
> > > printed with a revert of mainline's 034569 on a v5.18-rt baseline.
> > > 
> > > Finally I confirmed it fixed the issue on v6.1-rt and v6.4-rt, and
> > > also didn't break anything on a defconfig of mainline master of today.
> > > 
> > > Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
> > > Cc: Wen Yang <wenyang.linux@foxmail.com>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Paul E. McKenney <paulmck@kernel.org>
> > > Cc: Frederic Weisbecker <frederic@kernel.org>
> > > Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> > > 
> > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > > index 2b865cb77feb..b52e1861b913 100644
> > > --- a/kernel/time/tick-sched.c
> > > +++ b/kernel/time/tick-sched.c
> > > @@ -1050,7 +1050,7 @@ static bool report_idle_softirq(void)
> > >   		return false;
> > >   	/* On RT, softirqs handling may be waiting on some lock */
> > > -	if (!local_bh_blocked())
> > > +	if (local_bh_blocked())
> > >   		return false;
> > >   	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",
> > 
> > Good catch!
> > 
> > Reviewed-by: Wen Yang <wenyang.linux@foxmail.com>
> 
> Frederic would normally take this, but he appears to be out.  So I am
> (probably only temporarily) queueing this in -rcu for more testing
> coverage.

I'm back, I should relay this to Thomas to avoid conflicts with
timers changes.

Thanks all of you, clearly I wasn't thinking much the day I wrote this
patch.


* [tip: timers/urgent] tick/rcu: Fix false positive "softirq work is pending" messages
  2023-08-18 20:07 [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT paul.gortmaker
  2023-08-20 17:23 ` Wen Yang
  2023-08-24 16:00 ` Ahmad Fatoum
@ 2023-08-30 10:30 ` tip-bot2 for Paul Gortmaker
  2 siblings, 0 replies; 8+ messages in thread
From: tip-bot2 for Paul Gortmaker @ 2023-08-30 10:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Paul Gortmaker, Thomas Gleixner, Ahmad Fatoum, Wen Yang,
	Frederic Weisbecker, x86, linux-kernel

The following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     96c1fa04f089a7e977a44e4e8fdc92e81be20bef
Gitweb:        https://git.kernel.org/tip/96c1fa04f089a7e977a44e4e8fdc92e81be20bef
Author:        Paul Gortmaker <paul.gortmaker@windriver.com>
AuthorDate:    Fri, 18 Aug 2023 16:07:57 -04:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Wed, 30 Aug 2023 12:20:28 +02:00

tick/rcu: Fix false positive "softirq work is pending" messages

In commit 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle") the
new function report_idle_softirq() was created by breaking code out of the
existing can_stop_idle_tick() for kernels v5.18 and newer.

In doing so, the code essentially went from a one conditional:

	if (a && b && c)
		warn();

to a three conditional:

	if (!a)
		return;
	if (!b)
		return;
	if (!c)
		return;
	warn();

But that conversion got the condition for the RT specific
local_bh_blocked() wrong. The original condition was:

   	!local_bh_blocked()

but the conversion failed to negate it so it ended up as:

	if (!local_bh_blocked())
		return false;

This issue lay dormant until another fixup for the same commit was added
in commit a7e282c77785 ("tick/rcu: Fix bogus ratelimit condition").
This commit realized the ratelimit was essentially set to zero instead
of ten, and hence *no* softirq pending messages would ever be issued.

Once this commit was backported via linux-stable, both the v6.1 and v6.4
preempt-rt kernels started printing out 10 instances of this at boot:

  NOHZ tick-stop error: local softirq work is pending, handler #80!!!

Remove the negation and return when local_bh_blocked() evaluates to true to
bring the correct behaviour back.

Fixes: 0345691b24c0 ("tick/rcu: Stop allowing RCU_SOFTIRQ in idle")
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Wen Yang <wenyang.linux@foxmail.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230818200757.1808398-1-paul.gortmaker@windriver.com


---
 kernel/time/tick-sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 4df14db..87015e9 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1045,7 +1045,7 @@ static bool report_idle_softirq(void)
 		return false;
 
 	/* On RT, softirqs handling may be waiting on some lock */
-	if (!local_bh_blocked())
+	if (local_bh_blocked())
 		return false;
 
 	pr_warn("NOHZ tick-stop error: local softirq work is pending, handler #%02x!!!\n",


* Re: [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
  2023-08-28 15:03     ` Frederic Weisbecker
@ 2023-08-31 13:32       ` Sebastian Andrzej Siewior
  2023-09-01  9:56         ` Thomas Gleixner
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-08-31 13:32 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Wen Yang, paul.gortmaker, LKML, linux-rt-users,
	Thomas Gleixner, Peter Zijlstra

On 2023-08-28 17:03:39 [+0200], Frederic Weisbecker wrote:
> > Frederic would normally take this, but he appears to be out.  So I am
> > (probably only temporarily) queueing this in -rcu for more testing
> > coverage.
> 
> I'm back, I should relay this to Thomas to avoid conflicts with
> timers changes.

I somehow missed this thread and I do see this if I enable NO_HZ. I lost
it…
Anyway, I'm going to pick it up for RT and ping the timer department
after the merge window.

> Thanks all of you, clearly I wasn't thinking much the day I wrote this
> patch.
:)

Sebastian


* Re: [PATCH] tick/rcu: fix false positive "softirq work is pending" messages on RT
  2023-08-31 13:32       ` Sebastian Andrzej Siewior
@ 2023-09-01  9:56         ` Thomas Gleixner
  0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2023-09-01  9:56 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Frederic Weisbecker
  Cc: Paul E. McKenney, Wen Yang, paul.gortmaker, LKML, linux-rt-users,
	Peter Zijlstra

On Thu, Aug 31 2023 at 15:32, Sebastian Andrzej Siewior wrote:

> On 2023-08-28 17:03:39 [+0200], Frederic Weisbecker wrote:
>> > Frederic would normally take this, but he appears to be out.  So I am
>> > (probably only temporarily) queueing this in -rcu for more testing
>> > coverage.
>> 
>> I'm back, I should relay this to Thomas to avoid conflicts with
>> timers changes.
>
> I somehow missed this thread and I do see this if I enable NO_HZ. I lost
> it…
> Anyway, I'm going to pick it up for RT and ping the timer department
> after the merge window.

It's queued in timers/urgent and will hit Linus' tree before rc1.
