Re: [PATCH v2] workqueue: Add pool_workqueue to pending_pwqs list when unplugging multiple inactive works

public inbox for stable@vger.kernel.org

From: Matthew Brost <matthew.brost@intel.com>
To: Waiman Long <longman@redhat.com>
Cc: <intel-xe@lists.freedesktop.org>,
	<dri-devel@lists.freedesktop.org>, <linux-kernel@vger.kernel.org>,
	Carlos Santa <carlos.santa@intel.com>,
	"Ryan Neph" <ryanneph@google.com>, <stable@vger.kernel.org>,
	Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>
Subject: Re: [PATCH v2] workqueue: Add pool_workqueue to pending_pwqs list when unplugging multiple inactive works
Date: Wed, 1 Apr 2026 08:40:01 -0700
Message-ID: <ac08UdszEeEI2iJj@gsse-cloud1.jf.intel.com>
In-Reply-To: <8eaf9c5e-70fc-4d68-a919-df371bb38283@redhat.com>

On Wed, Apr 01, 2026 at 10:44:55AM -0400, Waiman Long wrote:
> On 3/31/26 9:07 PM, Matthew Brost wrote:
> > In unplug_oldest_pwq(), the first inactive work item on the
> > pool_workqueue is activated correctly. However, if multiple inactive
> > works exist on the same pool_workqueue, subsequent works fail to
> > activate because wq_node_nr_active.pending_pwqs is empty — the list
> > insertion is skipped when the pool_workqueue is plugged.
> > 
> > Fix this by checking for additional inactive works in
> > unplug_oldest_pwq() and updating wq_node_nr_active.pending_pwqs
> > accordingly.
> > 
> > v2:
> >   - Use pwq_activate_first_inactive(pwq, false) rather than open coding
> >     list operations (Tejun)
> > 
> > Cc: Carlos Santa <carlos.santa@intel.com>
> > Cc: Ryan Neph <ryanneph@google.com>
> > Cc: stable@vger.kernel.org
> > Cc: Tejun Heo <tj@kernel.org>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: Waiman Long <longman@redhat.com>
> > Cc: linux-kernel@vger.kernel.org
> > Fixes: 4c065dbce1e8 ("workqueue: Enable unbound cpumask update on ordered workqueues")
> > Signed-off-by: Matthew Brost <matthew.brost@intel.com>
> > 
> > ---
> > 
> > This bug was first reported by Google, where the Xe driver appeared to
> > hang due to a fencing signal not completing. We traced the issue to work
> > items not being scheduled, and it can be trivially reproduced on drm-tip
> > with the following commands:
> > 
> > shell0:
> > for i in {1..100}; do echo "Run $i"; xe_exec_threads --r \
> > threads-rebind-bindexecqueue; done
> > 
> > shell1:
> > for i in {1..1000}; do echo "toggle $i"; echo f > \
> > /sys/devices/virtual/workqueue/cpumask; echo ff > \
> > /sys/devices/virtual/workqueue/cpumask; echo fff > \
> > /sys/devices/virtual/workqueue/cpumask ; echo ffff > \
> > /sys/devices/virtual/workqueue/cpumask; sleep .1; done
> > ---
> >   kernel/workqueue.c | 11 ++++++++++-
> >   1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index b77119d71641..bee3f37fffde 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -1849,8 +1849,17 @@ static void unplug_oldest_pwq(struct workqueue_struct *wq)
> >   	raw_spin_lock_irq(&pwq->pool->lock);
> >   	if (pwq->plugged) {
> >   		pwq->plugged = false;
> > -		if (pwq_activate_first_inactive(pwq, true))
> > +		if (pwq_activate_first_inactive(pwq, true)) {
> > +			/*
> > +			 * pwq is unbound. Additional inactive work_items need
> > +			 * to reinsert the pwq into nna->pending_pwqs, which
> > +			 * was skipped while pwq->plugged was true. See
> > +			 * pwq_tryinc_nr_active() for additional details.
> > +			 */
> > +			pwq_activate_first_inactive(pwq, false);
> > +
> >   			kick_pool(pwq->pool);
> > +		}
> >   	}
> >   	raw_spin_unlock_irq(&pwq->pool->lock);
> >   }
> 
> Thanks for fixing this bug. However, calling pwq_activate_first_inactive

No problem. I think this one has been lurking for a while, and we've
just papered over it in Xe for a couple of years.

> twice can be a bit hard to understand. Will modifying pwq_tryinc_nr_active()

I actually think it makes quite a bit of sense, as it matches what
__queue_work() does if two items are added back-to-back on an ordered
workqueue: the first one updates the nr_active counts and activates,
and the second one adds the pwq to pending_pwqs.
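
The two-step behavior described above can be sketched in user space. This
is only a toy model under stated assumptions: every toy_* name is
hypothetical, the pending_pwqs list is reduced to a single slot, locking
and memory barriers are omitted, and max is fixed at 1 to stand in for an
ordered workqueue. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pwq;

/* Hypothetical stand-in for wq_node_nr_active. */
struct toy_nna {
	int nr;                  /* works currently counted as active */
	int max;                 /* 1 models an ordered workqueue */
	struct toy_pwq *pending; /* single-slot stand-in for pending_pwqs */
};

/* Hypothetical stand-in for pool_workqueue. */
struct toy_pwq {
	struct toy_nna *nna;
	int inactive;            /* works waiting for activation */
	int active;              /* works already activated */
	bool plugged;
	bool on_pending;         /* models !list_empty(&pwq->pending_node) */
};

/* Try to take an active slot; on failure, link the pwq to the pending slot. */
static bool toy_tryinc(struct toy_pwq *pwq, bool fill)
{
	struct toy_nna *nna = pwq->nna;

	if (pwq->plugged)
		return false;    /* plugged: the pending insertion is skipped */
	if (pwq->on_pending && !fill)
		return false;
	if (nna->nr < nna->max) {
		nna->nr++;
		return true;
	}
	nna->pending = pwq;      /* no slot: wait for a later decrement */
	pwq->on_pending = true;
	return false;
}

static bool toy_activate_first_inactive(struct toy_pwq *pwq, bool fill)
{
	if (pwq->inactive == 0 || !toy_tryinc(pwq, fill))
		return false;
	pwq->inactive--;
	pwq->active++;
	return true;
}

/* A work finishes: release its slot and hand it to the pending pwq, if any. */
static void toy_work_done(struct toy_pwq *pwq)
{
	struct toy_nna *nna = pwq->nna;
	struct toy_pwq *next;

	assert(pwq->active > 0);
	pwq->active--;
	nna->nr--;
	next = nna->pending;
	if (!next)
		return;          /* nobody is waiting, so nothing gets activated */
	if (next->inactive > 0 && nna->nr < nna->max) {
		nna->nr++;
		next->inactive--;
		next->active++;
	}
	if (next->inactive == 0) {
		nna->pending = NULL;
		next->on_pending = false;
	}
}

/* relink == true models the second activation call made with fill == false. */
static void toy_unplug(struct toy_pwq *pwq, bool relink)
{
	if (!pwq->plugged)
		return;
	pwq->plugged = false;
	if (toy_activate_first_inactive(pwq, true) && relink)
		toy_activate_first_inactive(pwq, false);
}

/* Queue nworks while plugged, unplug, drain; return works never activated. */
int toy_stuck_after_drain(int nworks, bool relink)
{
	struct toy_nna nna = { .nr = 0, .max = 1, .pending = NULL };
	struct toy_pwq pwq = { .nna = &nna, .plugged = true };

	for (int i = 0; i < nworks; i++) {
		if (pwq.inactive == 0 && toy_tryinc(&pwq, false))
			pwq.active++;
		else
			pwq.inactive++;
	}
	toy_unplug(&pwq, relink);
	while (pwq.active > 0)
		toy_work_done(&pwq);
	return pwq.inactive;
}
```

In this model, toy_stuck_after_drain(3, false) leaves two works stranded
because nothing ever links the pwq to the pending slot, while
toy_stuck_after_drain(3, true) drains all three. The model ignores any
later queueing that might rescue the stranded works.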

> like the following works?
>

My initial thought was that your snippet should work. In fact, it does
for a while (unpatched drm-tip hangs almost immediately), but eventually
I do get a hang when running my reproducer, whereas with this patch I
don't. I can't explain exactly why; maybe it's because
node_activate_pending_pwq() can find a plugged pwq, but that's just a
guess.

Matt
 
> Thanks,
> Longman
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index b77119d71641..b35e6e62e474 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1738,9 +1738,6 @@ static bool pwq_tryinc_nr_active(struct pool_workqueue *pwq, bool fill)
>                 goto out;
>         }
> -       if (unlikely(pwq->plugged))
> -               return false;
> -
>         /*
>          * Unbound workqueue uses per-node shared nr_active $nna. If @pwq is
>          * already waiting on $nna, pwq_dec_nr_active() will maintain the
> @@ -1749,13 +1746,19 @@ static bool pwq_tryinc_nr_active(struct pool_workqueue *pwq, bool fill)
>          * We need to ignore the pending test after max_active has increased as
>          * pwq_dec_nr_active() can only maintain the concurrency level but not
>          * increase it. This is indicated by @fill.
> +        *
> +        * If @pwq is plugged, we need to make sure that it is linked to a
> +        * pending_pwqs of a $nna.
> +        *
>          */
> -       if (!list_empty(&pwq->pending_node) && likely(!fill))
> +       if (!list_empty(&pwq->pending_node) && likely(!fill || pwq->plugged))
>                 goto out;
> -       obtained = tryinc_node_nr_active(nna);
> -       if (obtained)
> -               goto out;
> +       if (likely(!pwq->plugged)) {
> +               obtained = tryinc_node_nr_active(nna);
> +               if (obtained)
> +                       goto out;
> +       }
>         /*
>          * Lockless acquisition failed. Lock, add ourself to $nna->pending_pwqs
> @@ -1773,7 +1776,8 @@ static bool pwq_tryinc_nr_active(struct pool_workqueue *pwq, bool fill)
>         smp_mb();
> -       obtained = tryinc_node_nr_active(nna);
> +       if (likely(!pwq->plugged))
> +               obtained = tryinc_node_nr_active(nna);
>         /*
>          * If @fill, @pwq might have already been pending. Being spuriously
>

Thread overview: 6+ messages
2026-04-01  1:07 [PATCH v2] workqueue: Add pool_workqueue to pending_pwqs list when unplugging multiple inactive works Matthew Brost
2026-04-01 14:44 ` Waiman Long
2026-04-01 15:40   ` Matthew Brost [this message]
2026-04-01 18:04     ` Waiman Long
2026-04-01 20:20 ` Tejun Heo
2026-04-02  4:18   ` Matthew Brost
