The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Breno Leitao <leitao@debian.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
	 linux-kernel@vger.kernel.org, marco.crivellari@suse.com,
	frederic@kernel.org,  Hillf Danton <hdanton@sina.com>,
	kernel-team@meta.com, kmagar@redhat.com, psuriset@redhat.com
Subject: Re: [PATCH v3 2/3] workqueue: defer the worker wakeup outside pool->lock in __queue_work()
Date: Wed, 24 Jun 2026 04:19:02 -0700	[thread overview]
Message-ID: <aju8GR04O-haKOTE@gmail.com> (raw)
In-Reply-To: <20260624084756.X2i4QiPm@linutronix.de>

Hello Sebastian,

On Wed, Jun 24, 2026 at 10:47:56AM +0200, Sebastian Andrzej Siewior wrote:
> On 2026-06-16 06:33:32 [-0700], Breno Leitao wrote:
> > __queue_work() is the enqueue hot path: it inserts the work item and
> > calls kick_pool() while holding pool->lock. kick_pool() ends in a
> > wakeup, which takes the target task's rq->lock, so rq->lock nests under
> > pool->lock on every enqueue that wakes a worker on a contended unbound
> > pool.
> > 
> > Use kick_pool_pick() to select and claim the worker under pool->lock,
> > queue it on an on-stack wake_q, and issue the wakeup with wake_up_q()
> > right after dropping the lock via raw_spin_unlock_wake(). Worker
> > selection, wake_cpu setup and claiming the worker off pool->idle_list
> > still happen under the lock; only the rq->lock acquisition moves out.
> > 
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> …
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> …
> >  out:
> > -	raw_spin_unlock(&pool->lock);
> > +	raw_spin_unlock_wake(&pool->lock, &wakeq);
> >  	rcu_read_unlock();
> >  }
> 
> It is not wrong but I am not sure if this is really needed here. The
> pattern
> 	preempt_disable();
> 	raw_spin_unlock();
> 	wake_up_q()
> 	preempt_enable();
> 
> is used to prevent task preemption after the unlock operation. The
> futex/ locking code needs to wake a task but before the unlock operation
> the task priority might have been lowered as result of dropping the
> lock. This means it might not be the task with the highest priority in
> the system and the task with highest is not yet active and we schedule a
> task in middle in the instead.
> To form this easier: say we have Task A prio=1, B prio=2 and C prio=3
> with higher number higher priority. 
> A owns a lock and is preempted by B. C gets on the CPU preempts B. C
> wants A's lock so it passes its priority to A (PI-boost) and goes idle.
> A gets on the CPU and unlocks.
> Now: As part of the unlock operation (before the raw_spin_unlock()) A
> goes back to its initial priority and C is not yet woken up meaning B is
> the task with highest priority and the held raw_spinlock_t is the only
> thing preventing scheduling. So we disable preemption, unlock and then
> perform the wake-up so C becomes the next candidate (as it should be) to
> occupy the CPU.
> Otherwise it would be B which means per definition a task with lower
> priority runs before a task with higher priority.
> 
> I don't think workqueue has this requirements here. Worst case something
> else gets on the CPU and worker wakeup is delayed until the task is
> scheduled again. It could be fine since the preemption could happen
> before queue-work().
> It probably does not lead to a huge performance
> regression but who knows.
> 
> Your goal was to lower the contention on the pool lock during the wake
> up which you achieved. The __queue_work() remains still not preemptible
> until after the wake up which might be fine.

You're absolutely right — the preemption dance isn't needed here.

I used raw_spin_unlock_wake() because the helper seemed to match what I
was doing, but this preempt_disable/enable pattern (while cheap) is
unnecessary for this use case.

I'll update the series to use the standard approach:

  raw_spin_unlock_irq(&pool->lock);
  wake_up_q(&wakeq);

Thanks,
--breno

  reply	other threads:[~2026-06-24 11:19 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16 13:33 [PATCH v3 0/3] workqueue: Shrink the lock time Breno Leitao
2026-06-16 13:33 ` [PATCH v3 1/3] workqueue: split kick_pool() into kick_pool_pick() + wake_up_q() Breno Leitao
2026-06-16 13:33 ` [PATCH v3 2/3] workqueue: defer the worker wakeup outside pool->lock in __queue_work() Breno Leitao
2026-06-24  8:47   ` Sebastian Andrzej Siewior
2026-06-24 11:19     ` Breno Leitao [this message]
2026-06-16 13:33 ` [PATCH v3 3/3] workqueue: defer the worker wakeup outside pool->lock in process_one_work() Breno Leitao
2026-06-24  8:54 ` [PATCH v3 0/3] workqueue: Shrink the lock time Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aju8GR04O-haKOTE@gmail.com \
    --to=leitao@debian.org \
    --cc=bigeasy@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=hdanton@sina.com \
    --cc=jiangshanlai@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=kmagar@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marco.crivellari@suse.com \
    --cc=psuriset@redhat.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox