All of lore.kernel.org
 help / color / mirror / Atom feed
From: Breno Leitao <leitao@debian.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>,
	 linux-kernel@vger.kernel.org, marco.crivellari@suse.com,
	frederic@kernel.org,  Hillf Danton <hdanton@sina.com>,
	kernel-team@meta.com, kmagar@redhat.com, psuriset@redhat.com
Subject: Re: [PATCH v3 2/3] workqueue: defer the worker wakeup outside pool->lock in __queue_work()
Date: Wed, 24 Jun 2026 04:19:02 -0700	[thread overview]
Message-ID: <aju8GR04O-haKOTE@gmail.com> (raw)
In-Reply-To: <20260624084756.X2i4QiPm@linutronix.de>

Hello Sebastian,

On Wed, Jun 24, 2026 at 10:47:56AM +0200, Sebastian Andrzej Siewior wrote:
> On 2026-06-16 06:33:32 [-0700], Breno Leitao wrote:
> > __queue_work() is the enqueue hot path: it inserts the work item and
> > calls kick_pool() while holding pool->lock. kick_pool() ends in a
> > wakeup, which takes the target task's rq->lock, so rq->lock nests under
> > pool->lock on every enqueue that wakes a worker on a contended unbound
> > pool.
> > 
> > Use kick_pool_pick() to select and claim the worker under pool->lock,
> > queue it on an on-stack wake_q, and issue the wakeup with wake_up_q()
> > right after dropping the lock via raw_spin_unlock_wake(). Worker
> > selection, wake_cpu setup and claiming the worker off pool->idle_list
> > still happen under the lock; only the rq->lock acquisition moves out.
> > 
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> …
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> …
> >  out:
> > -	raw_spin_unlock(&pool->lock);
> > +	raw_spin_unlock_wake(&pool->lock, &wakeq);
> >  	rcu_read_unlock();
> >  }
> 
> It is not wrong but I am not sure if this is really needed here. The
> pattern
> 	preempt_disable();
> 	raw_spin_unlock();
> 	wake_up_q()
> 	preempt_enable();
> 
> is used to prevent task preemption after the unlock operation. The
> futex/ locking code needs to wake a task but before the unlock operation
> the task priority might have been lowered as result of dropping the
> lock. This means it might not be the task with the highest priority in
> the system and the task with highest is not yet active and we schedule a
> task in middle in the instead.
> To form this easier: say we have Task A prio=1, B prio=2 and C prio=3
> with higher number higher priority. 
> A owns a lock and is preempted by B. C gets on the CPU preempts B. C
> wants A's lock so it passes its priority to A (PI-boost) and goes idle.
> A gets on the CPU and unlocks.
> Now: As part of the unlock operation (before the raw_spin_unlock()) A
> goes back to its initial priority and C is not yet woken up meaning B is
> the task with highest priority and the held raw_spinlock_t is the only
> thing preventing scheduling. So we disable preemption, unlock and then
> perform the wake-up so C becomes the next candidate (as it should be) to
> occupy the CPU.
> Otherwise it would be B which means per definition a task with lower
> priority runs before a task with higher priority.
> 
> I don't think workqueue has this requirements here. Worst case something
> else gets on the CPU and worker wakeup is delayed until the task is
> scheduled again. It could be fine since the preemption could happen
> before queue-work().
> It probably does not lead to a huge performance
> regression but who knows.
> 
> Your goal was to lower the contention on the pool lock during the wake
> up which you achieved. The __queue_work() remains still not preemptible
> until after the wake up which might be fine.

You're absolutely right — the preemption dance isn't needed here.

I used raw_spin_unlock_wake() because the helper seemed to match what I
was doing, but this preempt_disable/enable pattern (while cheap) is
unnecessary for this use case.

I'll update the series to use the standard approach:

  raw_spin_unlock_irq(&pool->lock);
  wake_up_q(&wakeq);

Thanks,
--breno

  reply	other threads:[~2026-06-24 11:19 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-16 13:33 [PATCH v3 0/3] workqueue: Shrink the lock time Breno Leitao
2026-06-16 13:33 ` [PATCH v3 1/3] workqueue: split kick_pool() into kick_pool_pick() + wake_up_q() Breno Leitao
2026-06-16 13:33 ` [PATCH v3 2/3] workqueue: defer the worker wakeup outside pool->lock in __queue_work() Breno Leitao
2026-06-24  8:47   ` Sebastian Andrzej Siewior
2026-06-24 11:19     ` Breno Leitao [this message]
2026-06-16 13:33 ` [PATCH v3 3/3] workqueue: defer the worker wakeup outside pool->lock in process_one_work() Breno Leitao
2026-06-24  8:54 ` [PATCH v3 0/3] workqueue: Shrink the lock time Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aju8GR04O-haKOTE@gmail.com \
    --to=leitao@debian.org \
    --cc=bigeasy@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=hdanton@sina.com \
    --cc=jiangshanlai@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=kmagar@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marco.crivellari@suse.com \
    --cc=psuriset@redhat.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.