All of lore.kernel.org
 help / color / mirror / Atom feed
From: Breno Leitao <leitao@debian.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Hillf Danton <hdanton@sina.com>, Tejun Heo <tj@kernel.org>,
	 Lai Jiangshan <jiangshanlai@gmail.com>,
	linux-kernel@vger.kernel.org, marco.crivellari@suse.com,
	 frederic@kernel.org
Subject: Re: [PATCH 2/2] workqueue: defer wake_up_process() outside pool->lock on hot paths
Date: Thu, 28 May 2026 15:35:16 +0100	[thread overview]
Message-ID: <ahhR7TDX2OxUkxn4@gmail.com> (raw)
In-Reply-To: <20260527153500.ERVl3my3@linutronix.de>

On Wed, May 27, 2026 at 05:35:00PM +0000, Sebastian Andrzej Siewior wrote:
> On 2026-05-27 07:51:17 [-0700], Breno Leitao wrote:
> > @@ -3447,7 +3459,13 @@ static int worker_thread(void *__worker)
> >  		return 0;
> >  	}
> >  
> > -	worker_leave_idle(worker);
> > +	/*
> > +	 * Kicked workers have already been removed from pool->idle_list
> > +	 * by kick_pool(); only first-time wakeups (via create_worker())
> > +	 * still arrive with WORKER_IDLE set.
> > +	 */
> > +	if (worker->flags & WORKER_IDLE)
> > +		worker_leave_idle(worker);
> 
> Couldn't create_worker() be aligned here not set the idle flag and wake
> the thread a few lines later? Then we wouldn't have to conditionally
> clear the idle flag here (which sort of NULL renders the flag check in
> worker_leave_idle()).

Agreed. If create_worker() defers worker_enter_idle() until just before
wake_up_process(), we can drop this enter/leave pair entirely and avoid
the awkward conditional worker_leave_idle()

What about something like:

commit 0e91d33f8fe8e86c9a2be40d7a0163f68300ea1f
Author: Breno Leitao <leitao@debian.org>
Date:   Thu May 28 09:58:28 2026 -0400

    workqueue: leave idle under pool->lock before waking the worker
    
    A woken worker is woken from two different shapes today:
    
      - kick_pool_pick() picks a worker off pool->idle_list (LIFO head),
        optionally adjusts wake_cpu under pool->lock and returns the
        task to wake.  The worker is left WORKER_IDLE and on
        pool->idle_list at the point wake_up_process() runs.
    
      - create_worker() calls worker_enter_idle() to put a brand-new
        worker on pool->idle_list, then wakes it for the first time.
    
    In both cases the woken worker eventually reaches worker_thread:woke_up:,
    reacquires pool->lock and unconditionally calls worker_leave_idle() to
    flip its own WORKER_IDLE / pool->idle_list / pool->nr_idle state.  That
    is bookkeeping the waker could just as well have done under the same
    pool->lock acquisition it already holds, and it muddles the invariant
    for any concurrent observer: a runnable worker still observably looks
    idle until it actually schedules in.
    
    Move the worker_leave_idle() to the waker side, while pool->lock is
    still held:
    
      - kick_pool_pick() calls worker_leave_idle(worker) right before
        returning the task to wake.  Both kick_pool() and the deferred-
        wake callers in __queue_work() / process_one_work() inherit this.
    
      - create_worker() pairs its worker_enter_idle() with a matching
        worker_leave_idle() right before wake_up_process(), under the
        same pool->lock section.  nr_idle accounting is unchanged from
        the outside (enter and leave happen back-to-back under the
        lock); the only visible difference is that the new worker is no
        longer on pool->idle_list at the point it is woken.
    
      - worker_thread:woke_up: drops its worker_leave_idle() call: by the
        invariant above, a worker reaching woke_up: is always already
        !WORKER_IDLE and off pool->idle_list.
    
    The resulting invariant is uniform across all wakeup paths:
    
      a woken worker is !WORKER_IDLE and off pool->idle_list.
    
    No functional change intended.  This also shrinks the locked region
    at worker_thread:woke_up: (one fewer field update while holding
    pool->lock).
    
    Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    Signed-off-by: Breno Leitao <leitao@debian.org>

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 8df671066dd1..e66dd507a841 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1310,6 +1310,8 @@ static bool kick_pool(struct worker_pool *pool)
 		}
 	}
 #endif
+	/* Leave idle under pool->lock; worker_thread:woke_up: relies on this. */
+	worker_leave_idle(worker);
 	wake_up_process(p);
 	return true;
 }
@@ -2881,8 +2883,11 @@ static struct worker *create_worker(struct worker_pool *pool)
 	 * check if not woken up soon. As kick_pool() is noop if @pool is empty,
 	 * wake it up explicitly.
 	 */
-	if (worker->task)
+	if (worker->task) {
+		/* match kick_pool_pick(): leave idle before waking */
+		worker_leave_idle(worker);
 		wake_up_process(worker->task);
+	}
 
 	raw_spin_unlock_irq(&pool->lock);
 
@@ -3447,7 +3452,7 @@ static int worker_thread(void *__worker)
 		return 0;
 	}
 
-	worker_leave_idle(worker);
+	/* wakers already called worker_leave_idle() under pool->lock */
 recheck:
 	/* no more worker necessary? */
 	if (!need_more_worker(pool))

  reply	other threads:[~2026-05-28 14:35 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-26 18:08 [PATCH 0/2] workqueue: Shrink the lock time Breno Leitao
2026-05-26 18:08 ` [PATCH 1/2] workqueue: split kick_pool() into kick_pool_pick() + wake_up_process() Breno Leitao
2026-05-26 18:08 ` [PATCH 2/2] workqueue: defer wake_up_process() outside pool->lock on hot paths Breno Leitao
2026-05-26 21:23   ` Hillf Danton
2026-05-27  9:48     ` Breno Leitao
2026-05-27 14:51       ` Breno Leitao
2026-05-27 15:35         ` Sebastian Andrzej Siewior
2026-05-28 14:35           ` Breno Leitao [this message]
2026-06-01 17:26           ` Breno Leitao
2026-05-27 15:22   ` Sebastian Andrzej Siewior
2026-05-28 13:41     ` Breno Leitao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ahhR7TDX2OxUkxn4@gmail.com \
    --to=leitao@debian.org \
    --cc=bigeasy@linutronix.de \
    --cc=frederic@kernel.org \
    --cc=hdanton@sina.com \
    --cc=jiangshanlai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marco.crivellari@suse.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.