From: Breno Leitao <leitao@debian.org>
To: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org, marco.crivellari@suse.com,
	 frederic@kernel.org, bigeasy@linutronix.de,
	Hillf Danton <hdanton@sina.com>,
	 Breno Leitao <leitao@debian.org>,
	kernel-team@meta.com, kmagar@redhat.com,  psuriset@redhat.com
Subject: [PATCH v3 0/3] workqueue: Shrink the lock time
Date: Tue, 16 Jun 2026 06:33:30 -0700
Message-ID: <20260616-fastwake-v3-0-79da19fcd08f@debian.org>

The goal of this patchset is to decrease the time spent under the
workqueue pool->lock.

Currently the worker task is woken up while pool->lock is held. The wakeup
ends in wake_up_process(), which takes the target task's rq->lock, so
rq->lock nests under pool->lock on the two hottest paths of a contended
unbound workqueue (__queue_work() enqueue and process_one_work() chain
kick). On some architectures the wakeup is even more expensive: on
arm64 waking a CPU that is idle (in wfi) issues an IPI.

Doing all of that while holding pool->lock lengthens the locked region
and hurts throughput on contended unbound pools.

This series shortens the locked region by selecting and claiming the
worker to wake under pool->lock, but issuing the actual wakeup after the
lock is dropped, using the wake_q machinery (wake_q_add() under the
lock, wake_up_q() after).
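
The same "pick under the lock, wake after unlock" shape can be sketched
in userspace. This is only an analogy under stated assumptions, not the
kernel code from this series: struct pool, struct worker, enqueue() and
run_demo() are hypothetical names, a pthread mutex stands in for
pool->lock, and a per-worker condition variable stands in for the
wake_q_add() / wake_up_q() pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_WORKERS 2

struct pool;

/* Hypothetical userspace stand-in for a workqueue worker. */
struct worker {
	struct pool *pool;
	pthread_cond_t cond;	/* per-worker wakeup channel */
	bool idle;		/* protected by pool->lock */
};

/* Hypothetical stand-in for a worker pool; lock plays the role of pool->lock. */
struct pool {
	pthread_mutex_t lock;
	struct worker workers[NR_WORKERS];
	int pending;		/* queued, not yet processed items */
	int done;		/* processed items */
	bool stop;
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;
	struct pool *p = w->pool;

	pthread_mutex_lock(&p->lock);
	for (;;) {
		/* Sleep while there is nothing to do; the predicate is rechecked under the lock. */
		while (p->pending == 0 && !p->stop) {
			w->idle = true;
			pthread_cond_wait(&w->cond, &p->lock);
		}
		w->idle = false;
		if (p->pending == 0)
			break;	/* stop requested and queue drained */
		p->pending--;
		p->done++;
	}
	pthread_mutex_unlock(&p->lock);
	return NULL;
}

static void enqueue(struct pool *p)
{
	struct worker *to_wake = NULL;

	pthread_mutex_lock(&p->lock);
	p->pending++;
	/* Select and claim an idle worker while the lock is held. */
	for (size_t i = 0; i < NR_WORKERS; i++) {
		if (p->workers[i].idle) {
			p->workers[i].idle = false;
			to_wake = &p->workers[i];
			break;
		}
	}
	pthread_mutex_unlock(&p->lock);

	/* Issue the (potentially expensive) wakeup only after the lock is dropped. */
	if (to_wake)
		pthread_cond_signal(&to_wake->cond);
}

/* Queue nr_items items, drain them, and return how many were processed. */
static int run_demo(int nr_items)
{
	struct pool p = { 0 };
	pthread_t tids[NR_WORKERS];
	int rc;

	rc = pthread_mutex_init(&p.lock, NULL);
	assert(rc == 0);
	for (size_t i = 0; i < NR_WORKERS; i++) {
		p.workers[i].pool = &p;
		rc = pthread_cond_init(&p.workers[i].cond, NULL);
		assert(rc == 0);
		rc = pthread_create(&tids[i], NULL, worker_fn, &p.workers[i]);
		assert(rc == 0);
	}
	for (int i = 0; i < nr_items; i++)
		enqueue(&p);

	pthread_mutex_lock(&p.lock);
	p.stop = true;
	pthread_mutex_unlock(&p.lock);
	for (size_t i = 0; i < NR_WORKERS; i++)
		pthread_cond_signal(&p.workers[i].cond);
	for (size_t i = 0; i < NR_WORKERS; i++)
		pthread_join(tids[i], NULL);
	(void)rc;
	return p.done;
}
```

Calling run_demo(100) from a main() queues 100 items and returns the
number processed. No wakeup is lost by signalling after the unlock,
because the sleeper rechecks pending under the lock before it waits.
The workers here outlive every enqueue() call, so the sketch needs no
reference counting on the claimed worker.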

Because the win is a shorter pool->lock hold time, it shows up most
clearly as lower enqueue latency under contention.

Performance numbers (based on in-kernel workqueue microbenchmark)

VMs and arm64 (Grace) are where this series is meant to pay off: waking
an idle CPU sitting in wfi costs an IPI on arm64 (VMs incur a similar
type of operation), so doing it under pool->lock lengthens the critical
section.

The arm64 bare-metal numbers match what the x86-or-arm64 VM showed:

    affinity_scope    baseline    patched    tput     p95
                     (items/s)  (items/s)    gain    drop
    --------------   ---------  ---------  ------  ------
    cpu              2,569,880  3,029,740  +17.9%  -13.6%
    smt              2,586,485  3,044,788  +17.7%  -14.0%
    cache_shard        572,055    797,621  +39.4%  -37.1%
    cache              538,132    724,997  +34.7%  -30.1%
    numa               528,673    658,215  +24.5%  -20.5%
    system             524,287    614,486  +17.2%  -21.1%

(p95 drop = change in p95 enqueue latency; negative is better.)
(tput gain = change in items enqueued per second; bigger is better.)
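
For reference, the tput gain column is consistent with the relative
change (patched - baseline) / baseline. A minimal sketch of that
arithmetic follows; tput_gain_pct() is a hypothetical helper written
for this note, not part of the benchmark.

```c
/* Relative throughput change in percent; both inputs are in items/s. */
static double tput_gain_pct(double baseline, double patched)
{
	return (patched - baseline) / baseline * 100.0;
}
```

With the "cpu" row, tput_gain_pct(2569880, 3029740) evaluates to about
17.9, matching the table.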

Patch 1 is a pure refactor introducing kick_pool_pick().
Patch 2 defers the wakeup on the enqueue path (__queue_work()).
Patch 3 defers the wakeup on the per-work chain-kick path
(process_one_work()).

Changes in v3:
- Drop the "park kicked worker on pool->kicked_list" patch (v2 1/4).
  * That fix is independent of this series; if we want to revive it,
    it can be sent separately.
- Link to v2: https://lore.kernel.org/r/20260603-fastwake-v2-0-2977512fe7fa@debian.org

Changes in v2:
- Close the idle_cull_fn() vs kicked-worker race by parking the kicked
  worker on a new pool->kicked_list under pool->lock (new patch 1).
  Reported by Hillf Danton.
- Use the wake_q machinery (wake_q_add() / wake_up_q() via
  raw_spin_unlock_wake()) instead of plumbing a task_struct out of the
  helper by hand. Suggested by Sebastian Andrzej Siewior.
- Link to v1: https://lore.kernel.org/r/20260526-fastwake-v1-0-e69ad86923e6@debian.org

Signed-off-by: Breno Leitao <leitao@debian.org>
---
Breno Leitao (3):
      workqueue: split kick_pool() into kick_pool_pick() + wake_up_q()
      workqueue: defer the worker wakeup outside pool->lock in __queue_work()
      workqueue: defer the worker wakeup outside pool->lock in process_one_work()

 kernel/workqueue.c | 42 +++++++++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 9 deletions(-)
---
base-commit: 8d6dbbbe3ba62de0a63e962ee004afb848c8e3ac
change-id: 20260526-fastwake-02982fd66312

Best regards,
-- 
Breno Leitao <leitao@debian.org>


Thread overview: 7+ messages
2026-06-16 13:33 Breno Leitao [this message]
2026-06-16 13:33 ` [PATCH v3 1/3] workqueue: split kick_pool() into kick_pool_pick() + wake_up_q() Breno Leitao
2026-06-16 13:33 ` [PATCH v3 2/3] workqueue: defer the worker wakeup outside pool->lock in __queue_work() Breno Leitao
2026-06-24  8:47   ` Sebastian Andrzej Siewior
2026-06-24 11:19     ` Breno Leitao
2026-06-16 13:33 ` [PATCH v3 3/3] workqueue: defer the worker wakeup outside pool->lock in process_one_work() Breno Leitao
2026-06-24  8:54 ` [PATCH v3 0/3] workqueue: Shrink the lock time Sebastian Andrzej Siewior
