From: Breno Leitao <leitao@debian.org>
To: Tejun Heo <tj@kernel.org>, Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org, marco.crivellari@suse.com,
	frederic@kernel.org, bigeasy@linutronix.de,
	Hillf Danton <hdanton@sina.com>,
	Breno Leitao <leitao@debian.org>,
	kernel-team@meta.com, kmagar@redhat.com, psuriset@redhat.com,
	david.dai@linux.dev
Subject: [PATCH v5 0/3] workqueue: Shrink the lock time
Date: Fri, 26 Jun 2026 02:57:52 -0700
Message-ID: <20260626-fastwake-v5-0-9ae2f1867234@debian.org>

The goal of this patchset is to decrease the time spent under the
workqueue pool->lock.

Currently the worker process is woken up inside pool->lock. The wakeup
ends in wake_up_process(), which takes the target task's rq->lock, so
rq->lock nests under pool->lock on the two hottest paths of a contended
unbound workqueue (__queue_work() enqueue and process_one_work() chain
kick). On some architectures the wakeup is even more expensive: on
arm64 waking a CPU that is idle (in wfi) issues an IPI.

Doing all of that while holding pool->lock lengthens the locked region
and hurts throughput on contended unbound pools.

This series shortens the locked region by selecting and claiming the
worker to wake under pool->lock, but issuing the actual wakeup after the
lock is dropped. As of v5 the deferred wakeup is a plain
wake_up_process() call; earlier revisions used the wake_q machinery
(wake_q_add() under the lock, wake_up_q() after).

Because the win is a shorter pool->lock hold time, it shows up most
clearly as lower enqueue latency under contention.

Performance numbers (based on in-kernel workqueue microbenchmark)

VMs and arm64 (Grace) are where this series is meant to pay off: waking
an idle CPU sitting in wfi costs an IPI on arm64 (VMs pay for a similar
kind of operation), so doing it under pool->lock lengthens the critical
section.

Latest numbers (from v5) on a Grace arm64 host:

      affinity_scope    baseline    patched    tput     p95
                       (items/s)  (items/s)  change  change
      --------------   ---------  ---------  ------  ------
      cpu              3,580,440  3,486,014   -2.6%   +3.5%
      smt              3,545,763  3,512,633   -0.9%   +2.8%
      cache_shard      3,397,678  3,651,063   +7.5%   -4.2%
      cache              720,368    797,914  +10.8%   -9.8%
      numa               719,794    794,049  +10.3%  -10.3%
      system             721,058    798,010  +10.7%  -10.0%
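
For reference, the throughput column can be reproduced from the two
items/s columns as a plain relative change. The helper below is only an
illustrative sketch (tput_gain_pct() is a hypothetical name, not part of
the microbenchmark):

```c
/* Relative throughput change of patched vs. baseline, in percent. */
double tput_gain_pct(double baseline, double patched)
{
	return (patched - baseline) / baseline * 100.0;
}
```

For example, the "cache" row gives (797,914 - 720,368) / 720,368, which
is roughly +10.8%, and the "cpu" row gives roughly -2.6%.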

Signed-off-by: Breno Leitao <leitao@debian.org>

Changes in v5:
- Use wake_up_process() instead of the fancier wake_q_add(), as
  suggested by Tejun.
- Dropped the Reviewed-by from Sebastian, given that the code changed.
- Link to v4: https://lore.kernel.org/r/20260624-fastwake-v4-0-7b6d7b494a44@debian.org

Changes in v4:
- Replace raw_spin_unlock_wake() with a standard
  raw_spin_unlock() + wake_up_q() (Sebastian Andrzej Siewior)
- Link to v3: https://lore.kernel.org/r/20260616-fastwake-v3-0-79da19fcd08f@debian.org

Changes in v3:
- Drop the "park kicked worker on pool->kicked_list" patch (v2 1/4).
  * That fix is independent of this series; if we want to revamp it,
    it can be sent separately.
- Link to v2: https://lore.kernel.org/r/20260603-fastwake-v2-0-2977512fe7fa@debian.org

Changes in v2:
- Close the idle_cull_fn() vs kicked-worker race by parking the kicked
  worker on a new pool->kicked_list under pool->lock (new patch 1).
  Reported by Hillf Danton.
- Use the wake_q machinery (wake_q_add() / wake_up_q() via
  raw_spin_unlock_wake()) instead of plumbing a task_struct out of the
  helper by hand. Suggested by Sebastian Andrzej Siewior.
- Link to v1: https://lore.kernel.org/r/20260526-fastwake-v1-0-e69ad86923e6@debian.org

---
Breno Leitao (3):
      workqueue: split kick_pool() into kick_pool_pick()
      workqueue: defer the worker wakeup outside pool->lock in __queue_work()
      workqueue: defer the worker wakeup outside pool->lock in process_one_work()

 kernel/workqueue.c | 51 ++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 7 deletions(-)
---
base-commit: 8d6dbbbe3ba62de0a63e962ee004afb848c8e3ac
change-id: 20260526-fastwake-02982fd66312

Best regards,
-- 
Breno Leitao <leitao@debian.org>


Thread overview: 7+ messages
2026-06-26  9:57 Breno Leitao [this message]
2026-06-26  9:57 ` [PATCH v5 1/3] workqueue: split kick_pool() into kick_pool_pick() Breno Leitao
2026-06-26  9:57 ` [PATCH v5 2/3] workqueue: defer the worker wakeup outside pool->lock in __queue_work() Breno Leitao
2026-06-26  9:57 ` [PATCH v5 3/3] workqueue: defer the worker wakeup outside pool->lock in process_one_work() Breno Leitao
2026-06-29 18:24 ` [PATCH v5 0/3] workqueue: Shrink the lock time Tejun Heo
2026-06-29 19:31   ` Breno Leitao
2026-06-29 20:05     ` Tejun Heo