* [GIT PULL] workqueue changes for v7.1
@ 2026-04-13 17:29 Tejun Heo
  2026-04-15 18:13 ` pr-tracker-bot
  2026-04-22 18:49 ` workqueue: introduce cache_shard_size Bhithashri385
  0 siblings, 2 replies; 3+ messages in thread
From: Tejun Heo @ 2026-04-13 17:29 UTC
  To: Linus Torvalds; +Cc: Lai Jiangshan, linux-kernel

Hello,

The following changes since commit 6de23f81a5e08be8fbf5e8d7e9febc72a5b5f27f:

  Linux 7.0-rc1 (2026-02-22 13:18:59 -0800)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git tags/wq-for-7.1

for you to fetch changes up to 76af54648899abbd6b449c035583e47fd407078a:

  workqueue: validate cpumask_first() result in llc_populate_cpu_shard_id() (2026-04-13 06:15:26 -1000)

----------------------------------------------------------------
workqueue: Changes for v7.1

- New default WQ_AFFN_CACHE_SHARD affinity scope subdivides LLCs into
  smaller shards to improve scalability on machines with many CPUs per
  LLC.

- Misc: system_dfl_long_wq for long unbound works, devm_alloc_workqueue()
  for device-managed allocation, sysfs exposure for ordered workqueues and
  the EFI workqueue, removal of HK_TYPE_WQ from wq_unbound_cpumask, and
  various small fixes.
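
To illustrate what subdividing an LLC into shards means, here is a minimal sketch in Python. It is not taken from the kernel source: `split_into_shards()` is a hypothetical helper that simply chunks the CPUs of one LLC into consecutive groups, and the real llc_populate_cpu_shard_id() may group CPUs differently (for example by keeping SMT siblings together or by balancing shard sizes).

```python
def split_into_shards(llc_cpus, shard_size):
    """Chunk the CPUs of one LLC into consecutive groups of at most shard_size.

    Hypothetical illustration only; the kernel's actual grouping policy
    is not reproduced here.
    """
    if shard_size < 1:
        raise ValueError("shard_size must be >= 1")
    cpus = sorted(llc_cpus)
    return [cpus[i:i + shard_size] for i in range(0, len(cpus), shard_size)]


# Example: a single LLC shared by 28 CPUs, split with an assumed shard size of 8.
shards = split_into_shards(range(28), 8)
for shard_id, cpus in enumerate(shards):
    print(f"shard {shard_id}: CPUs {cpus[0]}-{cpus[-1]} ({len(cpus)} CPUs)")
```

With naive chunking the last shard can end up smaller than the others, which is one reason a real implementation might prefer to balance shard sizes instead.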

----------------------------------------------------------------
Arnd Bergmann (1):
      workqueue: avoid unguarded 64-bit division

Breno Leitao (12):
      tools/workqueue/wq_dump.py: remove backslash separator from node_nr/max_active header
      tools/workqueue/wq_dump.py: fix column alignment in node_nr/max_active section
      tools/workqueue/wq_dump.py: add NODE prefix to all node columns
      workqueue: fix parse_affn_scope() prefix matching bug
      workqueue: unlink pwqs from wq->pwqs list in alloc_and_link_pwqs() error path
      workqueue: fix typo in WQ_AFFN_SMT comment
      workqueue: add WQ_AFFN_CACHE_SHARD affinity scope
      workqueue: set WQ_AFFN_CACHE_SHARD as the default affinity scope
      tools/workqueue: add CACHE_SHARD support to wq_dump.py
      workqueue: add test_workqueue benchmark module
      docs: workqueue: document WQ_AFFN_CACHE_SHARD affinity scope
      workqueue: validate cpumask_first() result in llc_populate_cpu_shard_id()

Krzysztof Kozlowski (1):
      workqueue: devres: Add device-managed allocate workqueue

Mallesh Koujalagi (1):
      workqueue: Update documentation as per system_percpu_wq naming

Maninder Singh (1):
      workqueue: use NR_STD_WORKER_POOLS instead of hardcoded value

Marco Crivellari (1):
      workqueue: Add system_dfl_long_wq for long unbound works

Sebastian Andrzej Siewior (2):
      workqueue: Allow to expose ordered workqueues via sysfs
      efi: Allow to expose the workqueue via sysfs

Tejun Heo (2):
      Merge branch 'for-7.1-devm-alloc-wq' into for-7.1
      workqueue: Remove NULL wq WARN in __queue_delayed_work()

Waiman Long (1):
      workqueue: Remove HK_TYPE_WQ from affecting wq_unbound_cpumask

 Documentation/admin-guide/kernel-parameters.txt  |   3 +-
 Documentation/core-api/workqueue.rst             |  14 +-
 Documentation/driver-api/driver-model/devres.rst |   4 +
 drivers/firmware/efi/efi.c                       |   2 +-
 include/linux/workqueue.h                        |  47 +++-
 kernel/workqueue.c                               | 285 ++++++++++++++++++++--
 lib/Kconfig.debug                                |  10 +
 lib/Makefile                                     |   1 +
 lib/test_workqueue.c                             | 294 +++++++++++++++++++++++
 tools/workqueue/wq_dump.py                       |  20 +-
 10 files changed, 629 insertions(+), 51 deletions(-)
 create mode 100644 lib/test_workqueue.c

--
tejun


* Re: [GIT PULL] workqueue changes for v7.1
  2026-04-13 17:29 [GIT PULL] workqueue changes for v7.1 Tejun Heo
@ 2026-04-15 18:13 ` pr-tracker-bot
  2026-04-22 18:49 ` workqueue: introduce cache_shard_size Bhithashri385
  1 sibling, 0 replies; 3+ messages in thread
From: pr-tracker-bot @ 2026-04-15 18:13 UTC
  To: Tejun Heo; +Cc: Linus Torvalds, Lai Jiangshan, linux-kernel

The pull request you sent on Mon, 13 Apr 2026 07:29:30 -1000:

> https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git tags/wq-for-7.1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7de6b4a246330fe29fa2fd144b4724ca35d60d6c

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


* Re: workqueue: introduce cache_shard_size
  2026-04-13 17:29 [GIT PULL] workqueue changes for v7.1 Tejun Heo
  2026-04-15 18:13 ` pr-tracker-bot
@ 2026-04-22 18:49 ` Bhithashri385
  1 sibling, 0 replies; 3+ messages in thread
From: Bhithashri385 @ 2026-04-22 18:49 UTC
  To: linux-kernel; +Cc: tj

Hi,

I did a quick sanity check of cache_shard_size on a dual-socket system:

* 2x28C/56T (112 CPUs total), x86_64
* single local NVMe (XFS)
* upstream kernel with this change

Workload:

fio, 4k buffered writes with fsync=1
numjobs = 56 / 112 / 168

Compared:
workqueue.cache_shard_size=1 vs 8

Results (IOPS / BW / avg fsync latency):

jobs  shard  IOPS    BW            fsync lat (avg)
56    1      ~265k   ~1035 MiB/s   ~33 us
56    8      ~276k   ~1078 MiB/s   ~28 us
112   1      ~248k   ~968 MiB/s    ~53 us
112   8      ~241k   ~941 MiB/s    ~47 us
168   1      ~234k   ~912 MiB/s    ~279 us
168   8      ~233k   ~909 MiB/s    ~256 us

Observations:

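* Small IOPS improvement (~4%) with shard=8 at moderate concurrency
  (56 jobs).
* The throughput gain disappears at higher concurrency: shard=8 is ~3%
  lower at 112 jobs and essentially equal at 168 jobs, as both configs
  converge once the device is saturated (~93–96% util).
* Average fsync latency is consistently lower with shard=8, by roughly
  8–15% across the three job counts.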
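
For reference, the relative differences can be recomputed from the figures reported above. This is a minimal sketch in Python; the numbers are copied from the results, and the `pct_change()` and `summarize()` helpers are names introduced here for illustration only.

```python
# Figures copied from the results above:
# jobs -> config -> (IOPS in thousands, average fsync latency in us)
results = {
    56:  {"shard=1": (265, 33),  "shard=8": (276, 28)},
    112: {"shard=1": (248, 53),  "shard=8": (241, 47)},
    168: {"shard=1": (234, 279), "shard=8": (233, 256)},
}


def pct_change(base, new):
    """Relative change of new versus base, in percent."""
    return (new - base) / base * 100.0


def summarize(results):
    """Return {jobs: (IOPS delta %, fsync latency delta %)} for shard=8 vs shard=1."""
    rows = {}
    for jobs, cfg in sorted(results.items()):
        iops1, lat1 = cfg["shard=1"]
        iops8, lat8 = cfg["shard=8"]
        rows[jobs] = (
            round(pct_change(iops1, iops8), 1),
            round(pct_change(lat1, lat8), 1),
        )
    return rows


for jobs, (d_iops, d_lat) in summarize(results).items():
    print(f"jobs={jobs}: IOPS {d_iops:+.1f}%, fsync latency {d_lat:+.1f}%")
```

Positive IOPS deltas and negative latency deltas both favor shard=8. Since the inputs are rounded approximations from single runs, the percentages are indicative only.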

Interpretation:

This workload appears to become device-bound quickly, so the benefit
from reduced workqueue contention is limited. The small gain at
moderate load and lower fsync latency suggest sharding is helping
somewhat before hitting the storage bottleneck.

I haven't tried multi-device or more metadata-heavy workloads yet,
which may stress workqueues more directly.

Thanks,
Hithashree

