public inbox for linux-kernel@vger.kernel.org
* [PATCHSET wq/for-6.5] workqueue: Implement automatic CPU intensive detection and add monitoring
@ 2023-04-18 20:51 Tejun Heo
  2023-04-18 20:51 ` [PATCH 1/5] workqueue, sched: Notify workqueue of scheduling of RUNNING tasks Tejun Heo
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Tejun Heo @ 2023-04-18 20:51 UTC
  To: jiangshanlai; +Cc: torvalds, peterz, linux-kernel, kernel-team

Hello,

To reduce the number of concurrent worker threads, workqueue holds back
starting per-cpu work items while the previous work item stays in the
RUNNING state. As such, a per-cpu work item that consumes a lot of CPU
cycles, even if it has cond_resched()'s in the right places, can stall other
per-cpu work items.

To support per-cpu work items that may occupy the CPU for a substantial
period of time, workqueue has the WQ_CPU_INTENSIVE flag, which exempts work
items issued through the marked workqueue from concurrency management -
they're started immediately and don't block other work items. While this
works, it's error-prone in that a workqueue user can easily forget to set the
flag or set it unnecessarily. Furthermore, the impact of a wrong flag setting
can be rather indirect and challenging to root-cause.

This patchset makes workqueue auto-detect CPU intensive work items based on
CPU consumption. If a work item consumes more than the threshold (5ms by
default) of CPU time, it's automatically marked as CPU intensive when it
gets scheduled out, which unblocks starting of pending per-cpu work items.
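
As a rough illustration of the rule described above, here is a minimal
userspace C model. It is not the kernel implementation. All names
(model_pool, model_worker, model_sched_out) are hypothetical and exist only
for this sketch; the only things taken from this message are the 5ms default
threshold and the fact that the check happens when the work item is
scheduled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 5ms default threshold from the cover letter, expressed in microseconds. */
#define MODEL_CPU_INTENSIVE_THRESH_US 5000

/* Hypothetical stand-in for a per-cpu worker pool. */
struct model_pool {
	int nr_running;	/* concurrency-managed workers in RUNNING state */
	int nr_pending;	/* per-cpu work items waiting to be started */
};

/* Hypothetical stand-in for a worker executing one work item. */
struct model_worker {
	struct model_pool *pool;
	uint64_t cpu_time_us;	/* CPU time consumed by the current work item */
	bool cpu_intensive;	/* already exempt from concurrency management? */
};

/*
 * Model of the check done when a still-runnable worker is scheduled out.
 * Returns true if the pool may now start a pending work item.
 */
bool model_sched_out(struct model_worker *w)
{
	/* Already marked: nothing more to do. */
	if (w->cpu_intensive)
		return false;

	/* Only work items that consumed more than the threshold are marked. */
	if (w->cpu_time_us <= MODEL_CPU_INTENSIVE_THRESH_US)
		return false;

	/* Mark the worker and stop counting it against concurrency. */
	assert(w->pool->nr_running > 0);
	w->cpu_intensive = true;
	w->pool->nr_running--;

	/* Pending per-cpu work items are unblocked once nothing is RUNNING. */
	return w->pool->nr_running == 0 && w->pool->nr_pending > 0;
}
```

For example, with one running worker and two pending items, a call with
cpu_time_us = 3000 returns false and leaves nr_running at 1, while a later
call with cpu_time_us = 7000 marks the worker as CPU intensive and returns
true.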

The mechanism isn't foolproof in that the detection delays can add up if
many CPU-hogging work items are queued at the same time. However, in such
situations, the bigger problem likely is the CPU being saturated with
per-cpu work items and the solution would be making them UNBOUND. Future
changes will make UNBOUND workqueues more attractive by improving their
locality behaviors and eventually remove the explicit WQ_CPU_INTENSIVE flag.
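
To put a number on how the detection delays can add up, the following C
sketch computes a lower bound under a deliberately simplified assumption:
each CPU-hogging work item is detected the instant it has consumed the
threshold, and the next one starts right away. The function name is
hypothetical. Since detection really happens only when the work item gets
scheduled out, and earlier hogs keep competing for the CPU, actual delays
would be longer.

```c
#include <stdint.h>

/*
 * Lower bound, in microseconds, on how long a work item waits to start when
 * nr_hogs_ahead CPU-hogging work items were queued before it on the same
 * CPU. Each hog must burn at least thresh_us of CPU time before it can be
 * marked CPU intensive and let the next item start.
 */
uint64_t model_min_start_delay_us(unsigned int nr_hogs_ahead, uint64_t thresh_us)
{
	return (uint64_t)nr_hogs_ahead * thresh_us;
}
```

With the default 5ms (5000us) threshold, a work item queued behind 10 such
hogs waits at least 50000us, i.e. 50ms, before it can start.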

While at it, add statistics and a monitoring script. Lack of visibility has
always been a bit of a pain point when debugging workqueue-related issues, and
with this change and more drastic ones planned for workqueue, this is a good
time to address the shortcoming.

This patchset was born out of the discussion in the following thread:

 https://lkml.kernel.org/r/CAHk-=wgE9kORADrDJ4nEsHHLirqPCZ1tGaEPAZejHdZ03qCOGg@mail.gmail.com

and contains the following five patches:

 0001-workqueue-sched-Notify-workqueue-of-scheduling-of-RU.patch
 0002-workqueue-Re-order-struct-worker-fields.patch
 0003-workqueue-Move-worker_set-clr_flags-upwards.patch
 0004-workqueue-Automatically-mark-CPU-hogging-work-items-.patch
 0005-workqueue-Add-pwq-stats-and-a-monitoring-script.patch

and is also available in the following git branch:

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git auto-cpu-intensive

diffstat follows. Thanks.

 Documentation/core-api/workqueue.rst |   30 ++++++++
 kernel/sched/core.c                  |   18 +----
 kernel/workqueue.c                   |  221 +++++++++++++++++++++++++++++++++++++++-----------------------
 kernel/workqueue_internal.h          |   14 +--
 tools/workqueue/wq_monitor.py        |  148 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 333 insertions(+), 98 deletions(-)

--
tejun



Thread overview: 11+ messages
2023-04-18 20:51 [PATCHSET wq/for-6.5] workqueue: Implement automatic CPU intensive detection and add monitoring Tejun Heo
2023-04-18 20:51 ` [PATCH 1/5] workqueue, sched: Notify workqueue of scheduling of RUNNING tasks Tejun Heo
2023-04-18 20:51 ` [PATCH 2/5] workqueue: Re-order struct worker fields Tejun Heo
2023-04-18 20:51 ` [PATCH 3/5] workqueue: Move worker_set/clr_flags() upwards Tejun Heo
2023-04-18 20:51 ` [PATCH 4/5] workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE Tejun Heo
2023-04-23  3:23   ` Lai Jiangshan
2023-04-24 15:29     ` Tejun Heo
2023-04-25 13:12   ` Peter Zijlstra
2023-04-28 15:19     ` Tejun Heo
2023-04-18 20:51 ` [PATCH 5/5] workqueue: Add pwq->stats[] and a monitoring script Tejun Heo
     [not found] ` <20230419014552.1410-1-hdanton@sina.com>
2023-04-19 15:45   ` [PATCH 4/5] workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE Tejun Heo
