Re: [PATCHSET wq/for-6.8] workqueue: Implement system-wide max_active for unbound workqueues

From: Tejun Heo <tj@kernel.org>
To: Naohiro Aota <Naohiro.Aota@wdc.com>
Cc: "jiangshanlai@gmail.com" <jiangshanlai@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kernel-team@meta.com" <kernel-team@meta.com>
Subject: Re: [PATCHSET wq/for-6.8] workqueue: Implement system-wide max_active for unbound workqueues
Date: Fri, 12 Jan 2024 14:17:01 -1000
Message-ID: <ZaHWfWYvAiolChWG@slm.duckdns.org>
In-Reply-To: <ZaCMkV_pjPfhZmrn@mtj.duckdns.org>

Hello,

On Thu, Jan 11, 2024 at 02:49:21PM -1000, Tejun Heo wrote:
> On Fri, Jan 05, 2024 at 02:44:08AM +0000, Naohiro Aota wrote:
> > Thank you for the series. I applied the patches on btrfs's development tree
> > below, and ran the benchmark.
> > 
> > https://gitlab.com/kdave/btrfs-devel.git misc-next
> > 
> > - misc-next, numa=off (baseline)
> >   WRITE: bw=1117MiB/s (1171MB/s), 1117MiB/s-1117MiB/s (1171MB/s-1171MB/s), io=332GiB (356GB), run=304322-304322msec
> > - misc-next + wq patches, numa=off
> >   WRITE: bw=1866MiB/s (1957MB/s), 1866MiB/s-1866MiB/s (1957MB/s-1957MB/s), io=684GiB (735GB), run=375472-375472msec
> > 
> > So, the patches surely improved the performance. However, as shown below, it
> > is still lower than reverting previous workqueue patches. The reverting is
> > done by reverse applying output of "git diff 4cbfd3de737b
> > kernel/workqueue.c kernel/workqueue_internal.h include/linux/workqueue*
> > init/main.c"
> > 
> > - misc-next + wq reverted, numa=off
> >   WRITE: bw=2472MiB/s (2592MB/s), 2472MiB/s-2472MiB/s (2592MB/s-2592MB/s), io=732GiB (786GB), run=303257-303257msec
> 
> Can you describe the test setup in detail? What kind of machine is it? What
> do you mean by `numa=off`? Can you report tools/workqueue/wq_dump.py output?

So, I fixed the possible ordering bug that Lai noticed and dropped the last
patch (more on this in the reply to that patch). I then did some benchmarking
with fio and dm-crypt, and at least in that testing the new code seems to
perform just as well as before. The only variable seems to be what
max_active is used for the workqueue in question.

For dm-crypt, the kcryptd workqueue uses num_online_cpus() as its max_active.
Depending on how that value is interpreted, it may not provide high enough
concurrency because some workers wait for IOs, which shows up as slightly
slower performance. That's easily fixed by bumping the max_active value so
that there's some buffer, which is the right way to configure it anyway.

It'd be great if you can share more details on the benchmarks you're
running, so that we can rule out similar issues.

Thanks.

-- 
tejun
