From: Christoph Hellwig <hch@infradead.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
Dexuan Cui <decui@microsoft.com>, Ming Lei <ming.lei@redhat.com>,
linux-scsi@vger.kernel.org, Tejun Heo <tj@kernel.org>,
Lai Jiangshan <jiangshanlai@gmail.com>
Subject: Re: [PATCH] block: reduce kblockd_mod_delayed_work_on() CPU consumption
Date: Tue, 14 Dec 2021 07:04:37 -0800
Message-ID: <YbiyhcbZmnNbed3O@infradead.org>
In-Reply-To: <bc529a3e-31d5-c266-8633-91095b346b19@kernel.dk>

On Tue, Dec 14, 2021 at 07:53:46AM -0700, Jens Axboe wrote:
> Dexuan reports that he's seeing spikes of very heavy CPU utilization when
> running 24 disks and using the 'none' scheduler. This happens off the
> flush path, because SCSI requires the queue to be restarted async, and
> hence we're hammering on mod_delayed_work_on() to ensure that the work
> item gets run appropriately.
>
> What we care about here is that the queue is run, and we don't need to
> repeatedly re-arm the timer associated with the delayed work item. If we
> check if the work item is pending upfront, then we don't really need to do
> anything else. This is safe as the work pending bit is cleared before a
> work item is started.
>
> The only potential caveat here is if we have callers with wildly different
> timeouts specified. That's generally not the case, so don't think we need
> to care for that case.

So why not do a non-delayed queue_work for that case? Might be good
to get the SCSI and workqueue maintainers involved to understand the
issue a bit better first.

>
> Reported-by: Dexuan Cui <decui@microsoft.com>
> Link: https://lore.kernel.org/linux-block/BYAPR21MB1270C598ED214C0490F47400BF719@BYAPR21MB1270.namprd21.prod.outlook.com/
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>
> ---
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 1378d084c770..4584fe709c15 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1484,7 +1484,16 @@ EXPORT_SYMBOL(kblockd_schedule_work);
>  int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
>  				unsigned long delay)
>  {
> -	return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
> +	/*
> +	 * Avoid hammering on work addition, if the work item is already
> +	 * pending. This is safe as the work pending state is cleared before
> +	 * the work item is started, so if we see it set, then we know that
> +	 * whatever was previously queued on the block side will get run by
> +	 * an existing pending work item.
> +	 */
> +	if (!work_pending(&dwork->work))
> +		return mod_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
> +	return true;
>  }
>  EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
>
> --
> Jens Axboe
>
---end quoted text---
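
For readers following the thread, the pending check in the quoted patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct toy_delayed_work` and the `toy_*` helpers are hypothetical stand-ins invented for illustration, and the model ignores CPUs, locking and real timers. It only shows that repeated calls stop re-arming the timer while the item is pending, and that clearing the pending state before the handler runs lets a later call queue the work again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a delayed work item; NOT the kernel's struct delayed_work. */
struct toy_delayed_work {
	bool pending;              /* models the work pending bit */
	unsigned long expires;     /* models the timer expiry */
	unsigned int timer_rearms; /* how many times the timer was (re)armed */
	unsigned int runs;         /* how many times the handler ran */
};

/*
 * Models the unconditional path: always (re)arms the timer.
 * Returns true if the item was already pending, false if it was idle.
 */
static bool toy_mod_delayed_work(struct toy_delayed_work *w,
				 unsigned long now, unsigned long delay)
{
	bool was_pending = w->pending;

	w->pending = true;
	w->expires = now + delay;
	w->timer_rearms++;
	return was_pending;
}

/* Models the patched helper: skip the re-arm if the item is already pending. */
static bool toy_kblockd_mod_delayed_work(struct toy_delayed_work *w,
					 unsigned long now, unsigned long delay)
{
	if (!w->pending)
		return toy_mod_delayed_work(w, now, delay);
	return true;
}

/* Models the worker: the pending state is cleared before the handler runs. */
static void toy_run_work(struct toy_delayed_work *w)
{
	assert(w->pending);
	w->pending = false;
	w->runs++;
}
```

In this model a burst of calls arms the timer once and keeps the first expiry, which is also where the caveat about callers with wildly different timeouts comes from: a later, shorter delay is ignored while the item is pending.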