From: Ming Lei <tom.leiming@gmail.com>
To: Wen Xiong <wenxiong@linux.ibm.com>
Cc: linux-block@vger.kernel.org, axboe@kernel.dk, jmoyer@redhat.com,
	Gjoyce <gjoyce@linux.ibm.com>,
	wenxiong@us.ibm.com
Subject: Re: Observing higher CPU utilization during random IO fio testing
Date: Fri, 29 May 2026 20:10:55 -0500
Message-ID: <aho5HxLsWMEpbUg2@fedora>
In-Reply-To: <338169f719c77e4afe58f42e9760349e@linux.ibm.com>

On Thu, May 21, 2026 at 02:44:22PM -0500, Wen Xiong wrote:
> Hi All,
> 
> Our performance team observed higher CPU utilization in RHEL10 compared
> to RHEL9.8, and observed a similar issue in the upstream kernel (v7.1-rc4)
> as well when running FIO random IO tests.
> 
> System configuration:
> 47 dedicated cores
> 120 GB memory
> PCIe4 2-Port 64Gb FC Adapter
> FlashSystem: FS9500, 12 LUNs/FC port, 100G each LUN.
> 
> Random IO tests are more CPU intensive than sequential IO tests due to
> several factors: more context switching, interrupt handling, cache
> inefficiency, etc. We found that the following patch causes the higher
> CPU utilization in RHEL10 and newer Linux kernels:
> 
> commit 060406c61c7cb4bbd82a02d179decca9c9bb3443 (HEAD)
> Author: Yu Kuai <yukuai3@huawei.com>
> Date:   Thu May 9 20:38:25 2024 +0800
> 
> block: add plug while submitting IO
> 
> So that if caller didn't use plug, for example, __blkdev_direct_IO_simple()
> and __blkdev_direct_IO_async(), block layer can still benefit from caching
> nsec time in the plug.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> Link: https://lore.kernel.org/r/20240509123825.3225207-1-yukuai1@huaweicloud.com
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> 
> We reverted the above patch in the RHEL10 kernel and in upstream 7.1-rc4,
> and saw lower CPU utilization when running the same FIO test.
> 
> The patch adds plugging in __submit_bio() in the block layer, which may
> cause performance degradation:
> - Random IO tests have less merging and still pay the plug flush overhead.
> - More IO scheduler interaction: requests are forced through the scheduler
> instead of direct dispatch to the hardware queue.
> - Poor cache locality during the plug operation.

Yes, a regression is expected on QD=1 workloads.

Adding an inner plug only to cache the timestamp is not a good fit for what
plugging is meant to do, because only the outer code path (io_uring, libaio, ...)
knows the exact IO batch size and can decide whether a plug should be used.
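
To illustrate the batch-size point, here is a small user-space toy model. It
is not kernel code: every toy_* name is made up for this sketch, and the
"cost" of a plug is reduced to a simple counter. It only shows that a
submitter which knows its batch size can skip the plug for a single request,
while an unconditional inner plug pays the setup every time.

```c
/* User-space toy model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stddef.h>

#define TOY_PLUG_MAX 16

struct toy_plug {
	int pending[TOY_PLUG_MAX];	/* requests held back by the plug */
	size_t nr;
};

struct toy_stats {
	unsigned int dispatches;	/* trips down to the pretend driver */
	unsigned int plug_setups;	/* start/finish pairs paid for */
};

/* Hand n requests to the pretend driver in one go. */
static void toy_dispatch(struct toy_stats *st, size_t n)
{
	if (n)
		st->dispatches++;
}

static void toy_plug_start(struct toy_plug *p, struct toy_stats *st)
{
	p->nr = 0;
	st->plug_setups++;
}

static void toy_plug_finish(struct toy_plug *p, struct toy_stats *st)
{
	toy_dispatch(st, p->nr);
	p->nr = 0;
}

/* With no plug the request goes straight down; otherwise it is queued. */
static void toy_submit(struct toy_plug *p, struct toy_stats *st, int req)
{
	if (!p) {
		toy_dispatch(st, 1);
		return;
	}
	if (p->nr == TOY_PLUG_MAX) {	/* plug full: flush early */
		toy_dispatch(st, p->nr);
		p->nr = 0;
	}
	assert(p->nr < TOY_PLUG_MAX);
	p->pending[p->nr++] = req;
}

/* Outer submitter: it knows n, so it plugs only when batching can help. */
static void toy_submit_batch(struct toy_stats *st, const int *reqs, size_t n)
{
	struct toy_plug plug;
	int use_plug = n > 1;
	size_t i;

	if (use_plug)
		toy_plug_start(&plug, st);
	for (i = 0; i < n; i++)
		toy_submit(use_plug ? &plug : NULL, st, reqs[i]);
	if (use_plug)
		toy_plug_finish(&plug, st);
}

/* Inner plug: wraps every single submission, batch size unknown. */
static void toy_submit_inner_plug(struct toy_stats *st, int req)
{
	struct toy_plug plug;

	toy_plug_start(&plug, st);
	toy_submit(&plug, st, req);
	toy_plug_finish(&plug, st);
}
```

In this model a batch of 8 through toy_submit_batch() costs one plug setup and
one dispatch, a batch of 1 costs one dispatch and no plug setup, and
toy_submit_inner_plug() pays one plug setup for the same single dispatch.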

Given that 060406c61c7c ("block: add plug while submitting IO") came with no
performance data, maybe it can be reverted.

I wonder why we don't move the timestamp cache into 'task_struct', where more users could benefit from it?
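
A rough user-space sketch of what a per-task timestamp cache could look like
follows. Everything in it is an assumption for illustration: struct toy_task
stands in for 'task_struct', toy_clock_read() stands in for a real clock read,
and invalidating on schedule-out is just one plausible policy, not a statement
about how the kernel does or should do it.

```c
/* User-space toy model only; none of these names are kernel APIs. */
#include <assert.h>
#include <stdint.h>

struct toy_task {
	uint64_t cached_ns;		/* 0 means "nothing cached" */
	unsigned int clock_reads;	/* how often the clock was really read */
};

static uint64_t toy_fake_clock = 1000;

/* Stand-in for an expensive clock read; always returns a non-zero value. */
static uint64_t toy_clock_read(struct toy_task *t)
{
	t->clock_reads++;
	toy_fake_clock += 100;
	return toy_fake_clock;
}

/* Return the cached time if there is one, else read the clock and cache it. */
static uint64_t toy_time_get_ns(struct toy_task *t)
{
	if (!t->cached_ns) {
		t->cached_ns = toy_clock_read(t);
		assert(t->cached_ns != 0);
	}
	return t->cached_ns;
}

/* Assumed policy: drop the cached value when the task is scheduled out. */
static void toy_task_sched_out(struct toy_task *t)
{
	t->cached_ns = 0;
}
```

Because the cache sits in the task rather than in a plug, any caller running
in that task gets the cached value without having to start a plug first; two
back-to-back toy_time_get_ns() calls read the clock once, and a call after
toy_task_sched_out() reads it again.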


Thanks,
Ming
