From: Ming Lei <ming.lei@redhat.com>
To: Dexuan Cui <decui@microsoft.com>
Cc: Jens Axboe <axboe@kernel.dk>, 'Christoph Hellwig' <hch@lst.de>,
	"'linux-block@vger.kernel.org'" <linux-block@vger.kernel.org>,
	Long Li <longli@microsoft.com>,
	"Michael Kelley (LINUX)" <mikelley@microsoft.com>,
	"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
	Mike Snitzer <snitzer@redhat.com>,
	dm-devel@redhat.com
Subject: Re: Random high CPU utilization in blk-mq with the none scheduler
Date: Tue, 14 Dec 2021 08:53:26 +0800
Message-ID: <YbfrBpcV4hasdqQB@T590>
In-Reply-To: <BYAPR21MB12706DCD5ED9FC7AB3EE2EEABF759@BYAPR21MB1270.namprd21.prod.outlook.com>

On Tue, Dec 14, 2021 at 12:31:23AM +0000, Dexuan Cui wrote:
> > From: Ming Lei <ming.lei@redhat.com>
> > Sent: Sunday, December 12, 2021 11:38 PM
> 
> Ming, thanks so much for the detailed analysis!
> 
> > From the log:
> > 
> > 1) dm-mpath:
> > - queue depth: 2048
> > - busy: 848, and 62 of them are in sw queue, so run queue is often
> >   caused
> > - nr_hw_queues: 1
> > - dm-2 is in use, and dm-1/dm-3 is idle
> > - dm-2's dispatch busy is 8, that should be the reason why excessive CPU
> > usage is observed when flushing plug list without commit dc5fc361d891 in
> > which hctx->dispatch_busy is just bypassed
> > 
> > 2) iscsi
> > - dispatch_busy is 0
> > - nr_hw_queues: 1
> > - queue depth: 113
> > - busy=~33, active_queues is 3, so each LUN/iscsi host is saturated
> > - 23 active LUNs, 23 * 33 = 759 in-flight commands
> > 
> > The high CPU utilization may be caused by:
> > 
> > 1) big queue depth of dm mpath, the situation may be improved much if it
> > is reduced to 1024 or 800. The max allowed inflight commands from iscsi
> > hosts can be figured out, if dm's queue depth is much more than this number,
> > the extra commands need to dispatch, and run queue can be scheduled
> > immediately, so high CPU utilization is caused.
> 
> I think you're correct:
> with dm_mod.dm_mq_queue_depth=256, the max CPU utilization is 8%.
> with dm_mod.dm_mq_queue_depth=400, the max CPU utilization is 12%. 
> with dm_mod.dm_mq_queue_depth=800, the max CPU utilization is 88%.
> 
> The performance with queue_depth=800 is poor.
> The performance with queue_depth=400 is good.
> The performance with queue_depth=256 is also good, and there is only a 
> small drop compared with the 400 case.

That should be why the issue isn't triggered when a real I/O scheduler
is used.

So far blk-mq doesn't provide a way to adjust the tag queue depth
dynamically.

But I don't understand the reason for the default dm_mq_queue_depth
(2048). In this situation each LUN can queue at most 113/3 requests,
since 3 LUNs are attached to a single iscsi host.
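
A rough sketch of that arithmetic, using the numbers from the log above
and assuming each LUN gets an even share of its host's tags. The helper
names are made up for illustration, and the rounding blk-mq actually
applies when sharing tags may differ from the floor division used here:

```python
# Rough estimate only. Inputs come from the log discussed in this thread;
# the function names are hypothetical and not kernel APIs.
HOST_QUEUE_DEPTH = 113  # queue depth of one iscsi host
LUNS_PER_HOST = 3       # active_queues sharing one host's tags
ACTIVE_LUNS = 23        # LUNs doing I/O
DM_QUEUE_DEPTH = 2048   # default dm_mq_queue_depth


def per_lun_limit(host_depth: int, luns_per_host: int) -> int:
    """Even share of one host's tags per LUN (assumption: floor division)."""
    return host_depth // luns_per_host


def max_inflight(host_depth: int, luns_per_host: int, active_luns: int) -> int:
    """Upper bound on commands the iscsi side can hold in flight."""
    return per_lun_limit(host_depth, luns_per_host) * active_luns


def excess_requests(dm_depth: int, host_depth: int,
                    luns_per_host: int, active_luns: int) -> int:
    """dm requests that cannot be dispatched to the underlying paths."""
    bound = max_inflight(host_depth, luns_per_host, active_luns)
    return max(0, dm_depth - bound)


if __name__ == "__main__":
    print("per-LUN limit:", per_lun_limit(HOST_QUEUE_DEPTH, LUNS_PER_HOST))
    print("max in-flight:", max_inflight(HOST_QUEUE_DEPTH, LUNS_PER_HOST,
                                         ACTIVE_LUNS))
    print("excess at dm depth 2048:",
          excess_requests(DM_QUEUE_DEPTH, HOST_QUEUE_DEPTH,
                          LUNS_PER_HOST, ACTIVE_LUNS))
```

Under these assumptions the bound is 37 requests per LUN and 851 in
total, so a dm queue depth of 2048 leaves roughly 1197 requests that
cannot be dispatched. The log shows about 759 commands (23 * 33)
actually in flight, which is below that bound.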

Mike, can you share why the default dm_mq_queue_depth is so big? It
also doesn't seem to consider the underlying queues' depth. What is the
biggest dm rq queue depth needed to saturate all underlying paths?

> 
> > 2) single hw queue, so contention should be big, which should be avoided
> > in big machine, nvme-tcp might be better than iscsi here
> > 
> > 3) iscsi io latency is a bit big
> > 
> > Even CPU utilization is reduced by commit dc5fc361d891, io performance
> > can't be good too with v5.16-rc, I guess.
> > 
> > Thanks,
> > Ming
> 
> Actually the I/O performance of v5.16-rc4 (commit dc5fc361d891 is included)
> is good -- it's about the same as the case where v5.16-rc4 + reverting
> dc5fc361d891 + dm_mod.dm_mq_queue_depth=400 (or 256).

The single hw queue may be the root cause of your issue: there is only
a single run_work, which can be touched by almost all CPUs (~200), so
cache ping-pong could be very serious.

Jens's patch may improve it more or less; please test it.

Thanks,
Ming

