Re: [PATCH v5 0/2] fix inaccurate io_ticks


From: Ming Lei <ming.lei@redhat.com>
To: Weiping Zhang <zwp10758@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>, Mike Snitzer <snitzer@redhat.com>,
	mpatocka@redhat.com, linux-block@vger.kernel.org
Subject: Re: [PATCH v5 0/2] fix inaccurate io_ticks
Date: Tue, 17 Nov 2020 15:40:39 +0800
Message-ID: <20201117074039.GA74954@T590>
In-Reply-To: <CAA70yB4G_1jHYRyVsf_mhHQA-_mGXzaZ6n4Bgtq9n-x1_Yz4rg@mail.gmail.com>

On Tue, Nov 17, 2020 at 12:59:46PM +0800, Weiping Zhang wrote:
> On Tue, Nov 17, 2020 at 11:28 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Tue, Nov 17, 2020 at 11:01:49AM +0800, Weiping Zhang wrote:
> > > Hi Jens,
> > >
> > > Ping
> >
> > Hello Weiping,
> >
> > Not sure we have to fix this issue. Adding blk_mq_queue_inflight()
> > back to the IO path brings a cost which turns out to be visible, and I did
> > get a soft lockup report on Azure NVMe because of this kind of cost.
> >
> Have you tested v5? This patch is different from v1: v1 gets
> inflight for each IO, while v5 gets inflight once every jiffy.

I meant the issue can be reproduced on kernels before 5b18b5a73760 ("block:
delete part_round_stats and switch to less precise counting").

Also, do we really need to fix this issue? I understand that device
utilization becomes inaccurate at very small load, but is it really
worth adding runtime overhead to the fast path to fix it?

> 
> If for v5, can we reproduce it on null_blk ?

No, I just saw the report on Azure NVMe.

> 
> > BTW, suppose the io accounting issue needs to be fixed: just wondering
> > why not simply revert 5b18b5a73760 ("block: delete part_round_stats and
> > switch to less precise counting")? The original way had worked
> > for decades.
> >
> This patch is better than before: it breaks early when it finds
> inflight io on any cpu. In the worst case (the io is running on the last cpu),
> it iterates over all cpus.

Please see the following case:

1) one device has 256 hw queues, and the system has 256 cpu cores, and
each hw queue's depth is 1k.

2) there isn't any io load on CPUs 0 ~ 254

3) heavy io load runs on CPU 255

So with your trick the code still needs to iterate over hw queues 0 to 254
before it finds the busy one, and that cost isn't something which can be
ignored, especially since it is just for io accounting.
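
To make the case above concrete, here is a minimal userspace sketch, not the
kernel code and not the code from the v5 patch. It assumes the inflight state
can be modeled as a plain array with one counter per hw queue, scanned in
queue order. The names any_inflight_early_break() and scan_cost_worst_case()
are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_HW_QUEUES 256	/* from the case above: 256 hw queues, 256 cpu cores */

/*
 * Hypothetical stand-in for an early-break inflight check: walk the
 * per-queue counters in order and stop at the first busy queue.
 * The number of queues visited is stored in *visited.
 * Returns 1 if any io is in flight, 0 otherwise.
 */
int any_inflight_early_break(const unsigned int *inflight, size_t nr,
			     size_t *visited)
{
	size_t i;

	assert(inflight != NULL && visited != NULL);

	for (i = 0; i < nr; i++) {
		if (inflight[i]) {
			*visited = i + 1;	/* queues 0..i were touched */
			return 1;
		}
	}
	*visited = nr;
	return 0;
}

/*
 * Build the case above: queues 0 ~ 254 are idle, only the queue of
 * CPU 255 has io in flight. Returns how many queues the scan touched.
 */
size_t scan_cost_worst_case(void)
{
	unsigned int inflight[NR_HW_QUEUES] = { 0 };
	size_t visited = 0;

	inflight[NR_HW_QUEUES - 1] = 1024;	/* queue depth is 1k, fully loaded */
	any_inflight_early_break(inflight, NR_HW_QUEUES, &visited);
	return visited;
}
```

In this model the early break saves nothing when the only busy queue is the
last one: the scan touches all 256 counters before it can return.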


Thanks,
Ming


Thread overview: 9+ messages
2020-10-27  4:54 [PATCH v5 0/2] fix inaccurate io_ticks Weiping Zhang
2020-11-04  3:26 ` Weiping Zhang
2020-11-17  3:01   ` Weiping Zhang
2020-11-17  3:27     ` Ming Lei
2020-11-17  4:59       ` Weiping Zhang
2020-11-17  7:40         ` Ming Lei [this message]
2020-11-18  5:55           ` Weiping Zhang
2020-11-26 11:23             ` Weiping Zhang
2020-12-17 17:03               ` Weiping Zhang
