All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Snitzer <snitzer@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com,
	Vijayendra Suman <vijayendra.suman@oracle.com>
Subject: Re: Revert "dm: always call blk_queue_split() in dm_process_bio()"
Date: Fri, 11 Sep 2020 12:13:45 -0400	[thread overview]
Message-ID: <20200911161344.GA28614@redhat.com> (raw)
In-Reply-To: <20200911122038.GA167338@T590>

On Fri, Sep 11 2020 at  8:20am -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> On Thu, Sep 10, 2020 at 10:24:39AM -0400, Mike Snitzer wrote:
> > [cc'ing dm-devel and linux-block because this is upstream concern too]
> > 
> > On Wed, Sep 09 2020 at  1:00pm -0400,
> > Vijayendra Suman <vijayendra.suman@oracle.com> wrote:
> > 
> > >    Hello Mike,
> > > 
> > >    While Running pgbench tool with  5.4.17 kernel build
> > > 
> > >    Following performance degrade is found out
> > > 
> > >    buffer read/write metric : -17.2%
> > >    cache read/write metric : -18.7%
> > >    disk read/write metric : -19%
> > > 
> > >    buffer
> > >    number of transactions actually processed: 840972
> > >    latency average = 24.013 ms
> > >    tps = 4664.153934 (including connections establishing)
> > >    tps = 4664.421492 (excluding connections establishing)
> > > 
> > >    cache
> > >    number of transactions actually processed: 551345
> > >    latency average = 36.949 ms
> > >    tps = 3031.223905 (including connections establishing)
> > >    tps = 3031.402581 (excluding connections establishing)
> > > 
> > >    After revert of Commit
> > >    2892100bc85ae446088cebe0c00ba9b194c0ac9d ( Revert "dm: always call
> > >    blk_queue_split() in dm_process_bio()")
> > 
> > I assume 2892100bc85ae446088cebe0c00ba9b194c0ac9d is 5.4-stable's
> > backport of upstream commit 120c9257f5f19e5d1e87efcbb5531b7cd81b7d74 ?
> > 
> > >    Performance is Counter measurement
> > > 
> > >    buffer ->
> > >    number of transactions actually processed: 1135735
> > >    latency average = 17.799 ms
> > >    tps = 6292.586749 (including connections establishing)
> > >    tps = 6292.875089 (excluding connections establishing)
> > > 
> > >    cache ->
> > >    number of transactions actually processed: 648177
> > >    latency average = 31.217 ms
> > >    tps = 3587.755975 (including connections establishing)
> > >    tps = 3587.966359 (excluding connections establishing)
> > > 
> > >    Following is your commit
> > > 
> > >    diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > >    index cf71a2277d60..1e6e0c970e19 100644
> > >    --- a/drivers/md/dm.c
> > >    +++ b/drivers/md/dm.c
> > >    @@ -1760,8 +1760,9 @@ static blk_qc_t dm_process_bio(struct mapped_device
> > >    *md,
> > >             * won't be imposed.
> > >             */
> > >            if (current->bio_list) {
> > >    -               blk_queue_split(md->queue, &bio);
> > >    -               if (!is_abnormal_io(bio))
> > >    +               if (is_abnormal_io(bio))
> > >    +                       blk_queue_split(md->queue, &bio);
> > >    +               else
> > >                            dm_queue_split(md, ti, &bio);
> > >            }
> > > 
> > >    Could you have a look if it is safe to revert this commit.
> > 
> > No, it really isn't a good idea given what was documented in the commit
> > header for commit 120c9257f5f19e5d1e87efcbb5531b7cd81b7d74 -- the
> > excessive splitting is not conducive to performance either.
> > 
> > So I think we need to identify _why_ reverting this commit is causing
> > such a performance improvement.  Why is calling blk_queue_split() before
> > dm_queue_split() benefiting your pgbench workload?
> 
> blk_queue_split() takes every queue's limit into account, and dm_queue_split()
> only splits bio according to max len(offset, chunk size), so the
> splitted bio may not be optimal one from device viewpoint.

Yes, but the issue is blk_queue_split() is doing the wrong thing for the
case described in the header for commit
120c9257f5f19e5d1e87efcbb5531b7cd81b7d74
 
> Maybe DM can switch to blk_queue_split() if 'chunk_sectors' limit is power-2
> aligned.

Not seeing relation to chunk_sectors being power of 2 -- other than that
is all block core supports.  But chunk_sectors isn't set for DM, you
added chunk_sectors for MD or something and DM was caught out, so
blk_queue_split() falls back to splitting on max_sectors.

You're saying DM should set 'chunk_sectors' IFF it'd be a power of 2?
While I could do that, it seems like just continuing a sequence of
hacks around earlier imposed chunk_sectors infrastructure that was a
half-measure to begin with.  Think chunk_sectors logic in block core
needs to be enhanced -- but I'll take a closer look.

Mike

WARNING: multiple messages have this Message-ID (diff)
From: Mike Snitzer <snitzer@redhat.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Vijayendra Suman <vijayendra.suman@oracle.com>,
	dm-devel@redhat.com, linux-block@vger.kernel.org
Subject: Re: Revert "dm: always call blk_queue_split() in dm_process_bio()"
Date: Fri, 11 Sep 2020 12:13:45 -0400	[thread overview]
Message-ID: <20200911161344.GA28614@redhat.com> (raw)
In-Reply-To: <20200911122038.GA167338@T590>

On Fri, Sep 11 2020 at  8:20am -0400,
Ming Lei <ming.lei@redhat.com> wrote:

> On Thu, Sep 10, 2020 at 10:24:39AM -0400, Mike Snitzer wrote:
> > [cc'ing dm-devel and linux-block because this is upstream concern too]
> > 
> > On Wed, Sep 09 2020 at  1:00pm -0400,
> > Vijayendra Suman <vijayendra.suman@oracle.com> wrote:
> > 
> > >    Hello Mike,
> > > 
> > >    While Running pgbench tool with  5.4.17 kernel build
> > > 
> > >    Following performance degrade is found out
> > > 
> > >    buffer read/write metric : -17.2%
> > >    cache read/write metric : -18.7%
> > >    disk read/write metric : -19%
> > > 
> > >    buffer
> > >    number of transactions actually processed: 840972
> > >    latency average = 24.013 ms
> > >    tps = 4664.153934 (including connections establishing)
> > >    tps = 4664.421492 (excluding connections establishing)
> > > 
> > >    cache
> > >    number of transactions actually processed: 551345
> > >    latency average = 36.949 ms
> > >    tps = 3031.223905 (including connections establishing)
> > >    tps = 3031.402581 (excluding connections establishing)
> > > 
> > >    After revert of Commit
> > >    2892100bc85ae446088cebe0c00ba9b194c0ac9d ( Revert "dm: always call
> > >    blk_queue_split() in dm_process_bio()")
> > 
> > I assume 2892100bc85ae446088cebe0c00ba9b194c0ac9d is 5.4-stable's
> > backport of upstream commit 120c9257f5f19e5d1e87efcbb5531b7cd81b7d74 ?
> > 
> > >    Performance is Counter measurement
> > > 
> > >    buffer ->
> > >    number of transactions actually processed: 1135735
> > >    latency average = 17.799 ms
> > >    tps = 6292.586749 (including connections establishing)
> > >    tps = 6292.875089 (excluding connections establishing)
> > > 
> > >    cache ->
> > >    number of transactions actually processed: 648177
> > >    latency average = 31.217 ms
> > >    tps = 3587.755975 (including connections establishing)
> > >    tps = 3587.966359 (excluding connections establishing)
> > > 
> > >    Following is your commit
> > > 
> > >    diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > >    index cf71a2277d60..1e6e0c970e19 100644
> > >    --- a/drivers/md/dm.c
> > >    +++ b/drivers/md/dm.c
> > >    @@ -1760,8 +1760,9 @@ static blk_qc_t dm_process_bio(struct mapped_device
> > >    *md,
> > >             * won't be imposed.
> > >             */
> > >            if (current->bio_list) {
> > >    -               blk_queue_split(md->queue, &bio);
> > >    -               if (!is_abnormal_io(bio))
> > >    +               if (is_abnormal_io(bio))
> > >    +                       blk_queue_split(md->queue, &bio);
> > >    +               else
> > >                            dm_queue_split(md, ti, &bio);
> > >            }
> > > 
> > >    Could you have a look if it is safe to revert this commit.
> > 
> > No, it really isn't a good idea given what was documented in the commit
> > header for commit 120c9257f5f19e5d1e87efcbb5531b7cd81b7d74 -- the
> > excessive splitting is not conducive to performance either.
> > 
> > So I think we need to identify _why_ reverting this commit is causing
> > such a performance improvement.  Why is calling blk_queue_split() before
> > dm_queue_split() benefiting your pgbench workload?
> 
> blk_queue_split() takes every queue's limit into account, and dm_queue_split()
> only splits bio according to max len(offset, chunk size), so the
> splitted bio may not be optimal one from device viewpoint.

Yes, but the issue is blk_queue_split() is doing the wrong thing for the
case described in the header for commit
120c9257f5f19e5d1e87efcbb5531b7cd81b7d74
 
> Maybe DM can switch to blk_queue_split() if 'chunk_sectors' limit is power-2
> aligned.

Not seeing relation to chunk_sectors being power of 2 -- other than that
is all block core supports.  But chunk_sectors isn't set for DM, you
added chunk_sectors for MD or something and DM was caught out, so
blk_queue_split() falls back to splitting on max_sectors.

You're saying DM should set 'chunk_sectors' IFF it'd be a power of 2?
While I could do that, it seems like just continuing a sequence of
hacks around earlier imposed chunk_sectors infrastructure that was a
half-measure to begin with.  Think chunk_sectors logic in block core
needs to be enhanced -- but I'll take a closer look.

Mike


  reply	other threads:[~2020-09-11 16:13 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <529c2394-1b58-b9d8-d462-1f3de1b78ac8@oracle.com>
2020-09-10 14:24 ` Revert "dm: always call blk_queue_split() in dm_process_bio()" Mike Snitzer
2020-09-10 14:24   ` Mike Snitzer
2020-09-10 19:29   ` Vijayendra Suman
2020-09-15  1:33     ` Mike Snitzer
2020-09-15 17:03       ` Mike Snitzer
2020-09-16 14:56       ` Vijayendra Suman
2020-09-11 12:20   ` Ming Lei
2020-09-11 16:13     ` Mike Snitzer [this message]
2020-09-11 16:13       ` Mike Snitzer
2020-09-11 21:53       ` [PATCH 0/3] block: a few chunk_sectors fixes/improvements Mike Snitzer
2020-09-11 21:53         ` [PATCH 1/3] block: fix blk_rq_get_max_sectors() to flow more carefully Mike Snitzer
2020-09-12 13:52           ` Ming Lei
2020-09-14  0:43             ` Damien Le Moal
2020-09-14 14:52               ` Mike Snitzer
2020-09-14 23:28                 ` Damien Le Moal
2020-09-15  2:03               ` Ming Lei
2020-09-15  2:15                 ` Damien Le Moal
2020-09-14 14:49             ` Mike Snitzer
2020-09-14 14:49               ` Mike Snitzer
2020-09-15  1:50               ` Ming Lei
2020-09-14  0:46           ` Damien Le Moal
2020-09-14 15:03             ` Mike Snitzer
2020-09-14 15:03               ` Mike Snitzer
2020-09-15  1:09               ` Damien Le Moal
2020-09-15  4:21                 ` Damien Le Moal
2020-09-15  8:01                   ` Ming Lei
2020-09-15  8:01                     ` Ming Lei
2020-09-11 21:53         ` [PATCH 2/3] block: use lcm_not_zero() when stacking chunk_sectors Mike Snitzer
2020-09-12 13:58           ` Ming Lei
2020-09-12 13:58             ` Ming Lei
2020-09-11 21:53         ` [PATCH 3/3] block: allow 'chunk_sectors' to be non-power-of-2 Mike Snitzer
2020-09-12 14:06           ` Ming Lei
2020-09-12 14:06             ` Ming Lei
2020-09-14  2:43             ` Keith Busch
2020-09-14  0:55           ` Damien Le Moal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200911161344.GA28614@redhat.com \
    --to=snitzer@redhat.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=vijayendra.suman@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.