linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, dm-devel@redhat.com,
	Damien Le Moal <damien.lemoal@opensource.wdc.com>
Subject: Re: [PATCH 5/8] dm: always setup ->orig_bio in alloc_io
Date: Sun, 17 Apr 2022 10:22:28 +0800	[thread overview]
Message-ID: <Ylt55PHHu6XShdfA@T590> (raw)
In-Reply-To: <Ylneb0NWsCab0HqI@redhat.com>

On Fri, Apr 15, 2022 at 05:06:55PM -0400, Mike Snitzer wrote:
> On Thu, Apr 14 2022 at  8:14P -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Thu, Apr 14, 2022 at 01:45:33PM -0400, Mike Snitzer wrote:
> > > On Wed, Apr 13 2022 at 11:57P -0400,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > > 
> > > > On Wed, Apr 13, 2022 at 10:25:45PM -0400, Mike Snitzer wrote:
> > > > > On Wed, Apr 13 2022 at  8:36P -0400,
> > > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > > 
> > > > > > On Wed, Apr 13, 2022 at 01:58:54PM -0400, Mike Snitzer wrote:
> > > > > > > 
> > > > > > > The bigger issue with this patch is that you've caused
> > > > > > > dm_submit_bio_remap() to go back to accounting the entire original bio
> > > > > > > before any split occurs.  That is a problem because you'll end up
> > > > > > > accounting that bio for every split, so in split heavy workloads the
> > > > > > > IO accounting won't reflect when the IO is actually issued and we'll
> > > > > > > regress back to having very inaccurate and incorrect IO accounting for
> > > > > > > dm_submit_bio_remap() heavy targets (e.g. dm-crypt).
> > > > > > 
> > > > > > Good catch, but we know the length of mapped part in original bio before
> > > > > > calling __map_bio(), so io->sectors/io->offset_sector can be setup here,
> > > > > > something like the following delta change should address it:
> > > > > > 
> > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > > index db23efd6bbf6..06b554f3104b 100644
> > > > > > --- a/drivers/md/dm.c
> > > > > > +++ b/drivers/md/dm.c
> > > > > > @@ -1558,6 +1558,13 @@ static int __split_and_process_bio(struct clone_info *ci)
> > > > > >  
> > > > > >  	len = min_t(sector_t, max_io_len(ti, ci->sector), ci->sector_count);
> > > > > >  	clone = alloc_tio(ci, ti, 0, &len, GFP_NOIO);
> > > > > > +
> > > > > > +	if (ci->sector_count > len) {
> > > > > > +		/* setup the mapped part for accounting */
> > > > > > +		dm_io_set_flag(ci->io, DM_IO_SPLITTED);
> > > > > > +		ci->io->sectors = len;
> > > > > > +		ci->io->sector_offset = bio_end_sector(ci->bio) - ci->sector;
> > > > > > +	}
> > > > > >  	__map_bio(clone);
> > > > > >  
> > > > > >  	ci->sector += len;
> > > > > > @@ -1603,11 +1610,6 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> > > > > >  	if (error || !ci.sector_count)
> > > > > >  		goto out;
> > > > > >  
> > > > > > -	/* setup the mapped part for accounting */
> > > > > > -	dm_io_set_flag(ci.io, DM_IO_SPLITTED);
> > > > > > -	ci.io->sectors = bio_sectors(bio) - ci.sector_count;
> > > > > > -	ci.io->sector_offset = bio_end_sector(bio) - bio->bi_iter.bi_sector;
> > > > > > -
> > > > > >  	bio_trim(bio, ci.io->sectors, ci.sector_count);
> > > > > >  	trace_block_split(bio, bio->bi_iter.bi_sector);
> > > > > >  	bio_inc_remaining(bio);
> > > > > > 
> > > > > > -- 
> > > > > > Ming
> > > > > > 
> > > > > 
> > > > > Unfortunately we do need splitting after __map_bio() because a dm
> > > > > target's ->map can use dm_accept_partial_bio() to further reduce a
> > > > > bio's mapped part.
> > > > > 
> > > > > But I think dm_accept_partial_bio() could be trained to update
> > > > > tio->io->sectors?
> > > > 
> > > > ->orig_bio is just for serving io accounting, but ->orig_bio isn't
> > > > passed to dm_accept_partial_bio(), and not gets updated after
> > > > dm_accept_partial_bio() is called.
> > > > 
> > > > If that is one issue, it must be one existed issue in dm io accounting
> > > > since ->orig_bio isn't updated when dm_accept_partial_bio() is called.
> > > 
> > > Recall that ->orig_bio is updated after the bio_split() at the bottom of
> > > dm_split_and_process_bio().
> > > 
> > > That bio_split() is based on ci->sector_count, which is reduced as a
> > > side-effect of dm_accept_partial_bio() reducing tio->len_ptr.  It is
> > > pretty circuitous so I can absolutely understand why you didn't
> > > immediately appreciate the interface.  The block comment above
> > > dm_accept_partial_bio() does a pretty comprehensive job of explaining.
> > 
> > Go it now, thanks for the explanation.
> > 
> > As you mentioned, it can be addressed in dm_accept_partial_bio()
> > by updating ti->io->sectors.
> 
> Yes, I rebased your patchset ontop of dm-5.19 and fixed up your
> splitting like we discussed.  I'll be rebasing ontop of v5.18-rc3 once
> it is released but please have a look at this 'dm-5.19-v2' branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-5.19-v2
> 
> And this commit in particular:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.19-v2&id=fe5a99da8b0d0518342f5cdb522a06b0f123ca09
> 
> Once I've verified with you that it looks OK I'll fold it into your
> commit (at the same time I rebase on v5.18-rc3 early next week).

Hi Mike,

Your delta change looks good, thanks for fixing it!

Thanks,
Ming


  reply	other threads:[~2022-04-17  2:22 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-12  8:56 [PATCH 0/8] dm: io accounting & polling improvement Ming Lei
2022-04-12  8:56 ` [PATCH 1/8] block: replace disk based account with bdev's Ming Lei
2022-04-16  5:57   ` Christoph Hellwig
2022-04-12  8:56 ` [PATCH 2/8] dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct Ming Lei
2022-04-12  8:56 ` [PATCH 3/8] dm: pass 'dm_io' instance to dm_io_acct directly Ming Lei
2022-04-12 20:28   ` Mike Snitzer
2022-04-13  1:43     ` Ming Lei
2022-04-12  8:56 ` [PATCH 4/8] dm: switch to bdev based io accounting interface Ming Lei
2022-04-12  8:56 ` [PATCH 5/8] dm: always setup ->orig_bio in alloc_io Ming Lei
2022-04-12 20:52   ` Mike Snitzer
2022-04-12 22:38     ` Damien Le Moal
2022-04-12 23:00       ` Mike Snitzer
2022-04-12 23:31         ` Damien Le Moal
2022-04-13  0:00         ` Damien Le Moal
2022-04-13  1:56     ` Ming Lei
2022-04-13  6:12       ` Mike Snitzer
2022-04-13 12:26         ` Ming Lei
2022-04-13 17:58           ` Mike Snitzer
2022-04-14  0:36             ` Ming Lei
2022-04-14  2:25               ` Mike Snitzer
2022-04-14  3:57                 ` Ming Lei
2022-04-14 17:45                   ` Mike Snitzer
2022-04-15  0:14                     ` Ming Lei
2022-04-15 21:06                       ` Mike Snitzer
2022-04-17  2:22                         ` Ming Lei [this message]
2022-04-12  8:56 ` [PATCH 6/8] dm: don't grab target io reference in dm_zone_map_bio Ming Lei
2022-04-12  8:56 ` [PATCH 7/8] dm: improve target io referencing Ming Lei
2022-04-12  8:56 ` [PATCH 8/8] dm: put all polled io into one single list Ming Lei
2022-04-12 17:15 ` [PATCH 0/8] dm: io accounting & polling improvement Mike Snitzer
2022-04-16  5:58   ` Christoph Hellwig
2022-04-17  1:23     ` Mike Snitzer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Ylt55PHHu6XShdfA@T590 \
    --to=ming.lei@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=damien.lemoal@opensource.wdc.com \
    --cc=dm-devel@redhat.com \
    --cc=linux-block@vger.kernel.org \
    --cc=snitzer@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).