From: Ming Lei <ming.lei@redhat.com>
To: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, dm-devel@redhat.com,
Damien Le Moal <damien.lemoal@opensource.wdc.com>
Subject: Re: [PATCH 5/8] dm: always setup ->orig_bio in alloc_io
Date: Wed, 13 Apr 2022 20:26:39 +0800
Message-ID: <YlbBf0mJa/BPHSSq@T590>
In-Reply-To: <YlZp3+VrP930VjIQ@redhat.com>
On Wed, Apr 13, 2022 at 02:12:47AM -0400, Mike Snitzer wrote:
> On Tue, Apr 12 2022 at 9:56P -0400,
> Ming Lei <ming.lei@redhat.com> wrote:
>
> > On Tue, Apr 12, 2022 at 04:52:40PM -0400, Mike Snitzer wrote:
> > > On Tue, Apr 12 2022 at 4:56P -0400,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > >
> > > > The current DM code sets up ->orig_bio after __map_bio() returns,
> > > > which not only causes a kernel panic for dm zone, but is also a bit
> > > > ugly and tricky, especially the waiting until ->orig_bio is set in
> > > > dm_submit_bio_remap().
> > > >
> > > > The reason is that one new bio is cloned from original FS bio to
> > > > represent the mapped part, which just serves io accounting.
> > > >
> > > > Now we have switched to bdev based io accounting interface, and we
> > > > can retrieve sectors/bio_op from both the real original bio and the
> > > > added fields of .sector_offset & .sectors easily, so the new cloned
> > > > bio isn't necessary any more.
> > > >
> > > > This not only fixes dm-zone's kernel panic, but also cleans up dm io
> > > > accounting & splitting a bit.
> > >
> > > You're conflating quite a few things here. DM zone really has no
> > > business accessing io->orig_bio (dm-zone.c can just as easily inspect
> > > the tio->clone: because it hasn't been remapped yet, it reflects the
> > > io->orig_bio, so there is no need to look at io->orig_bio) -- but
> > > yes I clearly broke things during the 5.18 merge and it needs fixing
> > > ASAP.
> >
> > You can just consider the cleanup part of these patches, :-)
>
> I will. But your following list doesn't reflect any "cleanup" that I
> saw in your patchset. Pretty fundamental changes that are similar,
> but different, to the dm-5.19 changes I've staged.
>
> > 1) no late assignment of ->orig_bio, and always set it in alloc_io()
> >
> > 2) no waiting on ->orig_bio, especially since the waiting is done in
> > the fast path of dm_submit_bio_remap().
>
> For 5.18 waiting on io->orig_bio just enables a signal that the IO was
> split and can be accounted.
>
> For 5.19 I also plan on using late io->orig_bio assignment as an
> alternative to the full-blown refcounting currently done with
> io->io_count. I've yet to quantify the gains with focused testing but
> in theory this approach should scale better on large systems with many
> concurrent IO threads to the same device (RCU is primary constraint
> now).
>
> I'll try to write a bpftrace script to measure how frequently "waiting on
> io->orig_bio" occurs for dm_submit_bio_remap() heavy usage (like
> dm-crypt). But I think we'll find it is very rarely, if ever, waited
> on in the fast path.

The waiting depends on the relative speed of the CPU and the device: if
the device is quicker than the CPU, the wait should be longer. Testing in
one environment is usually not enough.

>
> > 3) no split for io accounting
>
> DM's more recent approach to splitting has never been done for benefit
> or use of IO accounting, see this commit for its origin:
> 18a25da84354c6b ("dm: ensure bio submission follows a depth-first tree walk")
>
> Not sure why you keep poking fun at DM only doing a single split when:
> that is the actual design. DM splits off orig_bio then recurses to
> handle the remainder of the bio that wasn't issued. Storing it in
> io->orig_bio (previously io->bio) was always a means of reflecting
> things properly. And yes IO accounting is one use, the other is IO
> completion. But unfortunately DM's IO accounting has always been a
> mess ever since the above commit. Changes in 5.18 fixed that.
>
> But again, DM's splitting has _nothing_ to do with IO accounting.
> Splitting only happens when needed for IO submission given constraints
> of DM target(s) or underlying layers.

What I meant is that the bio returned from bio_split() is used only for
io accounting. Yes, the comment says it can be used for io completion too,
but that is easily done without the split bio.

>
> All said, I will look closer at your entire set and see if it is better
> to go with your approach. This patch in particular is interesting
> (avoids cloning and other complexity of bio_split + bio_chain):
> https://patchwork.kernel.org/project/dm-devel/patch/20220412085616.1409626-6-ming.lei@redhat.com/

That patch shows that we can avoid the extra split, and also that the
bio returned from bio_split() is used for io accounting only.
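
To make the idea concrete, here is a minimal user-space sketch (not the
kernel code, and not the patch itself) of accounting a mapped chunk from
the untouched original bio plus an offset and a length, with no
bio_split() involved. All model_* names are hypothetical, and the meaning
given to .sector_offset and .sectors below is an assumption made for
illustration only:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical stand-in for the original FS bio; never modified here. */
struct model_bio {
	sector_t start;           /* first sector of the request */
	unsigned int nr_sectors;  /* total length in sectors */
};

/* Hypothetical stand-in for dm_io: one instance per mapped chunk. */
struct model_io {
	const struct model_bio *orig_bio; /* assigned at allocation time */
	unsigned int sector_offset;       /* assumed: chunk offset from orig_bio start */
	unsigned int sectors;             /* assumed: chunk length in sectors */
};

/* orig_bio is always set here, so nothing has to wait for a late assignment. */
static struct model_io model_alloc_io(const struct model_bio *bio,
				      unsigned int offset, unsigned int len)
{
	assert(offset + len <= bio->nr_sectors);
	return (struct model_io){
		.orig_bio = bio,
		.sector_offset = offset,
		.sectors = len,
	};
}

/* First sector to account for this chunk, derived without a cloned bio. */
static sector_t model_acct_start(const struct model_io *io)
{
	return io->orig_bio->start + io->sector_offset;
}

/* Number of sectors to account for this chunk. */
static unsigned int model_acct_sectors(const struct model_io *io)
{
	return io->sectors;
}
```

With a 64-sector bio starting at sector 1000 and mapped as 16 + 48
sectors, the two model_io instances account for the ranges starting at
1000 and 1016 respectively, while both still point at the same original
bio.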
thanks,
Ming