From: David Sterba <dsterba@suse.cz>
To: Naohiro Aota <Naohiro.Aota@wdc.com>
Cc: "dsterba@suse.cz" <dsterba@suse.cz>,
Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
Christoph Hellwig <hch@lst.de>,
Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 2/2] btrfs: fix and document the zoned device choice in alloc_new_bio
Date: Wed, 30 Mar 2022 17:10:16 +0200
Message-ID: <20220330151016.GG2237@twin.jikos.cz>
In-Reply-To: <20220328230426.n3aanogu7at7hnsj@naota-xeon>

On Mon, Mar 28, 2022 at 11:04:26PM +0000, Naohiro Aota wrote:
> On Mon, Mar 28, 2022 at 09:12:40PM +0200, David Sterba wrote:
> > On Fri, Mar 25, 2022 at 09:09:56AM +0000, Johannes Thumshirn wrote:
> > > On 24/03/2022 17:54, Christoph Hellwig wrote:
> > > > Zone Append bios only need a valid block device in struct bio, but
> > > > not the device in the btrfs_bio. Use the information from
> > > > btrfs_zoned_get_device to set up bi_bdev and fix zoned writes on
> > > > multi-device file system with non-homogeneous capabilities and remove
> > > > the pointless btrfs_bio.device assignment.
> > > >
> > > > Add big fat comments explaining what is going on here.
> > >
> > > Looks like the old code worked by sheer luck, as we had wbc set and thus
> > > always assigned fs_info->fs_devices->latest_dev->bdev to the bio. Which
> > > would obviously not work on a multi device FS.
> >
> > No, it worked fine because the real bio is set just before writing the
> > data somewhere deep in the io submit path in submit_stripe_bio().
> >
> > That it has to be set here is because of the cgroup implementation that
> > accesses it, see 429aebc0a9a0 ("btrfs: get bdev directly from fs_devices
> > in submit_extent_page").
> >
> > Which brings me to the question if Christoph's fix is correct because
> > the comment for the wbc + zoned append is assuming something that's not
> > true.
>
> While the real bio is setup in submit_stripe_bio(), we need to set the

Oh sorry, I actually wanted to say that the real 'bdev' is set in
submit_stripe_bio() (i.e. the one where the write is going to be done).

> device destination for bio_add_zone_append_page() called in
> btrfs_bio_add_page(). The bio_add_zone_append_page() checks that the
> bio length is not exceeding max_zone_append_sectors() of the device,
> and checks other hardware restrictions.

Yeah, but can this still mean that it's checking potentially different
devices with different hw restrictions? In alloc_new_bio() it's one and in
submit_stripe_bio() it's a different one.
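
To illustrate the concern, here is a minimal userspace sketch (not kernel
code). The model_dev and model_bio types and both helpers are hypothetical
stand-ins, and any limits used with them are made-up numbers. It only shows
that a bio sized against one device's zone append limit can be too large for
the device it is eventually sent to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a zoned block device; not a kernel type. */
struct model_dev {
	const char *name;
	unsigned int max_zone_append_sectors;
};

/* Hypothetical stand-in for a bio: the device it was built against and its size. */
struct model_bio {
	const struct model_dev *dev;
	unsigned int sectors;
};

/*
 * Models the length check done while pages are added: the bio may not grow
 * past the zone append limit of the device it currently points to.
 */
static bool model_bio_add(struct model_bio *bio, unsigned int sectors)
{
	assert(bio != NULL && bio->dev != NULL);

	if (bio->sectors + sectors > bio->dev->max_zone_append_sectors)
		return false;
	bio->sectors += sectors;
	return true;
}

/*
 * Models retargeting at submit time: returns true only if the already
 * built bio still fits the limit of the device it is finally sent to.
 */
static bool model_bio_fits(const struct model_bio *bio,
			   const struct model_dev *target)
{
	return bio->sectors <= target->max_zone_append_sectors;
}
```

With a device A limited to 1024 sectors and a device B limited to 256, a
512-sector bio built against A passes model_bio_add(), but model_bio_fits()
rejects it for B.
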

Before the cgroup writeback was added to bios, the only reason why
bio_set_dev() required the block device was to check if it's the same one
as before and drop some bit:

static inline void bio_set_dev(struct bio *bio, struct block_device *bdev)
{
	bio_clear_flag(bio, BIO_REMAPPED);
	if (bio->bi_bdev != bdev)
		bio_clear_flag(bio, BIO_THROTTLED);
	bio->bi_bdev = bdev;
	bio_associate_blkg(bio);	<-- this was not here
}

So the latest_dev was just a stub to satisfy the bio API requirements.
Please note that it has existed for a long time and things have
changed. I remember that Chris' answer to why we need the latest_dev was
"to put something to the bios". I.e. we don't really need it, because we
have to write the same data to different block devices and distribute
that in submit_stripe_bio(), while the bios have to be set up much
earlier and expect a block device.
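
As a rough userspace sketch of that split (not the btrfs code): the
sketch_bdev and sketch_bio types, both helpers and SKETCH_MAX_STRIPES are
hypothetical names made up for illustration. The bio gets a placeholder
device when it is allocated, and only the per-stripe copies made at submit
time point at the devices that are really written.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_STRIPES 4	/* arbitrary cap for this sketch */

/* Hypothetical stand-ins; these are not kernel types. */
struct sketch_bdev {
	int id;
};

struct sketch_bio {
	const struct sketch_bdev *bdev;
};

/* Allocation time: the API wants some device, so a placeholder is stored. */
static void sketch_alloc_bio(struct sketch_bio *bio,
			     const struct sketch_bdev *placeholder)
{
	assert(bio != NULL);
	bio->bdev = placeholder;
}

/*
 * Submit time: make one copy per stripe and point each copy at the device
 * that is actually written. The original bio is left untouched.
 * Returns the number of copies made.
 */
static size_t sketch_submit_stripes(const struct sketch_bio *orig,
				    const struct sketch_bdev *const stripes[],
				    size_t nr,
				    struct sketch_bio clones[])
{
	size_t i;

	assert(orig != NULL && stripes != NULL && clones != NULL);

	if (nr > SKETCH_MAX_STRIPES)
		nr = SKETCH_MAX_STRIPES;
	for (i = 0; i < nr; i++) {
		clones[i] = *orig;
		clones[i].bdev = stripes[i];
	}
	return nr;
}
```

Anything that looks at the device between the two steps, such as a limit
check or a cgroup association, sees the placeholder and not the final
target.
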
I'm not sure we have a 1:1 match in what the APIs provide and expect and
what btrfs wants to do. At this point multi-device support for zoned
mode is not complete so we probably won't observe any problems with
hardware with different restrictions.