public inbox for linux-btrfs@vger.kernel.org
From: David Sterba <dsterba@suse.cz>
To: Naohiro Aota <Naohiro.Aota@wdc.com>
Cc: "dsterba@suse.cz" <dsterba@suse.cz>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>,
	Christoph Hellwig <hch@lst.de>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 2/2] btrfs: fix and document the zoned device choice in alloc_new_bio
Date: Wed, 30 Mar 2022 17:10:16 +0200
Message-ID: <20220330151016.GG2237@twin.jikos.cz>
In-Reply-To: <20220328230426.n3aanogu7at7hnsj@naota-xeon>

On Mon, Mar 28, 2022 at 11:04:26PM +0000, Naohiro Aota wrote:
> On Mon, Mar 28, 2022 at 09:12:40PM +0200, David Sterba wrote:
> > On Fri, Mar 25, 2022 at 09:09:56AM +0000, Johannes Thumshirn wrote:
> > > On 24/03/2022 17:54, Christoph Hellwig wrote:
> > > > Zone Append bios only need a valid block device in struct bio, but
> > > > not the device in the btrfs_bio.  Use the information from
> > > > btrfs_zoned_get_device to set up bi_bdev and fix zoned writes on
> > > > multi-device file system with non-homogeneous capabilities and remove
> > > > the pointless btrfs_bio.device assignment.
> > > > 
> > > > Add big fat comments explaining what is going on here.
> > > 
> > > Looks like the old code worked by sheer luck, as we had wbc set and thus
> > > always assigned fs_info->fs_devices->latest_dev->bdev to the bio. Which 
> > > would obviously not work on a multi device FS.
> > 
> > No, it worked fine because the real bio is set just before writing the
> > data somewhere deep in the io submit path in submit_stripe_bio().
> > 
> > That it has to be set here is because of the cgroup implementation that
> > accesses it, see 429aebc0a9a0 ("btrfs: get bdev directly from fs_devices
> > in submit_extent_page").
> > 
> > Which brings me to the question if Christoph's fix is correct because
> > the comment for the wbc + zoned append is assuming something that's not
> > true.
> 
> While the real bio is setup in submit_stripe_bio(), we need to set the

Oh sorry, I actually wanted to say that the real 'bdev' is set in
submit_stripe_bio() (i.e. the one where the write is going to be done).

> device destination for bio_add_zone_append_page() called in
> btrfs_bio_add_page(). The bio_add_zone_append_page() checks that the
> bio length is not exceeding max_zone_append_sectors() of the device,
> and checks other hardware restrictions.

Yeah, but can this still mean that it's checking potentially different
devices with different hw restrictions? In alloc_new_bio() it's one and in
submit_stripe_bio() it's a different one.
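
To make the concern concrete, here is a minimal userspace sketch, not
kernel code. All names (model_bdev, model_bio, model_add_zone_append,
model_retarget) are hypothetical and only model the idea that the length
check is done against whatever device is attached when pages are added,
while the write may end up on a device with a smaller limit.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a device's zone append limit, in sectors. */
struct model_bdev {
	const char *name;
	unsigned int max_zone_append_sectors;
};

/* Hypothetical stand-in for a bio: attached device and current size. */
struct model_bio {
	const struct model_bdev *bdev;
	unsigned int sectors;
};

/*
 * Models the length check only: refuse to grow the bio past the limit
 * of the device that is attached at the time pages are added.
 */
static bool model_add_zone_append(struct model_bio *bio, unsigned int sectors)
{
	if (bio->sectors + sectors > bio->bdev->max_zone_append_sectors)
		return false;
	bio->sectors += sectors;
	return true;
}

/*
 * Models the late device switch: point the bio at the device that will
 * really be written and report whether the already built bio still
 * fits that device's limit.
 */
static bool model_retarget(struct model_bio *bio, const struct model_bdev *dev)
{
	bio->bdev = dev;
	return bio->sectors <= dev->max_zone_append_sectors;
}
```

With a limit of 256 sectors on the first device and 128 on the second, a
200-sector bio passes the check at build time and model_retarget() then
reports that it no longer fits. Whether the real code can hit this
depends on details the sketch does not model.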

Before the cgroup writeback was added to bios, the only reason why
bio_set_dev() required the block device was to check if it's the same one
as before and drop some bit:

static inline void bio_set_dev(struct bio *bio, struct block_device *bdev)
{
	bio_clear_flag(bio, BIO_REMAPPED);
	if (bio->bi_bdev != bdev)
		bio_clear_flag(bio, BIO_THROTTLED);
	bio->bi_bdev = bdev;
	bio_associate_blkg(bio);	/* <-- this was not here */
}

So the latest_dev was just a stub to satisfy the bio API requirements.
Please note that it has existed for a long time and things have
changed since. I remember that Chris' answer to why we need the
latest_dev was "to put something to the bios". I.e. we don't really need
it: we write the same data to different block devices and distribute
that in submit_stripe_bio(), while the bios have to be set up much
earlier and expect a block device.
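
As a rough illustration of that split, again a userspace sketch with
hypothetical names (stub_bdev, stub_bio, stub_alloc_bio,
stub_submit_stripes) and not the btrfs implementation: the bio gets a
placeholder device when it is allocated, and only the per-stripe copies
made at submit time point at the devices that are really written.

```c
#include <stddef.h>

/* Hypothetical stand-in for a block device. */
struct stub_bdev {
	int id;
};

/* Hypothetical stand-in for a bio; only the device pointer is modeled. */
struct stub_bio {
	const struct stub_bdev *bdev;
};

/* Allocation time: attach a placeholder device to satisfy the API. */
static void stub_alloc_bio(struct stub_bio *bio,
			   const struct stub_bdev *latest_dev)
{
	bio->bdev = latest_dev;
}

/*
 * Submit time: make one copy per stripe and point each copy at its
 * real target device. Returns the number of copies written to out[].
 */
static size_t stub_submit_stripes(const struct stub_bio *orig,
				  const struct stub_bdev **stripe_devs,
				  size_t nr_stripes, struct stub_bio *out)
{
	for (size_t i = 0; i < nr_stripes; i++) {
		out[i] = *orig;
		out[i].bdev = stripe_devs[i];
	}
	return nr_stripes;
}
```

In this model nothing done between allocation and submit can depend on
the real targets, because they are not known to the bio until
stub_submit_stripes() runs.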

I'm not sure we have a 1:1 match in what the APIs provide and expect and
what btrfs wants to do. At this point multi-device support for zoned
mode is not complete so we probably won't observe any problems with
hardware with different restrictions.

