linux-fsdevel.vger.kernel.org archive mirror
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: "hch@infradead.org" <hch@infradead.org>,
	Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Cc: Linux FS Devel <linux-fsdevel@vger.kernel.org>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Any bio_clone_slow() implementation which doesn't share bi_io_vec?
Date: Wed, 24 Nov 2021 07:07:18 +0800
Message-ID: <133792e9-b89b-bc82-04fe-41202c3453a5@gmx.com>
In-Reply-To: <YZz6jAVXun8yC/6k@infradead.org>



On 2021/11/23 22:28, hch@infradead.org wrote:
> On Tue, Nov 23, 2021 at 11:39:11AM +0000, Johannes Thumshirn wrote:
>> I think we have to differentiate two cases here:
>> A "regular" REQ_OP_ZONE_APPEND bio and a RAID stripe REQ_OP_ZONE_APPEND
>> bio. The 1st one (i.e. the regular REQ_OP_ZONE_APPEND bio) can't be split
>> because we cannot guarantee the order the device writes the data to disk.

That's correct.

But if we want to move all bio splitting into the chunk layer, we want
an initial bio without any limitation, and then use that bio to create
the real REQ_OP_ZONE_APPEND bios with proper size limitations.

>> For the RAID stripe bio we can split it into the two (or more) parts that
>> will end up on _different_ devices. All we need to do is a) ensure it
>> doesn't cross the device's zone append limit and b) clamp all
>> bi_iter.bi_sector down to the start of the target zone, a.k.a sticking to
>> the rules of REQ_OP_ZONE_APPEND.
>
> Exactly.  A stacking driver must never split a REQ_OP_ZONE_APPEND bio.
> But the file system itself can of course split it as long as each split
> off bio has its own bi_end_io handler to record where it has been
> written to.
>

This makes me wonder: can we really forget about zones for the initial
bio, so that we just create a plain bio without any special limitation
and let every split condition be handled in the lower layer?

That includes RAID stripe boundaries, zone limitations, etc.

(Yeah, it's still not a pure stacking driver, but it's more
stacking-driver-like.)
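
To make "every split condition handled in the lower layer" concrete,
here is a rough userspace sketch, not kernel code. It only shows how the
length of each piece could be picked so that a piece never crosses a
stripe boundary and never exceeds the zone append limit. STRIPE_LEN,
ZONE_APPEND_MAX, next_split_len() and split_range() are hypothetical
names and values chosen for illustration; real limits would come from
the chunk mapping and the device.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define STRIPE_LEN      (64u * 1024u)  /* RAID stripe length in bytes */
#define ZONE_APPEND_MAX (48u * 1024u)  /* per-device zone append limit */

/*
 * Length of the next piece that starts at logical offset 'off' when
 * 'remaining' bytes of the plain, unrestricted range are still left.
 */
static uint64_t next_split_len(uint64_t off, uint64_t remaining)
{
	uint64_t to_stripe_end = STRIPE_LEN - (off % STRIPE_LEN);
	uint64_t len = remaining;

	if (len > to_stripe_end)	/* do not cross the stripe boundary */
		len = to_stripe_end;
	if (len > ZONE_APPEND_MAX)	/* respect the zone append limit */
		len = ZONE_APPEND_MAX;
	return len;
}

/* Walk the whole range and return how many pieces it is cut into. */
static unsigned int split_range(uint64_t off, uint64_t len)
{
	unsigned int pieces = 0;

	while (len > 0) {
		uint64_t piece = next_split_len(off, len);

		off += piece;
		len -= piece;
		pieces++;
	}
	return pieces;
}
```

With these assumed limits, a 160K range starting at offset 32K is cut
into pieces of 32K, 48K, 16K, 48K and 16K. The upper layer never needs
to know either limit.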

In that case, the missing piece seems to be a way to convert a split
plain bio into a REQ_OP_ZONE_APPEND bio.

Can this be done without slow bvec copying?
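
A toy userspace model of what "without bvec copying" would mean is
sketched below. It is not the block layer API: toy_vec, toy_iter,
toy_bio, toy_advance() and toy_split() are made-up stand-ins for
struct bio_vec, struct bvec_iter, struct bio and the split helpers. The
only point is that the split-off piece shares the vector array and
carries its own iterator and its own operation, so giving it a different
op needs no copy of the vectors. Whether the real block layer allows
this is exactly the question above.

```c
#include <stddef.h>

struct toy_vec  { const char *base; size_t len; };
struct toy_iter { size_t idx; size_t done; size_t size; };

enum toy_op { TOY_WRITE, TOY_ZONE_APPEND };

struct toy_bio {
	const struct toy_vec *vecs;	/* shared between parent and split */
	struct toy_iter iter;		/* private to each bio */
	enum toy_op op;			/* private to each bio */
};

/* Advance 'it' by 'bytes'; the caller guarantees bytes <= it->size. */
static void toy_advance(const struct toy_vec *vecs, struct toy_iter *it,
			size_t bytes)
{
	it->size -= bytes;
	bytes += it->done;
	while (bytes && bytes >= vecs[it->idx].len) {
		bytes -= vecs[it->idx].len;
		it->idx++;
	}
	it->done = bytes;
}

/*
 * Carve the first 'bytes' off 'parent' and give that piece its own op.
 * Only the pointer and the iterator are copied, never the vectors.
 * The caller guarantees bytes <= parent->iter.size.
 */
static struct toy_bio toy_split(struct toy_bio *parent, size_t bytes,
				enum toy_op op)
{
	struct toy_bio head = *parent;

	head.iter.size = bytes;
	head.op = op;
	toy_advance(parent->vecs, &parent->iter, bytes);
	return head;
}
```

In this model the parent stays a plain write covering the remainder,
while each split-off piece can be marked as a zone append on its own.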

Thanks,
Qu
