Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: dsterba@suse.cz, Qu Wenruo <wqu@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 0/4] btrfs: cleanups and preparation for the incoming RAID56J features
Date: Fri, 13 May 2022 20:21:40 +0800	[thread overview]
Message-ID: <f119b7c7-aa86-b8f9-b5eb-79a16cdd58a4@gmx.com> (raw)
In-Reply-To: <20220513113826.GV18596@twin.jikos.cz>



On 2022/5/13 19:38, David Sterba wrote:
> On Fri, May 13, 2022 at 04:34:27PM +0800, Qu Wenruo wrote:
>> Since I'm going to introduce two new chunk profiles, RAID5J and RAID6J
>> (J for journal),
>
> Do you need to introduce completely new profiles? IIRC in my drafts I
> was more inclined to reuse a dedicated raid1c3 block group, of an
> arbitrary size and not used by anything else. A new profile would
> technically work too but it brings other issues.

The point here is, I want to reserve some space for each data stripe of
a RAID56 chunk.

Unfortunately, we need some on-disk format change anyway, for reusing
btrfs_chunk::io_align to represent how many bytes are reserved for each
dev extent, before where the data really starts.

So new RAID56 types are needed anyway.

Sure, this will make dev_extent mismatch from pure calculated extent
length from a chunk map.

But all the involved change are not that huge, still reviewable.

 From my latest version, already under fstests (with new RAID56J
profiles added to related test cases), but without real journal code
yet, the patch looks like this:

  fs/btrfs/block-group.c          |  23 ++++++-
  fs/btrfs/ctree.h                |   7 +-
  fs/btrfs/scrub.c                |  15 ++--
  fs/btrfs/tree-checker.c         |   4 ++
  fs/btrfs/volumes.c              | 118 +++++++++++++++++++++++++++++---
  fs/btrfs/volumes.h              |   7 +-
  fs/btrfs/zoned.c                |   2 +
  include/uapi/linux/btrfs.h      |   1 +
  include/uapi/linux/btrfs_tree.h |  30 +++++++-
  9 files changed, 185 insertions(+), 22 deletions(-)

Most changes are involved in dev extent handling, and some sites can not
use btrfs_raid_array[] directly.

I guess you'll see that kernel patch soon, along with needed progs patch
in next week.

Thanks,
Qu
>
> As a related feature to the raid56, I was working on the striped
> raid10c34 profiles but was not able to make it work. In a sense this is
> easier as it reuses existing code, but if you make the journal work we
> won't need that.
>
>> if we're relying on ad-hoc if () else if () branches to
>> calculate thing like number of p/q stripes, it will cause a lot of
>> problems.
>>
>> In fact, during my development, I have hit tons of bugs related to this.
>>
>> One example is alloc_rbio(), it will assign rbio->nr_data, if we forgot
>> to update the check for RAID5 and RAID6 profiles, we will got a bad
>> nr_data == num_stripes, and screw up later writeback.
>>
>> 90% of my suffering comes from such ad-hoc usage doing manual checks on
>> RAID56.
>>
>> Another example is, scrub_stripe() which due to the extra per-device
>> reservation, @dev_extent_len is no longer the same the data stripe
>> length calculated from extent_map.
>>
>> So this patchset will do the following cleanups preparing for the
>> incoming RAID56J (already finished coding, functionality and on-disk
>> format are fine, although no journal yet):
>>
>> - Calculate device stripe length in-house inside scrub_stripe()
>>    This removes one of the nasty mismatch which is less obvious.
>>
>> - Use btrfs_raid_array[] based calculation instead of ad-hoc check
>>    The only exception is scrub_nr_raid_mirrors(), which has several
>>    difference against btrfs_num_copies():
>>
>>    * No iteration on all RAID6 combinations
>>      No sure if it's planned or not.
>>
>>    * Use bioc->num_stripes directly
>>      In that context, bioc is already all the mirrors for the same
>>      stripe, thus no need to lookup using btrfs_raid_array[].
>>
>> With all these cleanups, the RAID56J will be much easier to implement.
>>
>> Qu Wenruo (4):
>>    btrfs: remove @dev_extent_len argument from scrub_stripe() function
>>    btrfs: use btrfs_chunk_max_errors() to replace weird tolerance
>>      calculation
>>    btrfs: use btrfs_raid_array[] to calculate the number of parity
>>      stripes
>>    btrfs: use btrfs_raid_array[].ncopies in btrfs_num_copies()
>
> The preparatory patches look good to me.

  reply	other threads:[~2022-05-13 12:21 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-13  8:34 [PATCH 0/4] btrfs: cleanups and preparation for the incoming RAID56J features Qu Wenruo
2022-05-13  8:34 ` [PATCH 1/4] btrfs: remove @dev_extent_len argument from scrub_stripe() function Qu Wenruo
2022-05-13  8:47   ` Johannes Thumshirn
2022-05-13  8:34 ` [PATCH 2/4] btrfs: use btrfs_chunk_max_errors() to replace weird tolerance calculation Qu Wenruo
2022-05-13  8:45   ` Johannes Thumshirn
2022-05-13  8:34 ` [PATCH 3/4] btrfs: use btrfs_raid_array[] to calculate the number of parity stripes Qu Wenruo
2022-05-13  8:56   ` Johannes Thumshirn
2022-05-13  8:34 ` [PATCH 4/4] btrfs: use btrfs_raid_array[].ncopies in btrfs_num_copies() Qu Wenruo
2022-05-13  9:15   ` Johannes Thumshirn
2022-05-13  9:22     ` Qu Wenruo
2022-05-13  9:24       ` Johannes Thumshirn
2022-05-13  9:33         ` Qu Wenruo
2022-05-13 11:38 ` [PATCH 0/4] btrfs: cleanups and preparation for the incoming RAID56J features David Sterba
2022-05-13 12:21   ` Qu Wenruo [this message]
2022-05-13 15:14 ` Forza
2022-05-13 22:58   ` Qu Wenruo
2022-06-15 11:45 ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f119b7c7-aa86-b8f9-b5eb-79a16cdd58a4@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox