From: Daniel Kiper <dkiper@net-space.pl>
To: Goffredo Baroncelli <kreijack@inwind.it>
Cc: Daniel Kiper <dkiper@net-space.pl>,
	The development of GNU GRUB <grub-devel@gnu.org>
Subject: Re: [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile.
Date: Mon, 3 Sep 2018 14:52:45 +0200	[thread overview]
Message-ID: <20180903125245.GC25567@router-fw-old.i.net-space.pl> (raw)
In-Reply-To: <4c0797f2-3570-1f15-4728-d13a432a2ab7@inwind.it>
On Wed, Jul 18, 2018 at 08:24:50AM +0200, Goffredo Baroncelli wrote:
> On 07/12/2018 03:46 PM, Daniel Kiper wrote:
> > On Tue, Jun 19, 2018 at 07:39:48PM +0200, Goffredo Baroncelli wrote:
> >> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
> >> ---
> >>  grub-core/fs/btrfs.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 70 insertions(+)
> >>
> >> diff --git a/grub-core/fs/btrfs.c b/grub-core/fs/btrfs.c
> >> index be195448d..fbabaebbe 100644
> >> --- a/grub-core/fs/btrfs.c
> >> +++ b/grub-core/fs/btrfs.c
> >> @@ -119,6 +119,8 @@ struct grub_btrfs_chunk_item
> >>  #define GRUB_BTRFS_CHUNK_TYPE_RAID1         0x10
> >>  #define GRUB_BTRFS_CHUNK_TYPE_DUPLICATED    0x20
> >>  #define GRUB_BTRFS_CHUNK_TYPE_RAID10        0x40
> >> +#define GRUB_BTRFS_CHUNK_TYPE_RAID5         0x80
> >> +#define GRUB_BTRFS_CHUNK_TYPE_RAID6         0x100
> >>    grub_uint8_t dummy2[0xc];
> >>    grub_uint16_t nstripes;
> >>    grub_uint16_t nsubstripes;
> >> @@ -764,6 +766,74 @@ grub_btrfs_read_logical (struct grub_btrfs_data *data, grub_disk_addr_t addr,
> >>  	      stripe_offset = low + chunk_stripe_length
> >>  		* high;
> >>  	      csize = chunk_stripe_length - low;
> >> +	      break;
> >> +	    }
> >> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID5:
> >> +	  case GRUB_BTRFS_CHUNK_TYPE_RAID6:
> >> +	    {
> >> +	      grub_uint64_t nparities, stripe_nr, high, low;
> >> +
> >> +	      redundancy = 1;	/* no redundancy for now */
> >> +
> >> +	      if (grub_le_to_cpu64 (chunk->type) & GRUB_BTRFS_CHUNK_TYPE_RAID5)
> >> +		{
> >> +		  grub_dprintf ("btrfs", "RAID5\n");
> >> +		  nparities = 1;
> >> +		}
> >> +	      else
> >> +		{
> >> +		  grub_dprintf ("btrfs", "RAID6\n");
> >> +		  nparities = 2;
> >> +		}
> >> +
> >> +	      /*
> >> +	       * Below is an example of a RAID 6 layout and the meaning of the
> >> +	       * variables. The same applies to RAID 5. The only differences is
> >> +	       * that there is only one parity disk instead of two.
> >> +	       *
> >> +	       * A RAID 6 layout consists of several stripes spread
> >> +	       * on the disks, following a layout like the one below
> >> +	       *
> >> +	       *   Disk0  Disk1  Disk2  Ddisk3
> >
> > s/Ddisk3/Disk3/
> >
> >> +	       *
> >> +	       *    A1     B1     P1     Q1
> >> +	       *    Q2     A2     B2     P2
> >> +	       *    P3     Q3     A3     B3
> >> +	       *  [...]
> >> +	       *
> >> +	       *  Note that the placement of the parities depends on row index.
> >> +	       *  In the code below:
> >> +	       *  - stripe_nr is the stripe number not considering the parities
> >> +	       *    (A1 = 0, B1 = 1, A2 = 2, B2 = 3, ...),
> >> +	       *  - high is the row number (0 for A1...Q1, 1 for Q2...P2, ...),
> >> +	       *  - stripen is the column number (or disk number),
> >> +	       *  - off is the logical address to read (from the beginning of
> >> +	       *    the chunk space),
> >
> > What do you mean by chunk?
>
> Is a specific btrfs data structure (see
> https://btrfs.wiki.kernel.org/index.php/Manpage/mkfs.btrfs#BLOCK_GROUPS.2C_CHUNKS.2C_RAID).
> Most of the BTRFS internal structures (e.g. where the file data is
> stored) are in terms of *logical address*. At this level there is no
> concept of redundancy or raid nor disks. You can see the logical
> address as a flat space, large as the sum of the disks sizes (minus
> the space used by the parities or mirroring).
>
> The "logical space" is divided in "slices" called chunk. A chunk
> typically has a size in terms of hundred of megabytes (IIRC up to 1GB)
>
> A chunk (or block group) is a mapping. A chunk maps a "logical
> address" to a "physical address" on the disks.
>
> You can think a chunk as a structure like:
>
> 	u64 profile
> 	u64 logical_start, logical_end
> 	u64 disk1_id, disk1_start, disk1_end
> 	u64 disk2_id, disk2_start, disk2_end
> 	u64 disk3_id, disk3_start, disk3_end
> 	[...]
>
> In order to translate a "logical_address" to the pair of "disk" and "physical address:
> - find the chunk which maps the logical address (looking to the "logical_start", "logical_end")
> - then on the basis of "profile" value you can have the following cases:
> 	profile == SINGLE:
> 		is the simplest one: the logical address is mapped to the range
> 		(disk1_start...disk1_end) of the disk "disk_id1"
>
> 	profile == RAID1
> 		like the previous one; however if the disk1 is not available,
> 		you can find the data in the range
> 		(disk2_start...disk2_end) of the disk "disk_id2". NOTE: the two ranges
> 		must have the same sizes, but may have different starts
>
> 	profile == RAID0
> 		the data is without redundancy and it is spread on different disk each
> 		"chunk_stripe_length" bytes; so:
> 			off = logical_address - logical_start // this is what I call
> 							      // address from the
> 							      // beginning of the chunk
>
>
>
>
>
>
> 	and so on....
>
>
>
> >
> >> +	       *  - chunk_stripe_length is the size of a stripe (typically 64k),
> >> +	       *  - nstripes is the number of disks,
> >> +	       *  - low is the offset of the data inside a stripe,
> >> +	       *  - stripe_offset is the disk offset from the beginning
> >> +	       *    of the disk chunk mapping start,
> >
> > Err... I am confused here...
>
> As described above, the chunk data structure translate from "logical
> address" to the pair (disk, physical address on the disk). Each chunk
> translate a small portion of the space (typically 1GB); so it is more
> correct to say that a chunk maps a range in the logical address to a
> range in the disks. 'off' is the offset from the beginning of the
> range in the logical address. 'stripe_offset' is the offset from the
> beginning of the range in the physical space
>
> >> +	       *  - csize is the "potential" data to read. It will be reduced to
> >> +	       *    size if the latter is smaller.
> >> +	       */
> >> +	      stripe_nr = grub_divmod64 (off, chunk_stripe_length, &low);
> >
> > Here it seems to me that off is from the beginning of RAID volume space.
> > So, it does not agree with the comment above...
>
> RAID volume space == chunk logical address space
>
> >> +	      /*
> >> +	       * stripen is evaluated without considering
> >> +	       * the parities (0 for A1, A2, A3... 1 for B1, B2...).
> >> +	       */
> >> +	      high = grub_divmod64 (stripe_nr, nstripes - nparities, &stripen);
> >
> > This looks good.
> >
> >> +	      /*
> >> +	       * stripen now considers also the parities (0 for A1, 1 for A2,
> >> +	       * 2 for A3....). The math is performed modulo number of disks.
> >> +	       */
> >> +	      grub_divmod64 (high + stripen, nstripes, &stripen);
> >
> > Ditto.
> >
> >> +	      stripe_offset = low + chunk_stripe_length * high;
> >
> > And here I am confused. Why "* high"? If chunk_stripe_length is stripe
> > length then stripe_offset leads to nowhere. Am I missing something?
> > I have a feeling that comments does not agree with the code somehow.
>
> The "physical range" is divided in small range called(?) "stripe".
> These stripe are spread on the disks. 'high' is the offset from the
> beginning of the physical chunk range in unit of "chunk_stripe_length"
>
> I found quite confuse the btrfs terminology around the "stripe" word.
> And I have to point out that these variables were named before my
> patch.
>
> I am afraid that my comment doesn't help to "de-obfuscate" my code. I
It does. Please extend the code comment accordingly. You can add a link
to the BTRFS Wiki page. Though please try to use terms only from BRFS
specs/docs. Do not introduce new ones if it is not really needed. This
whole thing will ease reading new and old code.
> know that my English is far to be perfect. However this code is not so
Do not worry about that. Many people here learn English. Just keep doing it.
> different that the previous one (i.e. handling the btrfs
> raid1/dup/raid0....). So I am thinking to remove all the comments
> because it seems to me that the my comments increase the confusion
> instead of reducing it.
No, please do not do that. We needed it. If you take into account my
comments then everything should be OK.
Anyway, the code starts looking good. I think that we need 1-3
iterations and I will be able to get it into GRUB2 git repo.
So, please keep working on this feature.
Daniel
next prev parent reply	other threads:[~2018-09-03 12:54 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-19 17:39 [PATCH V6] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-06-19 17:39 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-07-08 15:51   ` Goffredo Baroncelli
2018-07-09 10:20     ` Daniel Kiper
2018-07-09 16:29       ` Goffredo Baroncelli
2018-07-12 13:46   ` Daniel Kiper
2018-07-18  6:24     ` Goffredo Baroncelli
2018-09-03 12:52       ` Daniel Kiper [this message]
2018-06-19 17:39 ` [PATCH 2/9] btrfs: Add helper to check the btrfs header Goffredo Baroncelli
2018-07-12 13:50   ` Daniel Kiper
2018-06-19 17:39 ` [PATCH 3/9] btrfs: Move the error logging from find_device() to its caller Goffredo Baroncelli
2018-07-12 13:51   ` Daniel Kiper
2018-06-19 17:39 ` [PATCH 5/9] btrfs: Move logging code in grub_btrfs_read_logical() Goffredo Baroncelli
2018-07-12 14:06   ` Daniel Kiper
2018-06-19 17:39 ` [PATCH 6/9] btrfs: Refactor the code that read from disk Goffredo Baroncelli
2018-07-12 14:10   ` Daniel Kiper
2018-06-19 17:39 ` [PATCH 7/9] btrfs: Add support for recovery for a RAID 5 btrfs profiles Goffredo Baroncelli
2018-07-12 14:35   ` Daniel Kiper
2018-06-19 17:39 ` [PATCH 8/9] btrfs: Make more generic the code for RAID 6 rebuilding Goffredo Baroncelli
2018-07-12 14:41   ` Daniel Kiper
2018-06-19 17:39 ` [PATCH 9/9] btrfs: Add RAID 6 recovery for a btrfs filesystem Goffredo Baroncelli
2018-07-12 14:47   ` Daniel Kiper
2018-06-19 18:22 ` [PATCH 4/9] btrfs: Avoid a rescan for a device which was already not found Goffredo Baroncelli
2018-07-12 14:02   ` Daniel Kiper
2018-07-18 18:59     ` Goffredo Baroncelli
  -- strict thread matches above, loose matches on Subject: below --
2018-10-22 17:29 [PATCH V11] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-10-22 17:29 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-10-18 17:55 [PATCH V10] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-10-18 17:55 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-10-11 18:50 [PATCH V9] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-10-11 18:50 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-10-17 13:46   ` Daniel Kiper
2018-09-27 18:34 [PATCH V8] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-09-27 18:34 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-10-09 17:51   ` Daniel Kiper
2018-10-11 13:17     ` Daniel Kiper
2018-09-19 18:40 [PATCH V7] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-09-19 18:40 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-09-25 15:31   ` Daniel Kiper
2018-09-26 20:40     ` Goffredo Baroncelli
2018-09-27 15:47       ` Daniel Kiper
2018-09-19 18:36 Goffredo Baroncelli
2018-09-19 18:42 ` Goffredo Baroncelli
2018-06-03 18:53 [PATCH V5] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-06-03 18:53 ` [PATCH 1/9] btrfs: Add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-06-14 11:17   ` Daniel Kiper
2018-06-14 18:55     ` Goffredo Baroncelli
2018-06-19 17:30       ` Goffredo Baroncelli
2018-05-16 18:48 [PATCH V4] Add support for BTRFS raid5/6 to GRUB Goffredo Baroncelli
2018-05-16 18:48 ` [PATCH 1/9] btrfs: add support for reading a filesystem with a RAID 5 or RAID 6 profile Goffredo Baroncelli
2018-05-30 10:07   ` Daniel Kiper
2018-06-01 18:50     ` Goffredo Baroncelli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to=20180903125245.GC25567@router-fw-old.i.net-space.pl \
    --to=dkiper@net-space.pl \
    --cc=grub-devel@gnu.org \
    --cc=kreijack@inwind.it \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).