linux-btrfs.vger.kernel.org archive mirror
* How to dump/find parity of RAID-5 file?
@ 2017-02-03 10:44 Lakshmipathi.G
  2017-02-06 20:40 ` Goffredo Baroncelli
       [not found] ` <8c5cece7-29cf-82df-0739-ef4f0fe8bf70@cn.fujitsu.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Lakshmipathi.G @ 2017-02-03 10:44 UTC (permalink / raw)
  To: linux-btrfs

Hi.

Came across this thread: https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg55161.html
I am exploring the possibility of adding test scripts around this area using dump-tree & corrupt-block, but
I am unable to figure out how to get the parity of a file or find its location. dump-tree output gave:

    item 5 key (FIRST_CHUNK_TREE CHUNK_ITEM 145096704) itemoff 15557 itemsize 144
        length 134217728 owner 2 stripe_len 65536 type DATA|RAID5
        io_align 65536 io_width 65536 sector_size 4096
        num_stripes 3 sub_stripes 0
            stripe 0 devid 3 offset 63111168                 # Is this parity? Seems empty?
            dev_uuid f62df114-186c-4e48-8152-9ed15aa078b4
            stripe 1 devid 2 offset 63111168                 # Contains file data-stripe-1
            dev_uuid c0aeaab0-e57e-4f7a-9356-db1878876d9f
            stripe 2 devid 1 offset 83034112                 # Contains file data-stripe-2
            dev_uuid 637b3666-9d8f-4ec4-9969-53b0b933b9b1
thanks.

Cheers.
Lakshmipathi.G


* Re: How to dump/find parity of RAID-5 file?
  2017-02-03 10:44 How to dump/find parity of RAID-5 file? Lakshmipathi.G
@ 2017-02-06 20:40 ` Goffredo Baroncelli
  2017-02-14 20:09   ` Lakshmipathi.G
  2018-04-17 19:20   ` Goffredo Baroncelli
       [not found] ` <8c5cece7-29cf-82df-0739-ef4f0fe8bf70@cn.fujitsu.com>
  1 sibling, 2 replies; 7+ messages in thread
From: Goffredo Baroncelli @ 2017-02-06 20:40 UTC (permalink / raw)
  To: Lakshmipathi.G, linux-btrfs

IIRC, the parity is spread across the disk stripes of the chunk.

So first you have to find the logical offset [LO] where the file begins. Then you have to map this offset to the chunk which holds the data. The chunk has the following info:
- chunk start [CS], chunk length [CL]
- for each stripe:
	where the stripe starts

If you subtract the chunk start from the logical offset [CO = LO - CS], you will find the offset of the data within the chunk.
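
As a minimal sketch of this first step (not taken from btrfs-progs; the helper name find_chunk and the list of chunks are only illustrative), the chunk lookup and the subtraction could look like this in Python. The chunk start and length are the values of the CHUNK_ITEM quoted at the start of the thread, and the logical address is the extent address reported later in this thread:

```python
def find_chunk(chunks, logical):
    """Return (chunk_start, chunk_length) of the chunk that covers `logical`.

    `chunks` is a list of (start, length) pairs, as read by hand from the
    CHUNK_ITEM keys and their `length` fields in the chunk tree dump.
    """
    for start, length in chunks:
        if start <= logical < start + length:
            return start, length
    raise ValueError(f"no chunk covers logical address {logical}")


# Values copied from the dumps in this thread.
chunks = [(145096704, 134217728)]
LO = 145227776                   # logical address of the file extent

CS, CL = find_chunk(chunks, LO)
CO = LO - CS                     # offset of the data inside the chunk
print(CO)                        # 131072, i.e. 2 x 64K into the chunk
```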

As stated above, the PARITY is spread across the chunk stripes. So (supposing that the stripe size is 64K, the raid level is 5, and there are three disks):

- the first 64k of stripe 0 is data [0..64K)
- the first 64k of stripe 1 is data [64..128K)
- the first 64k of stripe 2 is parity

- the 2nd 64k of stripe 0 is parity
- the 2nd 64k of stripe 1 is data [128..192K)
- the 2nd 64k of stripe 2 is data [192..256K)

- the 3rd 64k of stripe 0 is data [256..320K)
- the 3rd 64k of stripe 1 is parity
- the 3rd 64k of stripe 2 is data [320..384K)
and so on.

To find the data, you have to compare the CO to the data [...) ranges.

If you look at an old (unfinished :-( ) patch of mine, you can find some examples of how to dump the different stripes:

[BTRFS-PROGS][PATCH][V2] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump'








-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: How to dump/find parity of RAID-5 file?
       [not found] ` <8c5cece7-29cf-82df-0739-ef4f0fe8bf70@cn.fujitsu.com>
@ 2017-02-14 20:06   ` Lakshmipathi.G
  0 siblings, 0 replies; 7+ messages in thread
From: Lakshmipathi.G @ 2017-02-14 20:06 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

On Mon, Feb 06, 2017 at 09:36:58AM +0800, Qu Wenruo wrote:
> 
> Please note the following things when calculating RAID5/6 data and P/Q
> location:
> 
> 1) Basic layout
> The chunk stripe only shows the *first* full stripe layout.
> And it always follows the sequence of "Data0, Data1, ... Data N, Parity"
> 
> And one full stripe consists of N * 64K (the 64K stripe length is fixed for now).
> 
> 
> 2) Device rotation
> RAID5/6 do device rotation.
> So the *second* full stripe will have the following layout:
> "Data 1, Data2, ... Data N, Parity, Data 0"
> 
> 
> You can refer to my offline-scrub patchset to see how it assembles the
> RAID5/6 mapping, which I believe is easier to follow than the current btrfs_map_block()
> implementation.
> [PATCH v2 19/19] btrfs-progs: fsck: Introduce offline scrub function
> 
> Thanks,
> Qu
Sorry for the delay in responding, I was offline. Thanks for the details. I think I now have a
better idea of (1) basic layout and (2) device rotation. I will look into the offline-scrub patches
to explore the exact mapping and on-disk layouts. I'm trying to understand the layout for a
simple RAID-5 with 3 drives and a 128KB file. I will try to understand RAID-6 a little later.
Thanks.

Cheers.
Lakshmipathi.G


* Re: How to dump/find parity of RAID-5 file?
  2017-02-06 20:40 ` Goffredo Baroncelli
@ 2017-02-14 20:09   ` Lakshmipathi.G
  2017-02-15 18:24     ` Goffredo Baroncelli
  2018-04-17 19:20   ` Goffredo Baroncelli
  1 sibling, 1 reply; 7+ messages in thread
From: Lakshmipathi.G @ 2017-02-14 20:09 UTC (permalink / raw)
  To: Goffredo Baroncelli, linux-btrfs

Sorry for the delay, I was offline. Thanks for the details. I understood the "parity spread across the chunk stripes" part,
but I am unable to figure out the first part regarding the calculations.

RAID5 with 3 devices, each 512MB. I created a single 128KB file ("print 'Ab'+'a'*65534+'aB'+'b'*65533"). 'debug-tree' shows the chunk tree as:

	item 5 key (FIRST_CHUNK_TREE CHUNK_ITEM 145096704) itemoff 15557 itemsize 144
		length 134217728 owner 2 stripe_len 65536 type DATA|RAID5
		io_align 65536 io_width 65536 sector_size 4096
		num_stripes 3 sub_stripes 0
			stripe 0 devid 3 offset 63111168
			dev_uuid 9a2a18f1-6193-44b9-aafc-23d161d66110
			stripe 1 devid 2 offset 63111168
			dev_uuid e45ab907-c3a8-4dff-af9f-2ae5fd38ffd6
			stripe 2 devid 1 offset 83034112
			dev_uuid 428c04d9-37da-454a-b7b2-f6fe88580de2
and fs-tree shows:
	item 13 key (145227776 EXTENT_ITEM 131072) itemoff 15788 itemsize 53
		extent refs 1 gen 7 flags DATA
		extent data backref root 5 objectid 257 offset 0 count 1

From the above, I assume:
LO=145227776  CS=145096704 and CL=134217728
CO=145227776 - 145096704  => CO = 131072

I am quite confused from here :s  I'll look into your patches to understand more. I hope that sometime in the future we will
have your finished patches :) The 'physical-find' and 'physical-dump' commands will be really useful for debugging/testing and
learning purposes. Thanks.
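
For reference, the content of the 128KB test file and the parity one would expect for it can be sketched in Python 3. This assumes that the file was written by the Python 2 one-liner above including the trailing newline that print adds (which makes it exactly 131072 bytes), and that RAID5 parity is the byte-wise XOR of the two 64K data stripes. The variable names are only illustrative:

```python
STRIPE_LEN = 64 * 1024

# Same bytes the Python 2 one-liner prints, including print's trailing newline.
content = b"Ab" + b"a" * 65534 + b"aB" + b"b" * 65533 + b"\n"
assert len(content) == 2 * STRIPE_LEN

# The two 64K data stripes of the file.
first, second = content[:STRIPE_LEN], content[STRIPE_LEN:]

# RAID5 parity of one full stripe: byte-wise XOR of its data stripes.
parity = bytes(x ^ y for x, y in zip(first, second))

print(parity[:4].hex())    # 20200303
print(hex(parity[-1]))     # 0x6b  ('a' XOR '\n')
```

So the parity stripe is far from empty: it is two 0x20 bytes followed by a long run of 0x03 bytes. A stripe that looks empty at the expected parity location would suggest that the wrong device or the wrong offset is being inspected.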

Cheers.
Lakshmipathi.G


* Re: How to dump/find parity of RAID-5 file?
  2017-02-14 20:09   ` Lakshmipathi.G
@ 2017-02-15 18:24     ` Goffredo Baroncelli
  2017-02-16 15:14       ` Lakshmipathi.G
  0 siblings, 1 reply; 7+ messages in thread
From: Goffredo Baroncelli @ 2017-02-15 18:24 UTC (permalink / raw)
  To: Lakshmipathi.G, linux-btrfs

The chunk tree maps the logical address range [145096704..145096704+134217728) [size=128MB] to three physical ranges, one per stripe:
	stripe 0, devid3 : [63111168..63111168+67108864) [size=64MB]
	stripe 1, devid2 : [63111168..63111168+67108864) [size=64MB]
	stripe 2, devid1 : [83034112..83034112+67108864) [size=64MB]

Because the logical address space is divided into pieces of 64k, interleaved with the parity, which moves by one stripe on each row, we know that:
* first 128k
logical address [145096704      ..145096704+64k)   -> stripe 0, devid3, [63111168    ..63111168+64k)
logical address [145096704+64k  ..145096704+2x64k) -> stripe 1, devid2, [63111168    ..63111168+64k)
                parity:                            -> stripe 2, devid1, [83034112    ..83034112+64k)
* second 128k
logical address [145096704+2x64k..145096704+3x64k) -> stripe 1, devid2, [63111168+64k..63111168+2x64k)
logical address [145096704+3x64k..145096704+4x64k) -> stripe 2, devid1, [83034112+64k..83034112+2x64k)
                parity:                            -> stripe 0, devid3, [63111168+64k..63111168+2x64k)
And so on...

(NB: 145096704+2x64k == 145227776)

The fs-tree maps the file content [0..131072) [size=128k] to the logical address range [145227776..145227776+131072) [size=128k].

So the file content starts on devid2 at 63111168+64k = 63176704 (first 64k). The second 64k is on devid1 at 83034112+64k = 83099648; the parity is on devid3 at 63111168+64k = 63176704. This agrees with the annotations in your first mail (devid 3 parity, devid 2 first data stripe, devid 1 second data stripe).


BR
G.Baroncelli



-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: How to dump/find parity of RAID-5 file?
  2017-02-15 18:24     ` Goffredo Baroncelli
@ 2017-02-16 15:14       ` Lakshmipathi.G
  0 siblings, 0 replies; 7+ messages in thread
From: Lakshmipathi.G @ 2017-02-16 15:14 UTC (permalink / raw)
  To: Goffredo Baroncelli; +Cc: linux-btrfs

Thanks for the detailed example with exact numbers. I now understand the address mapping better. With this as a reference, I think
it should be possible to access the parity/data stripes in a more sensible manner instead of using an expensive "cat /device/ | hexdump | grep"
combination. Thanks.
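
A rough sketch of what such direct access could look like in Python: seek to a known physical offset on each device (or image file), read one 64K stripe, and check that the XOR of the data stripes equals the parity stripe. The function names are made up for illustration, and the (path, offset) pairs have to come from the mapping worked out earlier in the thread; nothing here is part of btrfs-progs.

```python
STRIPE_LEN = 64 * 1024


def read_at(path, offset, length=STRIPE_LEN):
    """Read `length` bytes starting at byte `offset` of a device or image file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def parity_matches(data_locs, parity_loc, length=STRIPE_LEN):
    """Check RAID5 parity for one full stripe.

    data_locs  -- list of (path, physical_offset) for the data stripes
    parity_loc -- (path, physical_offset) of the parity stripe
    """
    expected = bytes(length)                      # all zeroes
    for path, offset in data_locs:
        expected = xor_blocks(expected, read_at(path, offset, length))
    parity_path, parity_offset = parity_loc
    return expected == read_at(parity_path, parity_offset, length)
```

Reading block devices normally requires root, and the filesystem should be unmounted (or at least synced) so that the on-disk content is stable while it is read.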

Cheers.
Lakshmipathi.G


* Re: How to dump/find parity of RAID-5 file?
  2017-02-06 20:40 ` Goffredo Baroncelli
  2017-02-14 20:09   ` Lakshmipathi.G
@ 2018-04-17 19:20   ` Goffredo Baroncelli
  1 sibling, 0 replies; 7+ messages in thread
From: Goffredo Baroncelli @ 2018-04-17 19:20 UTC (permalink / raw)
  To: Lakshmipathi.G, linux-btrfs

On 02/06/2017 09:40 PM, Goffredo Baroncelli wrote:
> As stated above, the PARITY is spread across the chunk stripes. So (supposing that the stripe size is 64K, the raid level is 5, the disks are three), 
> 
> - the first 64k of stripe 0, is data [0..64K)
> - the first 64k of stripe 1, is data [64..128K)
> - the first 64k of stripe 2 is parity, 
> 
> - the 2nd 64k of stripe 0 is parity, 
> - the 2nd 64k of stripe 1, is data [128..192K)
> - the 2nd 64k of stripe 2, is data [192..256K)
> 
> - the 3rd 64k of stripe 0, is data [256..320K)
> - the 3rd 64k of stripe 1 is parity, 
> - the 3rd 64k of stripe 2, is data [320..384K)
> and so on,

I was wrong!
- the 3rd 64k of stripe 0 is data [320..384K)
- the 3rd 64k of stripe 1 is parity
- the 3rd 64k of stripe 2 is data [256..320K)

Basically, stripes 0 and 2 were swapped. The idea is that the first data slice of each row sits on the stripe right after the parity, the next slice on the following stripe, and so on, wrapping around.

DISK1  DISK2  DISK3

  0      1      P
  P      2      3
  5      P      4
  6      7      P

Where
  P = parity
  0...7 are the slices of data

So the parity 'P' moves from left to right, starting from the last disk; the data slices are placed at increasing addresses from left to right, *starting* right after the parity and wrapping around as in a circular buffer.
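
The rotation can be written down as a short Python sketch. It is only an illustration of the table above, not code from the kernel or from grub: the function names are invented, and it assumes that the rotation applies to the stripe indices in the order in which the chunk item lists them (stripe 0, 1, 2), with a 64K stripe length.

```python
def raid5_row(row, num_stripes):
    """Roles of each stripe in full-stripe `row`: "P" or the data slice number."""
    data_stripes = num_stripes - 1
    roles = [None] * num_stripes
    roles[(data_stripes + row) % num_stripes] = "P"
    for i in range(data_stripes):
        # Data slices follow the parity, wrapping around.
        roles[(i + row) % num_stripes] = row * data_stripes + i
    return roles


def locate(chunk_offset, stripes, stripe_len=64 * 1024):
    """Map an offset inside a RAID5 chunk to its data and parity locations.

    `stripes` is a list of (devid, physical_start) in chunk item order.
    Returns ((data_devid, data_offset), (parity_devid, parity_offset)).
    """
    num_stripes = len(stripes)
    data_stripes = num_stripes - 1
    slice_nr, within = divmod(chunk_offset, stripe_len)
    row, i = divmod(slice_nr, data_stripes)
    data_dev, data_base = stripes[(i + row) % num_stripes]
    par_dev, par_base = stripes[(data_stripes + row) % num_stripes]
    return ((data_dev, data_base + row * stripe_len + within),
            (par_dev, par_base + row * stripe_len))


for row in range(4):
    print(raid5_row(row, 3))
# [0, 1, 'P']
# ['P', 2, 3]
# [5, 'P', 4]
# [6, 7, 'P']

# Chunk item from this thread: stripe 0 = devid 3, stripe 1 = devid 2, stripe 2 = devid 1.
stripes = [(3, 63111168), (2, 63111168), (1, 83034112)]
print(locate(131072, stripes))   # ((2, 63176704), (3, 63176704))
print(locate(196608, stripes))   # ((1, 83099648), (3, 63176704))
```

Under these assumptions, the 128KB file at chunk offset 131072 has its first 64K on devid 2, its second 64K on devid 1 and its parity on devid 3.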

I am trying to implement raid5/6 in grub, so I went deep into the raid5 layout.

BR
G.Baroncelli






-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
