* Re: Btrfs RAID space utilization and bitrot reconstruction
2012-07-01 11:50 Btrfs RAID space utilization and bitrot reconstruction Waxhead
@ 2012-07-01 12:27 ` Hugo Mills
2012-07-02 18:00 ` Martin Steigerwald
From: Hugo Mills @ 2012-07-01 12:27 UTC
To: Waxhead; +Cc: linux-btrfs
On Sun, Jul 01, 2012 at 01:50:39PM +0200, Waxhead wrote:
> As far as I understand btrfs stores all data in huge chunks that are
> striped, mirrored or "raid5/6'ed" throughout all the disks added to
> the filesystem/volume.
Well, RAID-5/6 hasn't landed yet, but yes.
> How does btrfs deal with different sized disks? Let's say that you
> for example have 10 different disks that are
> 100GB,200GB,300GB...1000GB and you create a btrfs filesystem with
> all the disks. How will the raid5 implementation distribute chunks
> in such a setup?
We haven't seen the code for that bit yet.
> I assume the stripe+stripe+parity are separate chunks that are
> placed on separate disks but how does btrfs select the best disk to
> store a chunk on? In short will a slow disk slow down the entire
> "array", parts of it or will btrfs attempt to use the fastest disks
> first?
Chunks are allocated by ordering the devices by the amount of free
(=unallocated) space left on each, and taking space from the devices
in that order, most free first. For RAID-1, devices are picked in
pairs. For RAID-0, "as many as possible" are picked, down to a minimum
of 2 (I think). For RAID-10, the largest even number possible is
picked, down to a minimum of 4. I _believe_ that RAID-5 and -6 will
pick as many as possible, down to some minimum -- but as I said, we
haven't seen the code yet.
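
To make the ordering rule above concrete, here is a rough sketch in
Python. It is not the btrfs allocator: the names pick_devices and fill
are made up for illustration, the RAID-0 minimum of 2 is the guess from
the paragraph above, and RAID-5/6 are left out because that code hasn't
been published.

```python
def pick_devices(free, profile):
    """Pick the devices one new allocation would use, or return [].

    `free` maps a (hypothetical) device name to its unallocated space.
    """
    # Only devices with unallocated space left are candidates,
    # ordered with the most free space first.
    candidates = sorted((d for d in free if free[d] > 0),
                        key=lambda d: free[d], reverse=True)
    n = len(candidates)
    if profile == "raid1":
        count, minimum = 2, 2            # always a pair
    elif profile == "raid0":
        count, minimum = n, 2            # as many as possible (minimum is a guess)
    elif profile == "raid10":
        count, minimum = n - (n % 2), 4  # largest even number, at least 4
    else:
        raise ValueError("profile not covered by this sketch: " + profile)
    if n < minimum:
        return []
    return candidates[:count]


def fill(free, profile, unit=1):
    """Allocate `unit` of space per chosen device until no allocation fits.

    Returns (number of allocations made, remaining free space per device).
    """
    free = dict(free)  # work on a copy
    allocations = 0
    while True:
        devices = pick_devices(free, profile)
        if not devices:
            break
        for d in devices:
            free[d] -= unit
        allocations += 1
    return allocations, free
```

With three devices of 100, 200 and 300 units, fill({"a": 100, "b": 200,
"c": 300}, "raid1") makes 300 pair allocations and leaves nothing
unallocated, because the largest device always ends up in the pair.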
> Also since btrfs checksums both data and metadata I am thinking that
> at least the raid6 implementation perhaps can (try to) reconstruct
> corrupt data (and try to rewrite it) before reading an alternate
> copy. Can someone please fill me in on the details here?
Yes, it should be possible to do that with RAID-5 as well. (Read
the data stripes and verify their checksums; if one fails, read the
parity, verify that, and reconstruct the bad block from the known-good
data.)
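
As an illustration of that sequence only, here is a small Python
sketch using single XOR parity. Everything in it is an assumption:
read_stripe and xor_blocks are made-up names, zlib.crc32 stands in for
whatever checksum the filesystem keeps, and the parity is "verified"
indirectly by checking the rebuilt block against the stored checksum.
How btrfs will actually do this isn't known yet.

```python
import zlib


def xor_blocks(blocks):
    """XOR equally sized byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def read_stripe(data_blocks, parity, checksums):
    """Return the data blocks, rebuilding at most one bad block from parity."""
    # Step 1: verify each data block against its stored checksum.
    bad = [i for i, block in enumerate(data_blocks)
           if zlib.crc32(block) != checksums[i]]
    if not bad:
        return list(data_blocks)
    if len(bad) > 1:
        raise IOError("more than one bad block; single parity cannot recover")

    # Step 2: XOR the parity with the known-good data blocks.
    index = bad[0]
    good = [block for i, block in enumerate(data_blocks) if i != index]
    rebuilt = xor_blocks(good + [parity])

    # Step 3: the rebuilt block must match the stored checksum,
    # otherwise the parity itself was bad too.
    if zlib.crc32(rebuilt) != checksums[index]:
        raise IOError("rebuilt block fails its checksum; parity may be bad")

    repaired = list(data_blocks)
    repaired[index] = rebuilt
    return repaired
```

A caller would compute parity = xor_blocks(data) at write time, keep
one checksum per data block, and could write the repaired block back
after a successful read_stripe.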
> Finally, how does btrfs deal with advanced format (4k sectors) drives
> when the entire drive (and not a partition) is used to build a btrfs
> filesystem? Is proper alignment achieved?
I don't know about that. However, the native block size in btrfs is
4k, so I'd imagine that it's all good.
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- You stay in the theatre because you're afraid of having no ---
money? There's irony...