Re: Why does Btrfs allow raid1 with mismatched drives? Also: How to look behind the curtain


From: Martin Steigerwald <Martin@lichtvoll.de>
To: linux-btrfs@vger.kernel.org
Cc: Fabian Zeindl <fabian.zeindl@gmail.com>, Hugo Mills <hugo@carfax.org.uk>
Subject: Re: Why does Btrfs allow raid1 with mismatched drives? Also: How to look behind the curtain
Date: Thu, 5 Jan 2012 11:39:20 +0100
Message-ID: <201201051139.20334.Martin@lichtvoll.de>
In-Reply-To: <DFD486DD-C324-43EE-82BC-87DDCF432DB6@gmail.com>

On Thursday, 5 January 2012, Fabian Zeindl wrote:
> On Thursday, January 5, 2012 at 10:44 , Hugo Mills wrote:
> > You should probably read the mis-named "Sysadmin's Guide"
> > on the wiki[1], which explains what btrfs actually does with its
> > replication.
> >
> > You should also probably read the FAQ entries on free space[2],
> > since using plain "df" for btrfs is usually misleading.
>
> I read both, but it doesn't answer my question on how btrfs behaves
> when it can't actually do a raid1, because there's not enough data on
> an "other" disk for a chunk-copy.

From my reading, that Sysadmin Guide answers your question:

BTRFS with RAID-1 will allocate chunks on two devices:

> Btrfs's "RAID" implementation bears only passing resemblance to
> traditional RAID implementations. Instead, btrfs replicates data on a
> per-chunk basis. If the filesystem is configured to use "RAID-1", for
> example, chunks are allocated in pairs, with each chunk of the pair
> being taken from a different block device. Data written to such a chunk
> pair will be duplicated across both chunks.
>
> Stripe-based "RAID" levels (RAID-0, RAID-10) work in a similar way,
> allocating as many chunks as can fit across the drives with free space,
> and then perform striping of data at a level smaller than a chunk. So,
> for a RAID-10 filesystem on 4 disks, data may be stored like this:

[… quoted from the Wiki page …]

"Allocating as many chunks as can fit across the drives" is also pretty
clear to me. So if BTRFS can't allocate a new chunk on two devices, it's
full. To me it seems obvious that BTRFS will not break the RAID-1
redundancy guarantee unless a drive fails.

Thus when using a RAID-1 with two devices, the smaller one should define
the maximum capacity of the filesystem. But when you use a RAID-1 with one
500 GB and two 250 GB drives, BTRFS can replicate each chunk on the 500 GB
drive on *one* of the two 250 GB drives.

Thus it makes perfect sense to support differently sized drives in a BTRFS
pool.
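
To make the capacity argument concrete, here is a rough Python sketch of
pairwise chunk allocation. It is not how the kernel allocator is written;
it assumes 1 GB chunks and assumes that each new chunk pair is placed on
the two devices that currently have the most free space. The function name
raid1_usable_chunks is made up for this illustration.

```python
def raid1_usable_chunks(free_chunks):
    """Count how many RAID-1 chunk pairs fit on the given devices.

    free_chunks: free space per device, in whole chunks (assumed 1 GB each).
    Assumption: every pair goes to the two devices with the most free space.
    """
    free = list(free_chunks)
    pairs = 0
    while True:
        # Device indices ordered by remaining free space, largest first.
        order = sorted(range(len(free)), key=lambda i: free[i], reverse=True)
        # A pair needs two *different* devices that both still have room.
        if len(order) < 2 or free[order[1]] == 0:
            break
        first, second = order[0], order[1]
        free[first] -= 1
        free[second] -= 1
        pairs += 1
    return pairs


if __name__ == "__main__":
    # Two devices: the smaller one limits the usable capacity.
    print(raid1_usable_chunks([500, 250]))
    # One 500 GB and two 250 GB devices: every chunk on the large drive
    # finds a partner on one of the two small drives.
    print(raid1_usable_chunks([500, 250, 250]))
```

Under these assumptions the 500 GB + 250 GB case yields 250 chunk pairs,
while 500 GB + 250 GB + 250 GB yields 500 pairs, so the third drive is what
lets the large drive be used in full.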

My own observations with a RAID-10 across 4 devices support this. I echoed
"1" > /sys/block/sdX/delete to remove one hard disk while a dd was running
to the RAID. BTRFS used the remaining disks. On the next reboot all disks
were available again. While BTRFS didn't start rebalancing the RAID
automatically, a btrfs filesystem balance made it fill up the previously
failed device until all devices had the same usage. This is also described
in the sysadmin guide. So this is what you have to take care of manually:
if a drive failed, you have to balance the filesystem so that it creates
replicas where they are missing.

Now, could anyone deeper into BTRFS please check whether my understanding
matches what BTRFS is doing…

> > You could run a scrub, which will verify all of the data mirrors on
> > the volume, and fix anything that's not redundant.
>
> Will this command fail then for example?

No, unless more than the allowed number of disks are failing.

-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7
