linux-btrfs.vger.kernel.org archive mirror
* Feature request: true RAID-1 mode
@ 2012-06-20 16:27 H. Peter Anvin
  2012-06-21  0:35 ` Marios Titas
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2012-06-20 16:27 UTC (permalink / raw)
  To: linux-btrfs

Yet another boot loader support request.

Right now btrfs' definition of "RAID-1" with more than two devices is a
bit unorthodox: it stores two copies on any two drives.  "True RAID-1" would
instead store N copies, one on each of N devices, the same way an actual
RAID-1 would operate with an arbitrary number of devices.

This means that a bootloader can consider a single device in isolation:
if the firmware gives access only to a single device, it can be booted.
Since /boot is usually a very small amount of data, this is a very
reasonable tradeoff.
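
As a rough illustration of the difference, here is a minimal sketch in Python. It assumes a simple round-robin allocator, which is not how btrfs actually picks devices, and the names place_chunks and bootable_alone are hypothetical. It only shows that with two copies spread over three devices, no single device holds everything, while with one copy per device every device does.

```python
def place_chunks(num_chunks, num_devices, copies):
    """Assign each chunk to `copies` distinct devices, round-robin.

    Hypothetical allocator for illustration only; the real btrfs
    allocator uses different criteria (e.g. free space).
    """
    placement = []
    start = 0
    for _ in range(num_chunks):
        devices = {(start + i) % num_devices for i in range(copies)}
        placement.append(devices)
        start = (start + copies) % num_devices
    return placement


def bootable_alone(placement, num_devices):
    """Return the devices that hold a copy of every chunk."""
    return [d for d in range(num_devices)
            if all(d in devs for devs in placement)]


two_copy = place_chunks(num_chunks=6, num_devices=3, copies=2)
true_raid1 = place_chunks(num_chunks=6, num_devices=3, copies=3)

print(bootable_alone(two_copy, 3))    # []
print(bootable_alone(true_raid1, 3))  # [0, 1, 2]
```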

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Feature request: true RAID-1 mode
  2012-06-20 16:27 Feature request: true RAID-1 mode H. Peter Anvin
@ 2012-06-21  0:35 ` Marios Titas
  2012-06-21  0:50   ` Chris Mason
  0 siblings, 1 reply; 12+ messages in thread
From: Marios Titas @ 2012-06-21  0:35 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-btrfs

On Wed, Jun 20, 2012 at 12:27 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> Yet another boot loader support request.
>
> Right now btrfs' definition of "RAID-1" with more than two devices is a
> bit unorthodox: it stores two copies on any two drives.  "True RAID-1" would
> instead store N copies, one on each of N devices, the same way an actual
> RAID-1 would operate with an arbitrary number of devices.
>
> This means that a bootloader can consider a single device in isolation:
> if the firmware gives access only to a single device, it can be booted.
>  Since /boot is usually a very small amount of data, this is a very
> reasonable tradeoff.

+1

In fact, the current RAID-1 should not have been called RAID-1 at all,
it is confusing.


* Re: Feature request: true RAID-1 mode
  2012-06-21  0:35 ` Marios Titas
@ 2012-06-21  0:50   ` Chris Mason
  2012-06-21  1:34     ` H. Peter Anvin
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Mason @ 2012-06-21  0:50 UTC (permalink / raw)
  To: Marios Titas; +Cc: H. Peter Anvin, linux-btrfs

On Wed, Jun 20, 2012 at 06:35:30PM -0600, Marios Titas wrote:
> On Wed, Jun 20, 2012 at 12:27 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> > Yet another boot loader support request.
> >
> > Right now btrfs' definition of "RAID-1" with more than two devices is a
> > bit unorthodox: it stores two copies on any two drives.  "True RAID-1" would
> > instead store N copies, one on each of N devices, the same way an actual
> > RAID-1 would operate with an arbitrary number of devices.
> >
> > This means that a bootloader can consider a single device in isolation:
> > if the firmware gives access only to a single device, it can be booted.
> >  Since /boot is usually a very small amount of data, this is a very
> > reasonable tradeoff.
> 
> +1
> 
> In fact, the current RAID-1 should not have been called RAID-1 at all,
> it is confusing.

With the raid5/6 code, I'm changing raid1 (and raid10) to have a
configurable number of copies.  So, you'll be able to have N copies on M
drives, where N <= M.
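
To make the N/M trade-off concrete, a small sketch follows. It assumes M equal-sized drives and ignores metadata overhead; mirror_profile and its field names are hypothetical and not part of any btrfs tool.

```python
def mirror_profile(copies, devices, device_size_gib):
    """Summarise an N-copies-on-M-drives mirror (assumes equal-sized drives)."""
    if not 1 <= copies <= devices:
        raise ValueError("need 1 <= N <= M")
    return {
        # Data survives as long as at least one copy remains.
        "tolerated_failures": copies - 1,
        # Every block is stored `copies` times across the pool.
        "usable_gib": devices * device_size_gib / copies,
        # Only when N == M does each drive hold all the data.
        "every_device_complete": copies == devices,
    }


print(mirror_profile(copies=2, devices=3, device_size_gib=100))
print(mirror_profile(copies=3, devices=3, device_size_gib=100))
```

Only the N = M case gives the property the boot loader wants, at the cost of the usable space of a single drive.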

-chris



* Re: Feature request: true RAID-1 mode
  2012-06-21  0:50   ` Chris Mason
@ 2012-06-21  1:34     ` H. Peter Anvin
  2012-06-25 15:21       ` Chris Mason
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2012-06-21  1:34 UTC (permalink / raw)
  To: Chris Mason, Marios Titas; +Cc: linux-btrfs

Could you have a mode, though, where M = N at all times, so a user doesn't end up adding a new drive and getting a nasty surprise?

Chris Mason <chris.mason@fusionio.com> wrote:

>With the raid5/6 code, I'm changing raid1 (and raid10) to have a
>configurable number of copies.  So, you'll be able to have N copies on
>M drives, where N <= M.
>
>-chris

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


* Re: Feature request: true RAID-1 mode
  2012-06-21  1:34     ` H. Peter Anvin
@ 2012-06-25 15:21       ` Chris Mason
  2012-06-25 17:46         ` H. Peter Anvin
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Mason @ 2012-06-25 15:21 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Chris L. Mason, Marios Titas, linux-btrfs

Yes and no.  If you have 2 drives and you add one more, we can make it
do all new chunks over 3 drives.  But, turning the existing double
mirror chunks into a triple mirror requires a balance.
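
A toy model of that behaviour, as a sketch only: MirrorFs is a hypothetical class, chunks are reduced to the set of device ids holding a copy, and "balance" is modelled as simply rewriting every chunk across all current devices.

```python
class MirrorFs:
    """Toy model of an M = N mirror; not the btrfs implementation."""

    def __init__(self, num_devices):
        self.num_devices = num_devices
        self.chunks = []  # each chunk: set of device ids holding a copy

    def allocate_chunk(self):
        # New chunks are mirrored over every device present right now.
        self.chunks.append(set(range(self.num_devices)))

    def add_device(self):
        # Adding a device does not touch existing chunks.
        self.num_devices += 1

    def under_mirrored(self):
        """Indices of chunks that have fewer copies than devices."""
        return [i for i, devs in enumerate(self.chunks)
                if len(devs) < self.num_devices]

    def balance(self):
        # Rewrite every existing chunk across all current devices.
        self.chunks = [set(range(self.num_devices)) for _ in self.chunks]


fs = MirrorFs(2)
fs.allocate_chunk()
fs.allocate_chunk()
fs.add_device()
fs.allocate_chunk()          # this one is already a triple mirror
before = fs.under_mirrored()
fs.balance()
after = fs.under_mirrored()
print(before, after)         # [0, 1] []
```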

-chris

On Wed, Jun 20, 2012 at 07:34:27PM -0600, H. Peter Anvin wrote:
> Could you have a mode, though, where M = N at all times, so a user doesn't end up adding a new drive and getting a nasty surprise?


* Re: Feature request: true RAID-1 mode
  2012-06-25 15:21       ` Chris Mason
@ 2012-06-25 17:46         ` H. Peter Anvin
  2012-06-25 22:34           ` Gareth Pye
                             ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: H. Peter Anvin @ 2012-06-25 17:46 UTC (permalink / raw)
  To: Chris Mason, Chris L. Mason, Marios Titas, linux-btrfs

On 06/25/2012 08:21 AM, Chris Mason wrote:
> Yes and no.  If you have 2 drives and you add one more, we can make it
> do all new chunks over 3 drives.  But, turning the existing double
> mirror chunks into a triple mirror requires a balance.
> 
> -chris

So trigger one.  This is the exact analogue to the resync pass that is
required in classic RAID after adding new media.

	-hpa



* Re: Feature request: true RAID-1 mode
  2012-06-25 17:46         ` H. Peter Anvin
@ 2012-06-25 22:34           ` Gareth Pye
       [not found]           ` <CA+WRLO87LvTrRJRMNwYQ3SwmZWF7WzO_FDi1=jqt1kwX=YSnWQ@mail.gmail.com>
  2012-06-25 22:54           ` Hugo Mills
  2 siblings, 0 replies; 12+ messages in thread
From: Gareth Pye @ 2012-06-25 22:34 UTC (permalink / raw)
  To: linux-btrfs

On Tue, Jun 26, 2012 at 3:46 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 06/25/2012 08:21 AM, Chris Mason wrote:
> > Yes and no.  If you have 2 drives and you add one more, we can make it
> > do all new chunks over 3 drives.  But, turning the existing double
> > mirror chunks into a triple mirror requires a balance.
> >
> > -chris
>
> So trigger one.  This is the exact analogue to the resync pass that is
> required in classic RAID after adding new media.
>
>        -hpa

To me, a balance doesn't have to be triggered automatically. A user
expects to have to tell the disks to rebuild/resync/balance after adding
a disk; they may want to wait until they've added all 4 disks and run a
few extra commands before they run the rebalance. What is important is
having a mode that doesn't require the user to remember that the setup
they had been using as the closest BTRFS analogue to RAID1 needs
another command to change the 'RAID level' to the RAID1 analogue for
the new number of disks.

Users will forget that and they will lose data because of it. At least
with an M=N mode BTRFS can say it tried to make it easy to avoid that
pitfall.

(resend in plain text for mailing list, CC list received the HTML version)

--
Gareth Pye
Level 2 Judge, Melbourne, Australia
Australian MTG Forum: mtgau.com
gareth@cerberos.id.au - www.rockpaperdynamite.wordpress.com
"Dear God, I would like to file a bug report"


* Re: Feature request: true RAID-1 mode
       [not found]           ` <CA+WRLO87LvTrRJRMNwYQ3SwmZWF7WzO_FDi1=jqt1kwX=YSnWQ@mail.gmail.com>
@ 2012-06-25 22:37             ` H. Peter Anvin
  2012-06-25 22:46               ` Gareth Pye
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2012-06-25 22:37 UTC (permalink / raw)
  To: Gareth Pye; +Cc: Chris Mason, Chris L. Mason, Marios Titas, linux-btrfs

On 06/25/2012 03:28 PM, Gareth Pye wrote:
> To me, a balance doesn't have to be triggered automatically. A user
> expects to have to tell the disks to rebuild/resync/balance after adding
> a disk; they may want to wait until they've added all 4 disks and run a
> few extra commands before they run the rebalance.

They do?  E.g. mdadm doesn't make them...

> What is important is having a mode that doesn't require the user to
> remember that the setup they had been using as the closest BTRFS
> analogue to RAID1 needs another command to change the 'RAID level' to
> the RAID1 analogue for the new number of disks.
>
> Users will forget that and they will lose data because of it. At least
> with an M=N mode BTRFS can say it tried to make it easy to avoid that
> pitfall.

Doesn't that contradict your previous statement?  In either case, I
agree with the latter...
	
	-hpa


* Re: Feature request: true RAID-1 mode
  2012-06-25 22:37             ` H. Peter Anvin
@ 2012-06-25 22:46               ` Gareth Pye
  0 siblings, 0 replies; 12+ messages in thread
From: Gareth Pye @ 2012-06-25 22:46 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Chris Mason, Chris L. Mason, Marios Titas, linux-btrfs

On Tue, Jun 26, 2012 at 8:37 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> They do?  E.g. mdadm doesn't make them...

Hrm, you are right. It is something I always confirm is happening
though. Without an M=N mode there would need to be two balances as the
first balance would be doing it wrong :(


-- 
Gareth Pye
Level 2 Judge, Melbourne, Australia
Australian MTG Forum: mtgau.com
gareth@cerberos.id.au - www.rockpaperdynamite.wordpress.com
"Dear God, I would like to file a bug report"


* Re: Feature request: true RAID-1 mode
  2012-06-25 17:46         ` H. Peter Anvin
  2012-06-25 22:34           ` Gareth Pye
       [not found]           ` <CA+WRLO87LvTrRJRMNwYQ3SwmZWF7WzO_FDi1=jqt1kwX=YSnWQ@mail.gmail.com>
@ 2012-06-25 22:54           ` Hugo Mills
  2012-06-25 23:00             ` H. Peter Anvin
  2 siblings, 1 reply; 12+ messages in thread
From: Hugo Mills @ 2012-06-25 22:54 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Chris Mason, Chris L. Mason, Marios Titas, linux-btrfs

On Mon, Jun 25, 2012 at 10:46:01AM -0700, H. Peter Anvin wrote:
> On 06/25/2012 08:21 AM, Chris Mason wrote:
> > Yes and no.  If you have 2 drives and you add one more, we can make it
> > do all new chunks over 3 drives.  But, turning the existing double
> > mirror chunks into a triple mirror requires a balance.
> > 
> > -chris
> 
> So trigger one.  This is the exact analogue to the resync pass that is
> required in classic RAID after adding new media.

   You'd have to cancel and restart if a second new disk was added
while the first balance was ongoing. Fortunately, this isn't a problem
these days.

   Also, it occurs to me that I should just check -- are you aware
that the btrfs implementation of RAID-1 makes no guarantees about the
location of any given piece of data? i.e. if I have a piece of data
stored at block X on disk 1, it's not guaranteed to be stored at block
X on disks 2, 3, 4, ... I'm not sure if this is important to you, but
it's a significant difference between the btrfs implementation of
RAID-1 and the MD implementation.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
       --- Never underestimate the bandwidth of a Volvo filled ---       
                           with backup tapes.                            



* Re: Feature request: true RAID-1 mode
  2012-06-25 22:54           ` Hugo Mills
@ 2012-06-25 23:00             ` H. Peter Anvin
  2012-07-02 15:59               ` H. Peter Anvin
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2012-06-25 23:00 UTC (permalink / raw)
  To: Hugo Mills, Chris Mason, Chris L. Mason, Marios Titas,
	linux-btrfs

On 06/25/2012 03:54 PM, Hugo Mills wrote:
> On Mon, Jun 25, 2012 at 10:46:01AM -0700, H. Peter Anvin wrote:
>> On 06/25/2012 08:21 AM, Chris Mason wrote:
>>> Yes and no.  If you have 2 drives and you add one more, we can
>>> make it do all new chunks over 3 drives.  But, turning the
>>> existing double mirror chunks into a triple mirror requires a
>>> balance.
>>> 
>>> -chris
>> 
>> So trigger one.  This is the exact analogue to the resync pass
>> that is required in classic RAID after adding new media.
> 
> You'd have to cancel and restart if a second new disk was added 
> while the first balance was ongoing. Fortunately, this isn't a
> problem these days.
> 
> Also, it occurs to me that I should just check -- are you aware 
> that the btrfs implementation of RAID-1 makes no guarantees about
> the location of any given piece of data? i.e. if I have a piece of
> data stored at block X on disk 1, it's not guaranteed to be stored
> at block X on disks 2, 3, 4, ... I'm not sure if this is important
> to you, but it's a significant difference between the btrfs
> implementation of RAID-1 and the MD implementation.
> 

I am aware of that, and it is not a problem... the one-device
bootloader can find out *which* disk it is talking to by comparing
uuids, and the btrfs data structures will tell it how to find the data
on that specific disk.  It does of course mean the bootloader needs to
be aware of the multidisk nature of btrfs, but that isn't a problem in
itself.
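
A minimal sketch of that lookup, under stated assumptions: Stripe, Chunk and resolve are hypothetical, simplified stand-ins for the chunk-mapping structures, and every stripe of a mirrored chunk is assumed to hold a full copy, so the offset within the chunk carries over unchanged to each device.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Stripe:
    dev_uuid: str   # which device this copy lives on
    physical: int   # byte offset of the copy on that device


@dataclass(frozen=True)
class Chunk:
    logical: int                 # start of the logical address range
    length: int                  # size of the range in bytes
    stripes: Tuple[Stripe, ...]  # one entry per mirrored copy


def resolve(chunks: Sequence[Chunk], my_uuid: str, logical: int) -> Optional[int]:
    """Map a logical address to a physical offset on the one visible device.

    Returns None if the address is unmapped or this device has no copy.
    """
    for chunk in chunks:
        if chunk.logical <= logical < chunk.logical + chunk.length:
            for stripe in chunk.stripes:
                if stripe.dev_uuid == my_uuid:
                    return stripe.physical + (logical - chunk.logical)
            return None
    return None


chunks = [Chunk(0x100000, 0x10000,
                (Stripe("uuid-a", 0x500000), Stripe("uuid-b", 0x900000)))]
print(hex(resolve(chunks, "uuid-b", 0x100200)))  # 0x900200
print(resolve(chunks, "uuid-c", 0x100200))       # None
```

The second call shows the failure case the feature request is meant to rule out: a device that holds no copy of a chunk the boot loader needs.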

	-hpa



* Re: Feature request: true RAID-1 mode
  2012-06-25 23:00             ` H. Peter Anvin
@ 2012-07-02 15:59               ` H. Peter Anvin
  0 siblings, 0 replies; 12+ messages in thread
From: H. Peter Anvin @ 2012-07-02 15:59 UTC (permalink / raw)
  To: Hugo Mills, Chris Mason, Chris L. Mason, Marios Titas,
	linux-btrfs

On 06/25/2012 04:00 PM, H. Peter Anvin wrote:
>
> I am aware of that, and it is not a problem... the one-device
> bootloader can find out *which* disk it is talking to by comparing
> uuids, and the btrfs data structures will tell it how to find the data
> on that specific disk.  It does of course mean the bootloader needs to
> be aware of the multidisk nature of btrfs, but that isn't a problem in
> itself.
>

So, also, let me address the question of why we should care about a
one-device bootloader.  It is quite common, especially in fileservers,
for a subset of the boot devices to be inaccessible to the firmware, due
to bugs, boot time concerns (spinning up all the media in the firmware
is SLOW) or just plain lack of support for plug-in cards.  As such, the
reliable thing to do is to make sure that any single disk being seen is
enough to bring up the system; since this is such a small amount of data
by modern standards, there is just no reason to do anything less robust.

Once the kernel comes up it has all the device drivers, of course.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.





end of thread, other threads:[~2012-07-02 15:59 UTC | newest]

Thread overview: 12+ messages
2012-06-20 16:27 Feature request: true RAID-1 mode H. Peter Anvin
2012-06-21  0:35 ` Marios Titas
2012-06-21  0:50   ` Chris Mason
2012-06-21  1:34     ` H. Peter Anvin
2012-06-25 15:21       ` Chris Mason
2012-06-25 17:46         ` H. Peter Anvin
2012-06-25 22:34           ` Gareth Pye
     [not found]           ` <CA+WRLO87LvTrRJRMNwYQ3SwmZWF7WzO_FDi1=jqt1kwX=YSnWQ@mail.gmail.com>
2012-06-25 22:37             ` H. Peter Anvin
2012-06-25 22:46               ` Gareth Pye
2012-06-25 22:54           ` Hugo Mills
2012-06-25 23:00             ` H. Peter Anvin
2012-07-02 15:59               ` H. Peter Anvin
