* RAID1, hot-swap and boot integrity
@ 2007-03-02 14:04 Mike Accetta
2007-03-02 15:40 ` Justin Piszcz
` (5 more replies)
0 siblings, 6 replies; 14+ messages in thread
From: Mike Accetta @ 2007-03-02 14:04 UTC (permalink / raw)
To: linux-raid
We are using a RAID1 setup with two SATA disks on x86, using the whole
disks as the array components. I'm pondering the following scenario.
We will boot from whichever drive the BIOS has first in its boot list
(the other drive will be second). In the steady state this choice is
immaterial. However, if that first drive fails after we boot and has
to be hot-swapped *and* the system crashes or is rebooted while the
re-sync operation is still running, it seems possible (perhaps even
likely) that the BIOS will choose to boot from the same disk slot.
However, the drive in that slot is still being recovered and may not be
intact enough to boot from yet.
I've been considering trying something like having the re-sync algorithm
on a whole disk array defer the copy for sector 0 to the very end of the
re-sync operation. Assuming the BIOS makes at least a minimal consistency
check on sector 0 before electing to boot from the drive, this would keep
it from selecting a partially re-sync'd drive that was not previously bootable.
Another wrinkle might be to also have re-sync zap sector 0 initially so that
a previously bootable disk added as the replacement would not be booted
in an inconsistent state, although this could not eliminate the window
in which a crash or reboot before the re-sync even started could cause
the replacement disk with who knows what contents to be booted. This
also seems a fairly x86-centric solution, although that is fine for our
current application which is x86-based.
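A minimal sketch of the ordering proposed above, simulated on in-memory buffers rather than in the md driver. The 512-byte sector size, the function names and the `crash_after` parameter are assumptions made for illustration only; this is not how the kernel re-sync is actually implemented.

```python
SECTOR = 512  # bytes per sector (assumed)


def resync_defer_sector0(source, target, crash_after=None):
    """Copy source onto target sector by sector, writing sector 0 last.

    crash_after is a hypothetical knob: the number of sector copies after
    which the simulated re-sync stops, modelling a crash or reboot.
    Returns the number of sectors copied.
    """
    if len(source) != len(target) or len(source) % SECTOR:
        raise ValueError("buffers must be equal-sized and sector-aligned")
    nsectors = len(source) // SECTOR

    # Zap sector 0 first, so a previously bootable replacement disk
    # no longer looks bootable while it is being recovered.
    target[0:SECTOR] = bytes(SECTOR)

    # Copy sectors 1..n-1 in order, and sector 0 at the very end.
    copied = 0
    for s in list(range(1, nsectors)) + [0]:
        if crash_after is not None and copied >= crash_after:
            break
        target[s * SECTOR:(s + 1) * SECTOR] = source[s * SECTOR:(s + 1) * SECTOR]
        copied += 1
    return copied


def sector0_is_blank(disk):
    """True while sector 0 is still all zeros, i.e. the re-sync is unfinished."""
    return not any(disk[:SECTOR])


if __name__ == "__main__":
    good = bytearray(b"\xaa" * (4 * SECTOR))
    replacement = bytearray(b"\xff" * (4 * SECTOR))
    resync_defer_sector0(good, replacement, crash_after=2)
    print(sector0_is_blank(replacement))   # True: interrupted, sector 0 still zapped
    resync_defer_sector0(good, replacement)
    print(sector0_is_blank(replacement))   # False: sector 0 copied at the end
```

The sketch still has the window described above: nothing protects the replacement disk before the first step runs.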
Thoughts or other suggestions anyone?
--
Mike Accetta
ECI Telecom Ltd.
Data Networking Division (previously Laurel Networks)
* Re: RAID1, hot-swap and boot integrity
2007-03-02 14:04 RAID1, hot-swap and boot integrity Mike Accetta
@ 2007-03-02 15:40 ` Justin Piszcz
2007-03-02 16:02 ` Gabor Gombas
2007-03-02 16:10 ` Gabor Gombas
` (4 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Justin Piszcz @ 2007-03-02 15:40 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid
On Fri, 2 Mar 2007, Mike Accetta wrote:
> We are using a RAID1 setup with two SATA disks on x86, using the whole
> disks as the array components. I'm pondering the following scenario.
> We will boot from whichever drive the BIOS has first in its boot list
> (the other drive will be second). In the steady state this choice is
> immaterial. However, if after we boot that first drive fails and has
> to be hot-swapped *and* the system crashes or is rebooted while the
> re-sync operation is still running, it seems possible (perhaps even
> likely) that the BIOS will choose to boot from the same disk slot.
> However, the drive in that slot is still being recovered and may not be
> intact enough to boot from yet.
>
> I've been considering trying something like having the re-sync algorithm
> on a whole disk array defer the copy for sector 0 to the very end of the
> re-sync operation. Assuming the BIOS makes at least a minimal consistency
> check on sector 0 before electing to boot from the drive, this would keep
> it from selecting a partially re-sync'd drive that was not previously
> bootable.
> Another wrinkle might be to also have re-sync zap sector 0 initially so that
> a previously bootable disk added as the replacement would not be booted
> in an inconsistent state, although this could not eliminate the window
> in which a crash or reboot before the re-sync even started could cause
> the replacement disk with who knows what contents to be booted. This
> also seems a fairly x86 centric solution, although that is fine for our
> current application which is x86-based.
>
> Thoughts or other suggestions anyone?
> --
> Mike Accetta
>
> ECI Telecom Ltd.
> Data Networking Division (previously Laurel Networks)
"is rebooted while the re-sync operation is still running"
AFAIK mdadm/kernel RAID can handle this. On a number of occasions my UPS
shut my machine down while I was rebuilding a RAID5 array; when the box
came back up, the rebuild picked up where it left off.
* Re: RAID1, hot-swap and boot integrity
2007-03-02 15:40 ` Justin Piszcz
@ 2007-03-02 16:02 ` Gabor Gombas
0 siblings, 0 replies; 14+ messages in thread
From: Gabor Gombas @ 2007-03-02 16:02 UTC (permalink / raw)
To: Justin Piszcz; +Cc: Mike Accetta, linux-raid
On Fri, Mar 02, 2007 at 10:40:32AM -0500, Justin Piszcz wrote:
> AFAIK mdadm/kernel raid can handle this, I had a number of occasions when
> my UPS shut my machine down when I was rebuilding a RAID5 array, when the
> box came back up, the rebuild picked up where it left off.
_If_ the resync got far enough that the kernel image is already copied.
The original mail is about the case when the sectors where the
kernel/initramfs should be are not yet synced...
Gabor
--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
* Re: RAID1, hot-swap and boot integrity
2007-03-02 14:04 RAID1, hot-swap and boot integrity Mike Accetta
2007-03-02 15:40 ` Justin Piszcz
@ 2007-03-02 16:10 ` Gabor Gombas
2007-03-02 20:57 ` Bill Davidsen
2007-03-04 17:31 ` H. Peter Anvin
` (3 subsequent siblings)
5 siblings, 1 reply; 14+ messages in thread
From: Gabor Gombas @ 2007-03-02 16:10 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid
On Fri, Mar 02, 2007 at 09:04:40AM -0500, Mike Accetta wrote:
> Thoughts or other suggestions anyone?
This is a case where a very small /boot partition is still a very good
idea... 50-100MB is a good choice (some initramfs generators require
quite a bit of space under /boot while generating the initramfs image
esp. if you use distro-provided "contains-everything-and-the-kitchen-sink"
kernels, so it is not wise to make /boot _too_ small).
But if you do not want /boot to be separate, a moderately sized root
partition is equally good. What you want to avoid is the "whole disk is
a single partition/file system" kind of setup.
Gabor
--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
* Re: RAID1, hot-swap and boot integrity
2007-03-02 16:10 ` Gabor Gombas
@ 2007-03-02 20:57 ` Bill Davidsen
0 siblings, 0 replies; 14+ messages in thread
From: Bill Davidsen @ 2007-03-02 20:57 UTC (permalink / raw)
To: Gabor Gombas; +Cc: Mike Accetta, linux-raid
Gabor Gombas wrote:
> On Fri, Mar 02, 2007 at 09:04:40AM -0500, Mike Accetta wrote:
>
>
>> Thoughts or other suggestions anyone?
>>
>
> This is a case where a very small /boot partition is still a very good
> idea... 50-100MB is a good choice (some initramfs generators require
> quite a bit of space under /boot while generating the initramfs image
> esp. if you use distro-provided "contains-everything-and-the-kitchen-sink"
> kernels, so it is not wise to make /boot _too_ small).
>
You are exactly right on that! Some (many) BIOS implementations will
read the boot sector off the drive, and if there is no error will run
the boot sector.
> But if you do not want /boot to be separate a moderately sized root
> partition is equally good. What you want to avoid is the "whole disk is
> a single partition/file system" kind of setup.
>
>
Actually, the solution is moderately simple: install the replacement
drive, create the partitions, and **don't mark the boot partition
active** until the copy is complete. The BIOS will boot from the 1st
active partition it finds (again, in sane cases).
I never have anything changing in /boot in normal operation, so I admit
to using dd to do a copy with the array stopped. No particular reason to
think it works better than just a rebuild. After the partition is valid
I set the active flag in the partition.
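For reference, a rough sketch of what "the active flag" is at the byte level, assuming a classic MBR layout: four 16-byte partition entries starting at offset 446, with the first byte of each entry holding 0x80 when active. The helper names are made up for illustration, and the code works only on a 512-byte buffer; it does not write to any disk.

```python
PT_OFFSET = 446    # start of the partition table within the MBR
ENTRY_SIZE = 16    # bytes per partition entry
ACTIVE = 0x80      # status byte value for an active (bootable) entry


def active_partitions(mbr):
    """Return the 1-based numbers of primary partitions flagged active."""
    if len(mbr) < 512:
        raise ValueError("need a full 512-byte sector")
    return [i + 1 for i in range(4)
            if mbr[PT_OFFSET + i * ENTRY_SIZE] == ACTIVE]


def set_active(mbr, partition, active):
    """Return a copy of the MBR with the flag on `partition` (1-4) set or cleared."""
    if not 1 <= partition <= 4:
        raise ValueError("primary partitions are numbered 1 to 4")
    out = bytearray(mbr)
    out[PT_OFFSET + (partition - 1) * ENTRY_SIZE] = ACTIVE if active else 0x00
    return bytes(out)


if __name__ == "__main__":
    mbr = bytes(512)                       # blank table: nothing active
    print(active_partitions(mbr))          # []
    mbr = set_active(mbr, 1, True)         # mark partition 1 once the copy is valid
    print(active_partitions(mbr))          # [1]
```

In practice one would flip the flag with a partitioning tool rather than by hand; the point is only that it is a single byte per entry.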
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
* Re: RAID1, hot-swap and boot integrity
2007-03-02 14:04 RAID1, hot-swap and boot integrity Mike Accetta
2007-03-02 15:40 ` Justin Piszcz
2007-03-02 16:10 ` Gabor Gombas
@ 2007-03-04 17:31 ` H. Peter Anvin
[not found] ` <9782.20070302161004.GE31010@boogie.lpds.sztaki.hu>
` (2 subsequent siblings)
5 siblings, 0 replies; 14+ messages in thread
From: H. Peter Anvin @ 2007-03-04 17:31 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid
Mike Accetta wrote:
>
> I've been considering trying something like having the re-sync algorithm
> on a whole disk array defer the copy for sector 0 to the very end of the
> re-sync operation. Assuming the BIOS makes at least a minimal consistency
> check on sector 0 before electing to boot from the drive, this would keep
> it from selecting a partially re-sync'd drive that was not previously
> bootable.
The only check that it will make is to look for 55 AA at the end of the MBR.
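As an illustration of how little that check covers, here is a small sketch assuming a 512-byte sector 0 with the signature bytes at offsets 510 and 511. The function name is hypothetical.

```python
def has_boot_signature(sector0):
    """True if sector 0 ends with the 55 AA signature.

    This is the only property tested here; the boot code and partition
    table in the rest of the sector are not examined at all.
    """
    return len(sector0) >= 512 and sector0[510] == 0x55 and sector0[511] == 0xAA


if __name__ == "__main__":
    sector = bytearray(512)
    print(has_boot_signature(sector))      # False: blank sector
    sector[510:512] = b"\x55\xaa"
    print(has_boot_signature(sector))      # True, even though the rest is zeros
```

A sector that passes this test can still hold arbitrary or half-copied contents, which is why the signature alone says little about whether a drive is safe to boot from.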
Note that typically the MBR is not part of any of your MD volumes.
-hpa
* Re: RAID1, hot-swap and boot integrity
[not found] ` <9782.20070302161004.GE31010@boogie.lpds.sztaki.hu>
@ 2007-03-05 23:32 ` Mike Accetta
2007-03-06 10:01 ` Gabor Gombas
0 siblings, 1 reply; 14+ messages in thread
From: Mike Accetta @ 2007-03-05 23:32 UTC (permalink / raw)
To: Gabor Gombas; +Cc: linux-raid
Gabor Gombas wrote:
> On Fri, Mar 02, 2007 at 09:04:40AM -0500, Mike Accetta wrote:
>
>> Thoughts or other suggestions anyone?
>
> This is a case where a very small /boot partition is still a very good
> idea... 50-100MB is a good choice (some initramfs generators require
> quite a bit of space under /boot while generating the initramfs image
> esp. if you use distro-provided "contains-everything-and-the-kitchen-sink"
> kernels, so it is not wise to make /boot _too_ small).
>
> But if you do not want /boot to be separate a moderately sized root
> partition is equally good. What you want to avoid is the "whole disk is
> a single partition/file system" kind of setup.
Yes, we actually have a separate (smallish) boot partition at the front of
the array. This does reduce the at-risk window substantially. I'll have to
ponder whether it reduces it close enough to negligible to then ignore, but
that is indeed a good point to consider.
--
Mike Accetta
ECI Telecom Ltd.
Data Networking Division (previously Laurel Networks)
* Re: RAID1, hot-swap and boot integrity
[not found] ` <12470.45E88FAA.5020106@tmr.com>
@ 2007-03-05 23:38 ` Mike Accetta
2007-03-06 23:17 ` Bill Davidsen
2007-03-07 20:48 ` H. Peter Anvin
0 siblings, 2 replies; 14+ messages in thread
From: Mike Accetta @ 2007-03-05 23:38 UTC (permalink / raw)
To: Bill Davidsen; +Cc: linux-raid
Bill Davidsen wrote:
> Gabor Gombas wrote:
>> On Fri, Mar 02, 2007 at 09:04:40AM -0500, Mike Accetta wrote:
>>
>>
>>> Thoughts or other suggestions anyone?
>>>
>>
>> This is a case where a very small /boot partition is still a very good
>> idea... 50-100MB is a good choice (some initramfs generators require
>> quite a bit of space under /boot while generating the initramfs image
>> esp. if you use distro-provided
>> "contains-everything-and-the-kitchen-sink"
>> kernels, so it is not wise to make /boot _too_ small).
>>
> You are exactly right on that! Some (many) BIOS implementations will
> read the boot sector off the drive, and if there is no error will run
> the boot sector.
>> But if you do not want /boot to be separate a moderately sized root
>> partition is equally good. What you want to avoid is the "whole disk is
>> a single partition/file system" kind of setup.
>>
>>
> Actually, the solution is moderately simple, install the replacement
> drive, create the partitions, and **don't mark the boot partition
> active** until the copy is complete. The BIOS will boot from the 1st
> active partition it finds (again, in sane cases).
>
> I never have anything changing in /boot in normal operation, so I admit
> to using dd to do a copy with the array stopped. No particular reason to
> think it works better than just a rebuild. After the partition is valid
> I set the active flag in the partition.
>
I gathered the impression somewhere, perhaps incorrectly, that the active
flag was a function of the boot block, not the BIOS. We use Grub in the MBR
and don't even have an active flag set in the partition table. The system
still boots.
--
Mike Accetta
ECI Telecom Ltd.
Data Networking Division (previously Laurel Networks)
* Re: RAID1, hot-swap and boot integrity
[not found] ` <12721.45EB026B.1040503@zytor.com>
@ 2007-03-05 23:47 ` Mike Accetta
2007-03-05 23:55 ` H. Peter Anvin
0 siblings, 1 reply; 14+ messages in thread
From: Mike Accetta @ 2007-03-05 23:47 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: linux-raid
H. Peter Anvin wrote:
> Mike Accetta wrote:
>>
>> I've been considering trying something like having the re-sync algorithm
>> on a whole disk array defer the copy for sector 0 to the very end of the
>> re-sync operation. Assuming the BIOS makes at least a minimal
>> consistency
>> check on sector 0 before electing to boot from the drive, this would keep
>> it from selecting a partially re-sync'd drive that was not previously
>> bootable.
>
> The only check that it will make is to look for 55 AA at the end of the
> MBR.
>
> Note that typically the MBR is not part of any of your MD volumes.
Yes, that is also what I've observed in the case of our BIOS. I'm still
trying to get our BIOS vendor to confirm that it will fail over to the next
drive in the boot list on a read error of sector 0. We're contemplating
some GRUB hacking to fail-over to the "other" drive once it is in control
and sees problems.
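The fail-over behaviour in question can be modelled roughly as follows. This is a toy model of the hoped-for logic, not a description of any real BIOS: each drive is represented by a hypothetical callable that returns sector 0 or raises OSError on a read error.

```python
def pick_boot_drive(boot_list):
    """Return the index of the first drive that reads cleanly and carries 55 AA.

    boot_list is a sequence of zero-argument callables, one per drive in
    BIOS boot order, each returning the bytes of sector 0 or raising
    OSError. Returns None if no drive qualifies.
    """
    for index, read_sector0 in enumerate(boot_list):
        try:
            sector = read_sector0()
        except OSError:
            continue  # read error: move on to the next drive in the list
        if len(sector) >= 512 and sector[510:512] == b"\x55\xaa":
            return index
    return None


if __name__ == "__main__":
    signed = bytes(510) + b"\x55\xaa"

    def failing_drive():
        raise OSError("simulated read error on sector 0")

    print(pick_boot_drive([failing_drive, lambda: signed]))   # 1
    print(pick_boot_drive([lambda: signed, lambda: signed]))  # 0
```

The second call shows the remaining gap: a half-recovered first drive whose sector 0 reads fine and is signed would still be chosen, so any further fail-over has to happen later, for example in the boot loader.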
I wonder if having the MBR typically outside of the array and the relative
newness of partitioned arrays are related? When I was considering how to
architect the RAID1 layout it seemed like a partitioned array on the
entire disk worked most naturally.
--
Mike Accetta
ECI Telecom Ltd.
Data Networking Division (previously Laurel Networks)
* Re: RAID1, hot-swap and boot integrity
2007-03-05 23:47 ` Mike Accetta
@ 2007-03-05 23:55 ` H. Peter Anvin
2007-03-06 23:18 ` Bill Davidsen
0 siblings, 1 reply; 14+ messages in thread
From: H. Peter Anvin @ 2007-03-05 23:55 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid
Mike Accetta wrote:
> I wonder if having the MBR typically outside of the array and the relative
> newness of partitioned arrays are related? When I was considering how to
> architect the RAID1 layout it seemed like a partitioned array on the
> entire disk worked most naturally.
It's one way to do it, for sure. The main problem with that, of course,
is that it's not compatible with other operating systems.
-hpa
* Re: RAID1, hot-swap and boot integrity
2007-03-05 23:32 ` Mike Accetta
@ 2007-03-06 10:01 ` Gabor Gombas
0 siblings, 0 replies; 14+ messages in thread
From: Gabor Gombas @ 2007-03-06 10:01 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid
On Mon, Mar 05, 2007 at 06:32:32PM -0500, Mike Accetta wrote:
> Yes, we actually have a separate (smallish) boot partition at the front of
> the array. This does reduce the at-risk window substantially. I'll have to
> ponder whether it reduces it close enough to negligible to then ignore, but
> that is indeed a good point to consider.
Replacing a failed disk requires a human to pull out the old disk and
insert the new one. The /boot partition should resync in less than 1
minute (if not, it's _way_ too big), so the same human should still be
around to kick the machine if something goes wrong.
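A back-of-the-envelope version of that estimate, with the re-sync rate as an explicit assumption (10 MB/s here is a placeholder, not a measured figure):

```python
def resync_seconds(size_mb, rate_mb_per_s):
    """Time to re-sync a partition of size_mb at a steady rate_mb_per_s."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_mb / rate_mb_per_s


def min_rate_for(size_mb, deadline_s):
    """Slowest steady rate (MB/s) that still finishes within deadline_s."""
    if deadline_s <= 0:
        raise ValueError("deadline must be positive")
    return size_mb / deadline_s


if __name__ == "__main__":
    assumed_rate = 10.0  # MB/s, assumed for illustration
    for size in (50, 100):
        print(size, "MB:", resync_seconds(size, assumed_rate), "s")
    print("rate needed for 100 MB in 60 s:", min_rate_for(100, 60), "MB/s")
```

At the assumed rate a 100 MB /boot takes 10 seconds, and anything above roughly 1.7 MB/s keeps it under the one-minute mark.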
Gabor
--
---------------------------------------------------------
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
---------------------------------------------------------
* Re: RAID1, hot-swap and boot integrity
2007-03-05 23:38 ` Mike Accetta
@ 2007-03-06 23:17 ` Bill Davidsen
2007-03-07 20:48 ` H. Peter Anvin
1 sibling, 0 replies; 14+ messages in thread
From: Bill Davidsen @ 2007-03-06 23:17 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid
Mike Accetta wrote:
> Bill Davidsen wrote:
>> Gabor Gombas wrote:
>>> On Fri, Mar 02, 2007 at 09:04:40AM -0500, Mike Accetta wrote:
>>>
>>>
>>>> Thoughts or other suggestions anyone?
>>>>
>>>
>>> This is a case where a very small /boot partition is still a very good
>>> idea... 50-100MB is a good choice (some initramfs generators require
>>> quite a bit of space under /boot while generating the initramfs image
>>> esp. if you use distro-provided
>>> "contains-everything-and-the-kitchen-sink"
>>> kernels, so it is not wise to make /boot _too_ small).
>>>
>> You are exactly right on that! Some (many) BIOS implementations will
>> read the boot sector off the drive, and if there is no error will run
>> the boot sector.
>>> But if you do not want /boot to be separate a moderately sized root
>>> partition is equally good. What you want to avoid is the "whole disk is
>>> a single partition/file system" kind of setup.
>>>
>>>
>> Actually, the solution is moderately simple, install the replacement
>> drive, create the partitions, and **don't mark the boot partition
>> active** until the copy is complete. The BIOS will boot from the 1st
>> active partition it finds (again, in sane cases).
>>
>> I never have anything changing in /boot in normal operation, so I
>> admit to using dd to do a copy with the array stopped. No particular
>> reason to think it works better than just a rebuild. After the
>> partition is valid I set the active flag in the partition.
>>
>
> I gathered the impression somewhere, perhaps incorrectly, that the active
> flag was a function of the boot block, not the BIOS. We use Grub in
> the MBR
> and don't even have an active flag set in the partition table. The
> system
> still boots.
Now that you mention that, I have been doing the active bit thing since
pre-grub (possibly pre-lilo) days, so it may not be effective with grub
in the MBR. Something to add to my "when I'm bored and want to try
something" list. Thanks for making the point.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
* Re: RAID1, hot-swap and boot integrity
2007-03-05 23:55 ` H. Peter Anvin
@ 2007-03-06 23:18 ` Bill Davidsen
0 siblings, 0 replies; 14+ messages in thread
From: Bill Davidsen @ 2007-03-06 23:18 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Mike Accetta, linux-raid
H. Peter Anvin wrote:
> Mike Accetta wrote:
>> I wonder if having the MBR typically outside of the array and the
>> relative
>> newness of partitioned arrays are related? When I was considering
>> how to
>> architect the RAID1 layout it seemed like a partitioned array on the
>> entire disk worked most naturally.
>
> It's one way to do it, for sure. The main problem with that, of
> course, is that it's not compatible with other operating systems.
sed s/problem/advantage/ ;-)
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
* Re: RAID1, hot-swap and boot integrity
2007-03-05 23:38 ` Mike Accetta
2007-03-06 23:17 ` Bill Davidsen
@ 2007-03-07 20:48 ` H. Peter Anvin
1 sibling, 0 replies; 14+ messages in thread
From: H. Peter Anvin @ 2007-03-07 20:48 UTC (permalink / raw)
To: Mike Accetta; +Cc: Bill Davidsen, linux-raid
Mike Accetta wrote:
> I gathered the impression somewhere, perhaps incorrectly, that the active
> flag was a function of the boot block, not the BIOS. We use Grub in the
> MBR and don't even have an active flag set in the partition table. The system
> still boots.
The active flag is indeed an MBR issue.
-hpa