* Bug with RAID1 hot spares?
@ 2006-10-21 1:46 Chase Venters
2006-10-23 4:13 ` Neil Brown
2006-10-25 1:41 ` Bill Davidsen
0 siblings, 2 replies; 5+ messages in thread
From: Chase Venters @ 2006-10-21 1:46 UTC
To: linux-raid; +Cc: jason.meinzer
Greetings,
I was just testing a server I was about to send into production on kernel
2.6.18.1. The server has three SCSI disks with "md1" set to a RAID1 with 2
mirrors and 1 spare. The mirrors are sda3 and sdb3, spare is sdc3. I manually
failed sdb3, and as expected, sdc3 was activated. Strangely
enough, /proc/mdstat did not indicate that sdc3 was being synced. I thought
these spares weren't kept mirrored until needed?
In order to further test my theory, I manually failed sda3, leaving only sdc3
(the original spare) active. I ran "find /" for a bit to see if any errors
cropped up and none did; however, when I added sda3 and sdb3 back to the
array and a resync started, I was soon faced with what appeared to be a
_very_ corrupted reiserfs.
Strangely enough, after booting on a livecd and assembling md1 with just
sda3, I was able to add sdb3 and sdc3, after which the array resynced and
left sdb3 a mirror and sdc3 a spare.
So there's definitely something odd happening here... why did no resync to
the sdc3 spare start when I failed sdb3?
Thanks,
Chase
* Re: Bug with RAID1 hot spares?
From: Neil Brown @ 2006-10-23 4:13 UTC
To: Chase Venters; +Cc: linux-raid, jason.meinzer
On Friday October 20, chase.venters@clientec.com wrote:
> Greetings,
> I was just testing a server I was about to send into production on kernel
> 2.6.18.1. The server has three SCSI disks with "md1" set to a RAID1 with 2
> mirrors and 1 spare. The mirrors are sda3 and sdb3, spare is sdc3. I manually
> failed sdb3, and as expected, sdc3 was activated. Strangely
> enough, /proc/mdstat did not indicate that sdc3 was being synced. I thought
> these spares weren't kept mirrored until needed?
Correct. They are not kept mirrored.
> So there's definitely something odd happening here... why did no resync to
> the sdc3 spare start when I failed sdb3?
Yes... can you check that this fixes it, please?
Thanks,
NeilBrown
Signed-off-by: Neil Brown <neilb@suse.de>
### Diffstat output
./drivers/md/md.c | 1 +
1 file changed, 1 insertion(+)
diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c 2006-10-23 12:09:08.000000000 +1000
+++ ./drivers/md/md.c 2006-10-23 14:10:46.000000000 +1000
@@ -2003,6 +2003,7 @@ static mdk_rdev_t *md_import_device(dev_
 	kobject_init(&rdev->kobj);
 	rdev->desc_nr = -1;
+	rdev->saved_raid_disk = -1;
 	rdev->flags = 0;
 	rdev->data_offset = 0;
 	rdev->sb_events = 0;
* Re: Bug with RAID1 hot spares?
From: Bill Davidsen @ 2006-10-25 1:41 UTC
To: Chase Venters; +Cc: linux-raid, jason.meinzer
Chase Venters wrote:
>Greetings,
> I was just testing a server I was about to send into production on kernel
>2.6.18.1. The server has three SCSI disks with "md1" set to a RAID1 with 2
>mirrors and 1 spare.
>
I have to ask, why? If the array is mostly written you might save a bit
of bus time, but for reads having another copy of the data to read
(usually) helps the performance by reducing wait for read occurrences.
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
* Re: Bug with RAID1 hot spares?
From: Chase Venters @ 2006-10-25 4:43 UTC
To: Bill Davidsen; +Cc: linux-raid, jason.meinzer
On Tuesday 24 October 2006 20:41, Bill Davidsen wrote:
> Chase Venters wrote:
> >Greetings,
> > I was just testing a server I was about to send into production on kernel
> >2.6.18.1. The server has three SCSI disks with "md1" set to a RAID1 with 2
> >mirrors and 1 spare.
>
> I have to ask, why? If the array is mostly written you might save a bit
> of bus time, but for reads having another copy of the data to read
> (usually) helps the performance by reducing wait for read occurrences.
The main idea is to not exercise the spare as much as the other disks. All
three disks are from the same lot. Having three disks fail at once is
admittedly unlikely, but keeping one disk as a spare rather than a full mirror
should reduce the wear on that disk. If there is some manufacturing defect,
the third drive wouldn't be as close to failing and could hopefully keep the
box online until someone makes it to the datacenter to do a swap.
Thanks,
Chase
* Re: Bug with RAID1 hot spares?
From: Mario 'BitKoenig' Holbe @ 2006-10-25 10:41 UTC
To: linux-raid
Chase Venters <chase.venters@clientec.com> wrote:
> The main idea is to not exercise the spare as much as the other disks. All
Btw. you can also keep the spare disk spun down most of the time.
You should probably just make sure to spin it up from time to time to
see if it's still okay - I spin up my spares one hour per night when
smartd issues short selftests and a few more hours when smartd issues
long selftests.
regards
Mario
--
<jv> Oh well, config
<jv> one actually wonders what force in the universe is holding it
<jv> and makes it working
<Beeth> chances and accidents :)