about faulty spare-disk interrupts synchronization

linux-raid.vger.kernel.org archive mirror

* about faulty spare-disk interrupts synchronization
From: spren.gm @ 2009-12-23 11:51 UTC
  To: linux-raid@vger.kernel.org

Hi,
Is it intended that synchronization is interrupted when a spare disk becomes faulty
(detached from the raid or actually faulty)? We hit this case several days ago on kernel 2.6.24:
after we unplugged a spare disk from a raid5 that had a bitmap and was recovering, the spare disk's
status became faulty and synchronization restarted from 0%.

Looking into the md code, I found that md_error() in md/md.c does not distinguish between
spare disks and normal disks. Should a faulty spare disk be kept from interrupting raid synchronization?
Disks have become much larger, and recovering a single disk can take several hours or even longer.
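
To make the question concrete, here is a toy model in Python (not kernel code) of the two
policies being compared: interrupting recovery on any device failure, versus ignoring the
failure of a spare that is not taking part in the recovery. All names (Role, Array,
on_device_error_current, on_device_error_proposed) are hypothetical and do not correspond
to md internals; the "current" policy only mirrors the behaviour described in this message.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ACTIVE = "active"            # in-sync member of the array
    RECOVERING = "recovering"    # spare currently being rebuilt onto
    IDLE_SPARE = "idle_spare"    # spare not involved in the running recovery


@dataclass
class Array:
    recovery_running: bool = True
    interrupted: bool = False


def on_device_error_current(array: Array, role: Role) -> bool:
    """Observed policy: any device failure interrupts a running recovery."""
    if array.recovery_running:
        array.interrupted = True
    return array.interrupted


def on_device_error_proposed(array: Array, role: Role) -> bool:
    """Proposed policy: an idle spare failing leaves the recovery alone."""
    if array.recovery_running and role is not Role.IDLE_SPARE:
        array.interrupted = True
    return array.interrupted
```

Under this sketch, only the proposed policy lets a recovery continue when an idle spare is
unplugged; a failure of an active member or of the recovery target still interrupts it.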

--------------
spren.gm@gmail.com
2009-12-23


* Re: about faulty spare-disk interrupts synchronization
From: Neil Brown @ 2009-12-23 23:31 UTC
  To: spren.gm@gmail.com; +Cc: linux-raid@vger.kernel.org

On Wed, 23 Dec 2009 19:51:33 +0800
"spren.gm@gmail.com" <spren.gm@gmail.com> wrote:

> Hi,
> Is it intended that when a spare disk status gets faulty (detached from raid or really faulty) 
> synchronization is interrupted ? We found that case several days ago with kernel version of 2.6.24, 
> after we unplugged a spare disk of a raid5 which had bitmap and was recovering, the spare disk 
> status became faulty  and synchronization restarted from 0%. 
> 
> Looking into the md code, i find that in md/md.c/md_error(), it doesn't make a difference between 
> spare disks and normal disks. Should we make a faulty spare disk not interrupt raid synchronization ?
> Disks nowadays have become much larger, and recovering one disk may cost several hours or even longer.
>

Yes, it is intended that any synchronisation is interrupted when any device
fails.
However, if the device was just an inactive spare, then the synchronisation
should start again from the same place that it was up to, or at least it
should repeat the already-done part very, very quickly.

Can you test on a more recent kernel?

Can you give precise details of steps and kernel log messages and mdstat
output?
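
One possible way to capture the mdstat side of such a report is a small Python helper that
reads /proc/mdstat and extracts the progress figure before and after the spare is pulled.
This is a minimal sketch: the function name parse_progress and both regular expressions are
assumptions based on the usual "recovery = NN.N%" progress line, not something taken from
this thread.

```python
import re
from pathlib import Path

# An array stanza starts with a line such as "md0 : active raid5 ...".
ARRAY_RE = re.compile(r"^(md\S*)\s*:")
# A progress line contains e.g. "recovery = 12.6% (...)".
PROGRESS_RE = re.compile(r"(resync|recovery|reshape|check)\s*=\s*([\d.]+)%")


def parse_progress(mdstat_text: str) -> dict:
    """Map each array name to (action, percent done) if a sync is running."""
    progress = {}
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        found = PROGRESS_RE.search(line)
        if found and current is not None:
            progress[current] = (found.group(1), float(found.group(2)))
    return progress


if __name__ == "__main__":
    mdstat = Path("/proc/mdstat")
    if mdstat.exists():
        print(parse_progress(mdstat.read_text()))
```

Running it once before unplugging the spare and once afterwards would show whether the
percentage really drops back to 0 or resumes near where it stopped.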

NeilBrown

