* raid0 fail to detect drive failure
From: Ming Zhang @ 2005-11-01 16:02 UTC
To: Linux RAID
Hi folks,
I have a raid0 on top of two SATA disks, sda and sdb. After I hot-unplug
sda, the raid0 still shows as online and active. Writing to it with dd
fails and dmesg shows SCSI I/O errors, but /proc/mdstat shows everything
is OK.
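To make the /proc/mdstat observation concrete, here is a minimal Python sketch that parses mdstat-style text and reports which members carry the (F) faulty flag. The sample text is hypothetical (it was not captured from the system in this thread), the raid1 entry is included only as a contrast, and the function and pattern names are illustrative assumptions. The regular expressions cover only the simple line shapes shown here.

```python
import re

# Hypothetical /proc/mdstat contents; NOT captured from the system in this
# thread. md1 is included only to show what a flagged member looks like.
SAMPLE_MDSTAT = """\
Personalities : [raid0] [raid1]
md0 : active raid0 sdb[1] sda[0]
      156249856 blocks 64k chunks

md1 : active raid1 sdd[1](F) sdc[0]
      78124928 blocks [2/1] [U_]

unused devices: <none>
"""

# "md0 : active raid0 sdb[1] sda[0]" -> name, state, level, member list
ARRAY_LINE = re.compile(r"^(md\d+) : (\S+) (\S+) (.*)$")
# "sdd[1](F)" -> device name, slot number, optional faulty flag
MEMBER = re.compile(r"(\w+)\[(\d+)\](\(F\))?")


def parse_mdstat(text):
    """Return {array: {"state", "level", "members": {dev: is_faulty}}}."""
    arrays = {}
    for line in text.splitlines():
        m = ARRAY_LINE.match(line)
        if not m:
            continue
        name, state, level, rest = m.groups()
        members = {dev: bool(flag) for dev, _slot, flag in MEMBER.findall(rest)}
        arrays[name] = {"state": state, "level": level, "members": members}
    return arrays


def faulty_members(arrays):
    """Return {array: [devices flagged (F)]} for arrays with any flagged member."""
    result = {}
    for name, info in arrays.items():
        bad = [dev for dev, is_faulty in info["members"].items() if is_faulty]
        if bad:
            result[name] = bad
    return result
```

With the sample above, `faulty_members(parse_mdstat(SAMPLE_MDSTAT))` lists only the raid1 member. A raid0 line that never gains an (F) flag, as described in this report, would not appear no matter what happened to its disks.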
I checked kernels 2.4.27 and 2.6.11.2; both show the same problem.
mdadm is version 1.8.
Any hints?
Ming
* Re: raid0 fail to detect drive failure
From: Michael Tokarev @ 2005-11-02 9:08 UTC
To: mingz; +Cc: Linux RAID
Ming Zhang wrote:
> Hi folks,
>
> I have a raid0 on top of two SATA disks, sda and sdb. After I hot-unplug
> sda, the raid0 still shows as online and active. Writing to it with dd
> fails and dmesg shows SCSI I/O errors, but /proc/mdstat shows everything
> is OK.
Since raid0 isn't really RAID (as in Redundant) and can't really do
anything about I/O errors on component devices, this behaviour
(returning I/O errors to the application) is the only sane way
to go. The array should not be failed, just as when your disk drive
has a bad sector on it, the whole partition (or whole disk) with
that bad sector isn't "marked as failed". So what you see is
exactly correct behaviour, in my opinion anyway.
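The pass-through behaviour described above can be sketched with a toy model in Python. This is not the kernel's raid0 code: the `Disk` and `Stripe` classes, the one-chunk-per-read interface, and the simple `chunk % n` layout are all assumptions made for illustration.

```python
class Disk:
    """Toy block device; every read fails once the disk is unplugged."""

    def __init__(self, name, n_chunks):
        self.name = name
        self.chunks = [f"{name}:{i}" for i in range(n_chunks)]
        self.unplugged = False

    def read(self, index):
        if self.unplugged:
            raise IOError(f"I/O error on {self.name}")
        return self.chunks[index]


class Stripe:
    """Toy raid0: logical chunk c lives on disk c % n at offset c // n."""

    def __init__(self, disks):
        self.disks = disks
        self.state = "active"  # never changed below, mirroring the report

    def read(self, chunk):
        n = len(self.disks)
        disk = self.disks[chunk % n]
        # No error handling and no state change: a member's I/O error
        # simply propagates to the caller.
        return disk.read(chunk // n)


sda, sdb = Disk("sda", 4), Disk("sdb", 4)
md0 = Stripe([sda, sdb])
sda.unplugged = True
print(md0.read(1))   # served by sdb
print(md0.state)     # still "active"
```

In this model a read of chunk 0 raises `IOError` after the unplug, while the array object keeps reporting "active", which matches the symptom in the original report.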
/mjt
* Re: raid0 fail to detect drive failure
From: Ming Zhang @ 2005-11-02 13:40 UTC
To: Michael Tokarev; +Cc: Linux RAID
On Wed, 2005-11-02 at 12:08 +0300, Michael Tokarev wrote:
> Ming Zhang wrote:
> > Hi folks,
> >
> > I have a raid0 on top of two SATA disks, sda and sdb. After I hot-unplug
> > sda, the raid0 still shows as online and active. Writing to it with dd
> > fails and dmesg shows SCSI I/O errors, but /proc/mdstat shows everything
> > is OK.
>
> Since raid0 isn't really RAID (as in Redundant) and can't really do
> anything about I/O errors on component devices, this behaviour
> (returning I/O errors to the application) is the only sane way
> to go. The array should not be failed, just as when your disk drive
> has a bad sector on it, the whole partition (or whole disk) with
> that bad sector isn't "marked as failed". So what you see is
> exactly correct behaviour, in my opinion anyway.
>
> /mjt
After I sent the email, I read the raid0 code and there is no error
handling at all, so now I know why it behaves like that.
In my case, a 2-disk raid0, one broken disk means 50% of the sectors on
the array, viewed as a single disk, are bad. I would call that a failed
disk, and I bet you would not use such a disk any more even if you
called it a not-failed disk. ;)
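The 50% figure can be checked with a short Python sketch. It assumes a simplified layout in which logical chunk c lives on member c % n, with equal-sized members; the function names are illustrative and are not taken from the md code.

```python
def chunk_owner(chunk, n_disks):
    """Index of the member disk holding a logical chunk (assumed layout)."""
    return chunk % n_disks


def unreadable_fraction(n_disks, failed, total_chunks):
    """Fraction of logical chunks that map to a failed member."""
    lost = sum(1 for c in range(total_chunks)
               if chunk_owner(c, n_disks) in failed)
    return lost / total_chunks


def readable_chunks(n_disks, failed, total_chunks):
    """Logical chunks that still map to a healthy member."""
    return [c for c in range(total_chunks)
            if chunk_owner(c, n_disks) not in failed]


print(unreadable_fraction(2, {0}, 1000))  # 0.5
print(readable_chunks(2, {0}, 8))         # [1, 3, 5, 7]
```

Under this layout the surviving half is interleaved chunk by chunk with the lost half, so there is no contiguous usable region larger than one chunk.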
But as you said, raid0 is not real RAID, so maybe that is why there is
no error checking here.
Thanks!
Ming