* [PATCH-2.4] raid5 oopses on 2nd failed device
@ 2003-04-15 10:42 Alex Tomas
From: Alex Tomas @ 2003-04-15 10:42 UTC
To: neilb; +Cc: linux-kernel
hi!
it looks like raid5 is buggy: when a 2nd device fails, I get an oops.
it's the BUG at raid5.c:212, triggered because sh->bh_written[N] isn't NULL.
the raid5d thread tries to handle this stripe, but the following condition
makes it skip the handling:
	/* might be able to return some write requests if the parity block
	 * is safe, or on a failed drive
	 */
	bh = sh->bh_cache[sh->pd_idx];
	if ( written &&
	     ( (conf->disks[sh->pd_idx].operational && !buffer_locked(bh) && buffer_uptodate(bh))
	       || (failed == 1 && failed_num == sh->pd_idx))
	    ) {
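
To illustrate why this test can be false once two devices have failed, here is a minimal standalone sketch. It assumes nothing beyond the quoted condition; struct stripe_model and can_return_writes() are hypothetical names that do not exist in raid5.c, and the fields are simplified stand-ins for the values handle_stripe() reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the values the quoted condition reads. */
struct stripe_model {
	int  written;        /* devices with completed writes waiting to be returned */
	int  failed;         /* number of failed devices seen for this stripe */
	int  failed_num;     /* index of a failed device */
	int  pd_idx;         /* index of the parity device */
	bool pd_operational; /* models conf->disks[pd_idx].operational */
	bool pd_locked;      /* models buffer_locked(bh) for the parity buffer */
	bool pd_uptodate;    /* models buffer_uptodate(bh) for the parity buffer */
};

/* Mirrors the quoted condition: true when written requests may be returned. */
static bool can_return_writes(const struct stripe_model *s)
{
	assert(s != NULL);
	return s->written &&
	       ((s->pd_operational && !s->pd_locked && s->pd_uptodate) ||
	        (s->failed == 1 && s->failed_num == s->pd_idx));
}
```

In this model, a stripe whose parity device is not operational and which has failed == 2 makes can_return_writes() return false, so nothing clears the written lists; with failed == 1 and failed_num == pd_idx it returns true.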
I suggest failing the requests that can't be returned as successful.
with best regards
diff -puNr linux/drivers/md/raid5.c edited/drivers/md/raid5.c
--- linux/drivers/md/raid5.c Wed Jan 15 20:18:37 2003
+++ edited/drivers/md/raid5.c Tue Apr 15 12:59:24 2003
@@ -938,7 +938,19 @@ static void handle_stripe(struct stripe_
 			}
 		}
 	}
-
+
+	/* if already written requests can't be returned as successful fail them */
+	if (failed > 1 && written) {
+		for (i=disks; i--; ) {
+			if (sh->bh_written[i]) written--;
+			while ((bh = sh->bh_written[i])) {
+				sh->bh_written[i] = bh->b_reqnext;
+				bh->b_reqnext = return_fail;
+				return_fail = bh;
+			}
+		}
+	}
+
 	/* Now we might consider reading some blocks, either to check/generate
 	 * parity, or to satisfy requests
 	 */
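
The loop added by the patch can be tried in isolation with the sketch below. It is only a userspace model: struct bh_model and fail_written() are hypothetical names, and only the b_reqnext chaining is taken from the patch. The return value counts the non-empty per-device lists, which corresponds to how the patch decrements written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the 2.4 struct buffer_head. */
struct bh_model {
	struct bh_model *b_reqnext; /* next request in the chain */
	int id;                     /* only for telling entries apart */
};

/*
 * Move every entry of each per-device list onto the head of *return_fail,
 * walking the devices from last to first as the patch does.
 * Returns the number of non-empty lists that were drained.
 */
static int fail_written(struct bh_model **bh_written, int disks,
			struct bh_model **return_fail)
{
	int drained = 0;

	assert(bh_written != NULL && return_fail != NULL);
	for (int i = disks; i--; ) {
		struct bh_model *bh;

		if (bh_written[i])
			drained++;
		while ((bh = bh_written[i]) != NULL) {
			bh_written[i] = bh->b_reqnext; /* unlink from device list */
			bh->b_reqnext = *return_fail;  /* push onto the fail list */
			*return_fail = bh;
		}
	}
	return drained;
}
```

After the call every per-device list is empty, which in the real code is the state the BUG check expects before a stripe is reused.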