* detecting read errors after RAID1 check operation
@ 2007-08-15 22:03 Mike Accetta
  2007-08-15 22:55 ` Neil Brown
  0 siblings, 1 reply; 6+ messages in thread

From: Mike Accetta @ 2007-08-15 22:03 UTC (permalink / raw)
To: linux-raid

We run a "check" operation periodically to try to turn up problems with
drives about to go bad before they become too severe.  In particular, if
there were any drive read errors during the check operation I would like
to be able to notice and raise an alarm for human attention, so that the
failing drive can be replaced sooner rather than later.  I'm looking for
a programmatic way to detect this reliably without having to grovel
through the log files looking for kernel hard drive error messages that
may have occurred during the check operation.

There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
which would be very convenient to consult, but according to the kernel
driver implementation the error counts reported there are apparently for
corrected errors and not relevant for read errors during a "check"
operation.

I am contemplating adding a parallel /sys file that would report all
errors, not just the corrected ones.  Does this seem reasonable?  Are
there other alternatives that might make sense here?
--
Mike Accetta

ECI Telecom Ltd.
Transport Networking Division, US (previously Laurel Networks)
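[A minimal sketch of the kind of programmatic probe the post is after, assuming the md sysfs layout it quotes (a per-array directory such as /sys/block/md_d0/md with per-member dev-*/errors files); md_d0 and sdb are the poster's example names, and the helper name is hypothetical:]

```shell
#!/bin/sh
# Run after a 'check' pass completes: report each member's 'errors' count
# (the corrected-error counter the post refers to).
report_errors() {   # $1 = sysfs md directory, e.g. /sys/block/md_d0/md
	for dev in "$1"/dev-*; do
		printf '%s: %s corrected errors\n' "${dev##*/}" "$(cat "$dev/errors")"
	done
}

# A check is started (as root) with:
#   echo check > /sys/block/md_d0/md/sync_action
# and has finished once sync_action reads 'idle' again, after which:
#   report_errors /sys/block/md_d0/md
```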
* Re: detecting read errors after RAID1 check operation
  2007-08-15 22:03 detecting read errors after RAID1 check operation Mike Accetta
@ 2007-08-15 22:55 ` Neil Brown
  2007-08-16  3:27   ` Mike Accetta
  0 siblings, 1 reply; 6+ messages in thread

From: Neil Brown @ 2007-08-15 22:55 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid

On Wednesday August 15, maccetta@laurelnetworks.com wrote:
>
> There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
> which would be very convenient to consult but according to the kernel
> driver implementation the error counts reported there are apparently
> for corrected errors and not relevant for read errors during a "check"
> operation.

When 'check' hits a read error, an attempt is made to 'correct' it by
over-writing with correct data.  This will either increase the 'errors'
count or fail the drive completely.

What 'check' doesn't do (and 'repair' does) is react when it finds that
successful reads of all drives (in a raid1) do not match.

So just use the 'errors' number - it is exactly what you want.

NeilBrown
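[Neil's check/repair distinction is also visible from userspace: mismatches that 'check' finds but does not fix are accumulated in the array's mismatch_cnt file. A hedged sketch, reusing the md_d0 name from the post; the helper name is hypothetical:]

```shell
#!/bin/sh
# After a 'check' pass, a non-zero mismatch_cnt means all reads succeeded
# but the mirrors disagreed; 'repair' (not 'check') would rewrite those blocks.
check_mismatches() {   # $1 = sysfs md directory, e.g. /sys/block/md_d0/md
	n=$(cat "$1/mismatch_cnt")
	if [ "$n" -eq 0 ]; then
		echo "mirrors consistent"
	else
		echo "$n sectors mismatched during last check"
	fi
}
```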
* Re: detecting read errors after RAID1 check operation
  2007-08-15 22:55 ` Neil Brown
@ 2007-08-16  3:27   ` Mike Accetta
  2007-08-17  2:09     ` Neil Brown
  0 siblings, 1 reply; 6+ messages in thread

From: Mike Accetta @ 2007-08-16 3:27 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

Neil Brown writes:
> On Wednesday August 15, maccetta@laurelnetworks.com wrote:
> >
> > There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
> > which would be very convenient to consult but according to the kernel
> > driver implementation the error counts reported there are apparently
> > for corrected errors and not relevant for read errors during a "check"
> > operation.
> >
>
> When 'check' hits a read error, an attempt is made to 'correct' it by
> over-writing with correct data.  This will either increase the
> 'errors' count or fail the drive completely.
>
> What 'check' doesn't do (and 'repair' does) is react when it finds that
> successful reads of all drives (in a raid1) do not match.
>
> So just use the 'errors' number - it is exactly what you want.

This happens in our old friend sync_request_write()?  I'm dealing with
simulated errors and will dig further to make sure that is not perturbing
the results, but I don't see any 'errors' effect.  This is with our
patched 2.6.20 raid1.c.  The logic doesn't seem to be any different in
2.6.22 from what I can tell, though.

This fragment

	if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
		sbio->bi_end_io = NULL;
		rdev_dec_pending(conf->mirrors[i].rdev, mddev);
	} else {
		/* fixup the bio for reuse */
		...
	}

looks suspiciously like any correction attempt for 'check' is being
short-circuited to me, regardless of whether or not there was a read
error.  Actually, even if the rewrite was not being short-circuited,
I still don't see the path that would update 'corrected_errors' in this
case.  There are only two raid1.c sites that touch 'corrected_errors':
one is in fix_read_error() and the other is later in
sync_request_write().  With my limited understanding of how this all
works, neither of these paths would seem to apply here.
--
Mike Accetta

ECI Telecom Ltd.
Transport Networking Division, US (previously Laurel Networks)
* Re: detecting read errors after RAID1 check operation
  2007-08-16  3:27 ` Mike Accetta
@ 2007-08-17  2:09   ` Neil Brown
  2007-08-17 21:31     ` Mike Accetta
  0 siblings, 1 reply; 6+ messages in thread

From: Neil Brown @ 2007-08-17 2:09 UTC (permalink / raw)
To: Mike Accetta; +Cc: linux-raid

On Wednesday August 15, maccetta@laurelnetworks.com wrote:
> Neil Brown writes:
> > On Wednesday August 15, maccetta@laurelnetworks.com wrote:
> > >
> > > There are already files like /sys/block/md_d0/md/dev-sdb/errors in /sys
> > > which would be very convenient to consult but according to the kernel
> > > driver implementation the error counts reported there are apparently
> > > for corrected errors and not relevant for read errors during a "check"
> > > operation.
> > >
> >
> > When 'check' hits a read error, an attempt is made to 'correct' it by
> > over-writing with correct data.  This will either increase the
> > 'errors' count or fail the drive completely.
> >
> > What 'check' doesn't do (and 'repair' does) is react when it finds that
> > successful reads of all drives (in a raid1) do not match.
> >
> > So just use the 'errors' number - it is exactly what you want.
>
> This happens in our old friend sync_request_write()?  I'm dealing with

Yes, that would be the place.

> simulated errors and will dig further to make sure that is not perturbing
> the results but I don't see any 'errors' effect.  This is with our
> patched 2.6.20 raid1.c.  The logic doesn't seem to be any different in
> 2.6.22 from what I can tell, though.
>
> This fragment
>
> 	if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
> 		sbio->bi_end_io = NULL;
> 		rdev_dec_pending(conf->mirrors[i].rdev, mddev);
> 	} else {
> 		/* fixup the bio for reuse */
> 		...
> 	}
>
> looks suspiciously like any correction attempt for 'check' is being
> short-circuited to me, regardless of whether or not there was a read
> error.  Actually, even if the rewrite was not being short-circuited,
> I still don't see the path that would update 'corrected_errors' in this
> case.  There are only two raid1.c sites that touch 'corrected_errors':
> one is in fix_read_error() and the other is later in
> sync_request_write().  With my limited understanding of how this all
> works, neither of these paths would seem to apply here.

hmmm.... yes....
I guess I was thinking of the RAID5 code rather than the RAID1 code.
It doesn't do the right thing does it?
Maybe this patch is what we need.  I think it is right.

Thanks,
NeilBrown

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c	2007-08-16 10:29:58.000000000 +1000
+++ ./drivers/md/raid1.c	2007-08-17 12:07:35.000000000 +1000
@@ -1260,7 +1260,8 @@ static void sync_request_write(mddev_t *
 					j = 0;
 			if (j >= 0)
 				mddev->resync_mismatches += r1_bio->sectors;
-			if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
+			if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
+				      && text_bit(BIO_UPTODATE, &sbio->bi_flags))) {
 				sbio->bi_end_io = NULL;
 				rdev_dec_pending(conf->mirrors[i].rdev, mddev);
 			} else {
* Re: detecting read errors after RAID1 check operation
  2007-08-17  2:09 ` Neil Brown
@ 2007-08-17 21:31   ` Mike Accetta
  2007-08-25 14:27     ` Mike Snitzer
  0 siblings, 1 reply; 6+ messages in thread

From: Mike Accetta @ 2007-08-17 21:31 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

Neil Brown writes:
> On Wednesday August 15, maccetta@laurelnetworks.com wrote:
> > Neil Brown writes:
> > > On Wednesday August 15, maccetta@laurelnetworks.com wrote:
> > > >
> > ...
> > This happens in our old friend sync_request_write()?  I'm dealing with
>
> Yes, that would be the place.
>
> > ...
> > This fragment
> >
> > 	if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
> > 		sbio->bi_end_io = NULL;
> > 		rdev_dec_pending(conf->mirrors[i].rdev, mddev);
> > 	} else {
> > 		/* fixup the bio for reuse */
> > 		...
> > 	}
> >
> > looks suspiciously like any correction attempt for 'check' is being
> > short-circuited to me, regardless of whether or not there was a read
> > error.  Actually, even if the rewrite was not being short-circuited,
> > I still don't see the path that would update 'corrected_errors' in this
> > case.  There are only two raid1.c sites that touch 'corrected_errors':
> > one is in fix_read_error() and the other is later in sync_request_write().
> > With my limited understanding of how this all works, neither of these
> > paths would seem to apply here.
>
> hmmm.... yes....
> I guess I was thinking of the RAID5 code rather than the RAID1 code.
> It doesn't do the right thing does it?
> Maybe this patch is what we need.  I think it is right.
>
> Thanks,
> NeilBrown
>
>
> Signed-off-by: Neil Brown <neilb@suse.de>
>
> ### Diffstat output
>  ./drivers/md/raid1.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
> --- .prev/drivers/md/raid1.c	2007-08-16 10:29:58.000000000 +1000
> +++ ./drivers/md/raid1.c	2007-08-17 12:07:35.000000000 +1000
> @@ -1260,7 +1260,8 @@ static void sync_request_write(mddev_t *
>  					j = 0;
>  			if (j >= 0)
>  				mddev->resync_mismatches += r1_bio->sectors;
> -			if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
> +			if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
> +				      && text_bit(BIO_UPTODATE, &sbio->bi_flags))) {
>  				sbio->bi_end_io = NULL;
>  				rdev_dec_pending(conf->mirrors[i].rdev, mddev);
>  			} else {

I tried this (with the typo fixed) and it indeed issues a re-write.
However, it doesn't seem to do anything with the corrected errors count
if the rewrite succeeds.  Since end_sync_write() is only used in one
other place, when !In_sync, I tried the following and it seems to work
to get the error count updated.  I don't know whether this belongs in
end_sync_write(), but I'd think it needs to come after the write
actually succeeds, so that seems like the earliest it could be done.

--- BUILD/kernel/drivers/md/raid1.c	2007-06-04 13:52:42.000000000 -0400
+++ drivers/md/raid1.c	2007-08-17 16:52:14.219364000 -0400
@@ -1166,6 +1166,7 @@ static int end_sync_write(struct bio *bi
 	conf_t *conf = mddev_to_conf(mddev);
 	int i;
 	int mirror=0;
+	mdk_rdev_t *rdev;
 
 	if (bio->bi_size)
 		return 1;
@@ -1175,6 +1176,8 @@ static int end_sync_write(struct bio *bi
 			mirror = i;
 			break;
 		}
+
+	rdev = conf->mirrors[mirror].rdev;
 	if (!uptodate) {
 		int sync_blocks = 0;
 		sector_t s = r1_bio->sector;
@@ -1186,7 +1189,13 @@ static int end_sync_write(struct bio *bi
 			s += sync_blocks;
 			sectors_to_go -= sync_blocks;
 		} while (sectors_to_go > 0);
-		md_error(mddev, conf->mirrors[mirror].rdev);
+		md_error(mddev, rdev);
+	} else if (test_bit(In_sync, &rdev->flags)) {
+		/*
+		 * If we are currently in sync, this was a re-write to
+		 * correct a read error and we should account for it.
+		 */
+		atomic_add(r1_bio->sectors, &rdev->corrected_errors);
 	}
 
 	update_head_pos(mirror, r1_bio);
@@ -1251,7 +1260,8 @@ static void sync_request_write(mddev_t *
 			}
 			if (j >= 0)
 				mddev->resync_mismatches += r1_bio->sectors;
-			if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
+			if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
+				      && test_bit(BIO_UPTODATE, &sbio->bi_flags))) {
 				sbio->bi_end_io = NULL;
 				rdev_dec_pending(conf->mirrors[i].rdev, mddev);
 			} else {
--
Mike Accetta

ECI Telecom Ltd.
Transport Networking Division, US (previously Laurel Networks)
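[With a patch like the above in place, so that corrective rewrites actually bump the per-member counter, the alarm the original post asked for reduces to diffing the counters around a check run. A hedged sketch under the same assumed sysfs layout; the helper name is hypothetical, and `logger` stands in for whatever alerting mechanism is in use:]

```shell
#!/bin/sh
# Snapshot the per-member 'errors' counters; any increase across a 'check'
# run means a member needed a corrective rewrite and deserves attention.
snapshot_errors() {   # $1 = sysfs md directory, e.g. /sys/block/md_d0/md
	for dev in "$1"/dev-*; do
		echo "${dev##*/} $(cat "$dev/errors")"
	done
}

# Usage around a check run (as root), e.g. for md_d0:
#   before=$(snapshot_errors /sys/block/md_d0/md)
#   echo check > /sys/block/md_d0/md/sync_action
#   ... wait for sync_action to read 'idle' again ...
#   after=$(snapshot_errors /sys/block/md_d0/md)
#   [ "$before" != "$after" ] &&
#       logger -p daemon.alert "md_d0: read errors corrected during check"
```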
* Re: detecting read errors after RAID1 check operation
  2007-08-17 21:31 ` Mike Accetta
@ 2007-08-25 14:27   ` Mike Snitzer
  0 siblings, 0 replies; 6+ messages in thread

From: Mike Snitzer @ 2007-08-25 14:27 UTC (permalink / raw)
To: Neil Brown; +Cc: Mike Accetta, linux-raid

On 8/17/07, Mike Accetta <maccetta@laurelnetworks.com> wrote:
>
> Neil Brown writes:
> > On Wednesday August 15, maccetta@laurelnetworks.com wrote:
> > > Neil Brown writes:
> > > > On Wednesday August 15, maccetta@laurelnetworks.com wrote:
> > > > >
> > > ...
> > > This happens in our old friend sync_request_write()?  I'm dealing with
> >
> > Yes, that would be the place.
> >
> > > ...
> > > This fragment
> > >
> > > 	if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
> > > 		sbio->bi_end_io = NULL;
> > > 		rdev_dec_pending(conf->mirrors[i].rdev, mddev);
> > > 	} else {
> > > 		/* fixup the bio for reuse */
> > > 		...
> > > 	}
> > >
> > > looks suspiciously like any correction attempt for 'check' is being
> > > short-circuited to me, regardless of whether or not there was a read
> > > error.  Actually, even if the rewrite was not being short-circuited,
> > > I still don't see the path that would update 'corrected_errors' in this
> > > case.  There are only two raid1.c sites that touch 'corrected_errors':
> > > one is in fix_read_error() and the other is later in sync_request_write().
> > > With my limited understanding of how this all works, neither of these
> > > paths would seem to apply here.
> >
> > hmmm.... yes....
> > I guess I was thinking of the RAID5 code rather than the RAID1 code.
> > It doesn't do the right thing does it?
> > Maybe this patch is what we need.  I think it is right.
> >
> > Thanks,
> > NeilBrown
> >
> >
> > Signed-off-by: Neil Brown <neilb@suse.de>
> >
> > ### Diffstat output
> >  ./drivers/md/raid1.c |    3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
> > --- .prev/drivers/md/raid1.c	2007-08-16 10:29:58.000000000 +1000
> > +++ ./drivers/md/raid1.c	2007-08-17 12:07:35.000000000 +1000
> > @@ -1260,7 +1260,8 @@ static void sync_request_write(mddev_t *
> >  					j = 0;
> >  			if (j >= 0)
> >  				mddev->resync_mismatches += r1_bio->sectors;
> > -			if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
> > +			if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
> > +				      && text_bit(BIO_UPTODATE, &sbio->bi_flags))) {
> >  				sbio->bi_end_io = NULL;
> >  				rdev_dec_pending(conf->mirrors[i].rdev, mddev);
> >  			} else {
>
> I tried this (with the typo fixed) and it indeed issues a re-write.
> However, it doesn't seem to do anything with the corrected errors
> count if the rewrite succeeds.  Since end_sync_write() is only used in one
> other place, when !In_sync, I tried the following and it seems to work
> to get the error count updated.  I don't know whether this belongs in
> end_sync_write(), but I'd think it needs to come after the write actually
> succeeds, so that seems like the earliest it could be done.

Neil,

Any feedback on Mike's patch?

thanks,
Mike
end of thread, other threads:[~2007-08-25 14:27 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-08-15 22:03 detecting read errors after RAID1 check operation Mike Accetta
2007-08-15 22:55 ` Neil Brown
2007-08-16  3:27   ` Mike Accetta
2007-08-17  2:09     ` Neil Brown
2007-08-17 21:31       ` Mike Accetta
2007-08-25 14:27         ` Mike Snitzer