* Time spent waiting for uncorrectable errors
@ 2004-02-16 8:21 Alex Goller
2004-02-19 15:13 ` Hans-Peter Jansen
0 siblings, 1 reply; 4+ messages in thread
From: Alex Goller @ 2004-02-16 8:21 UTC (permalink / raw)
To: linux-ide
Hi,
Is there any data on how long current disks try to read a
sector before giving up with an uncorrectable error? The problem is
that I have no reliable way to reproduce the error. I will try to read
from a hopefully (!) broken disk this afternoon and measure the
time spent on that.
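One way to take such a measurement is to time a single read with a monotonic clock. The following is only a minimal sketch: the helper name `time_one_read` is hypothetical, and it assumes a POSIX system with `pread()` and `clock_gettime()`. It goes through the page cache, so on a real block device the kernel's readahead and its own retries are included in the measured time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): time one pread() of `len`
 * bytes at byte offset `offset` of `path`.
 *
 * Returns the elapsed wall-clock time in seconds, or -1.0 if the file
 * could not be opened or the buffer could not be allocated.
 * *read_errno is set to 0 if the read succeeded, otherwise to the errno
 * value of the failing call (typically EIO for a medium error).
 */
double time_one_read(const char *path, off_t offset, size_t len,
                     int *read_errno)
{
    struct timespec t0, t1;
    char *buf;
    ssize_t n;
    int fd;

    assert(path != NULL && read_errno != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        *read_errno = errno;
        return -1.0;
    }

    buf = malloc(len ? len : 1);
    if (buf == NULL) {
        *read_errno = ENOMEM;
        close(fd);
        return -1.0;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    n = pread(fd, buf, len, offset);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *read_errno = (n < 0) ? errno : 0;

    free(buf);
    close(fd);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling it with the device node, the byte offset of a suspect sector and a length of 512 would give the time for one request, together with the error code the kernel finally reported.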
bye, alex
--
alexander goller alex@vivien.franken.de
* Re: Time spent waiting for uncorrectable errors
From: Hans-Peter Jansen @ 2004-02-19 15:13 UTC (permalink / raw)
To: Alex Goller, linux-ide
On Monday 16 February 2004 09:21, Alex Goller wrote:
> Hi,
>
> Is there any data on how long current disks try to read a
> sector before giving up with an uncorrectable error? The problem is
> that I have no reliable way to reproduce the error. I will try to
> read from a hopefully (!) broken disk this afternoon and measure
> the time spent on that.
IIRC, the kernel tries to read a defective block exactly 8 times.
The problem is (according to some Maxtor guy) that a drive which
returns a hard sector error has already tried to read it internally
a few thousand times (~2650), so the 8 kernel attempts result in
about 21200 physical retries (8 x 2650).
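The arithmetic behind that figure can be written down as a short sketch. Both constants are just the recollections quoted in this message (8 kernel attempts, roughly 2650 internal drive retries), not measured or documented values, and the helper name is hypothetical.

```c
#include <assert.h>

/* Figures as recalled in the thread; treat both as assumptions. */
enum {
    KERNEL_ATTEMPTS        = 8,    /* kernel reads of one bad block (IIRC) */
    DRIVE_INTERNAL_RETRIES = 2650  /* drive-internal retries per read (approx.) */
};

/* Hypothetical helper: physical read attempts spent on one bad sector. */
unsigned long total_physical_retries(unsigned kernel_attempts,
                                     unsigned drive_retries)
{
    return (unsigned long)kernel_attempts * drive_retries;
}

/* Usage: reproduces the number quoted in the thread. */
void check_thread_figure(void)
{
    assert(total_physical_retries(KERNEL_ATTEMPTS,
                                  DRIVE_INTERNAL_RETRIES) == 21200UL);
}
```

Every kernel-level retry repeats the whole drive-internal retry loop, which is why the two counts multiply rather than add.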
Unfortunately this renders an unpatched linux kernel useless for data
recovery tasks. Well, not useless in general, but you simply need a
_lot_ of patience.
The last time I did this myself, it took about 40 hours to copy a
defective 80 GB HD with dd_rescue. Fortunately, the damage was in
some mpeg2 streams, which are quite robust in handling long runs of
zeros ;-)..
Pete
* Re: Time spent waiting for uncorrectable errors
From: Bartlomiej Zolnierkiewicz @ 2004-02-19 15:39 UTC (permalink / raw)
To: Hans-Peter Jansen; +Cc: Alex Goller, linux-ide
On Thursday 19 of February 2004 16:13, Hans-Peter Jansen wrote:
> On Monday 16 February 2004 09:21, Alex Goller wrote:
> > Hi,
> >
> > Is there any data on how long current disks try to read a
> > sector before giving up with an uncorrectable error? The problem is
> > that I have no reliable way to reproduce the error. I will try to
> > read from a hopefully (!) broken disk this afternoon and measure
> > the time spent on that.
>
> IIRC, the kernel tries to read a defective block exactly 8 times.
>
> The problem is (according to some Maxtor guy) that a drive which
> returns a hard sector error has already tried to read it internally
> a few thousand times (~2650), so the 8 kernel attempts result in
> about 21200 physical retries (8 x 2650).
>
> Unfortunately this renders an unpatched linux kernel useless for data
> recovery tasks. Well, not useless in general, but you simply need a
> _lot_ of patience.
So where are the patches? :-)
* Re: Time spent waiting for uncorrectable errors
From: Hans-Peter Jansen @ 2004-02-19 16:30 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz; +Cc: Alex Goller, linux-ide
On Thursday 19 February 2004 16:39, Bartlomiej Zolnierkiewicz wrote:
> On Thursday 19 of February 2004 16:13, Hans-Peter Jansen wrote:
> >
> > IIRC, the kernel tries to read a defective block exactly 8 times.
> >
> > The problem is (according to some Maxtor guy) that a drive which
> > returns a hard sector error has already tried to read it
> > internally a few thousand times (~2650), so the 8 kernel attempts
> > result in about 21200 physical retries (8 x 2650).
> >
> > Unfortunately this renders an unpatched linux kernel useless for
> > data recovery tasks. Well, not useless in general, but you simply
> > need a _lot_ of patience.
>
> So where are the patches? :-)
First of all, I'm still on 2.4 here. Sorry..
When I looked into this last autumn, I discovered several problems:
- CRC errors are also used to trigger the PIO fallback
- IDE CRC error handling is scattered over several modules for disk/
cdrom/ide-scsi
- medium and transport errors (cable...) need to be differentiated
correctly
and I'm a real dummy in those matters.
Here's something I hacked up for internal use only. Please don't
consume with empty/full stomach:
--- linux-x/include/linux/ide.h 2003-09-23 21:13:31.000000000 +0200
+++ linux/include/linux/ide.h 2003-10-21 18:06:25.000000000 +0200
@@ -793,6 +793,9 @@
int forced_lun; /* if hdxlun was given at boot */
int lun; /* logical unit */
int crc_count; /* crc counter to reduce drive speed */
+ int crc_err; /* count crc errors */
+ u8 fail_on_crc_err; /* fail on all crc errors, will prevent */
+ /* retries and automatic speed reduce */
char special_buf[4]; /* IDE_DRIVE_CMD, free use */
} ide_drive_t;
--- linux-x/drivers/ide/ide-disk.c 2003-09-23 20:03:19.000000000 +0200
+++ linux/drivers/ide/ide-disk.c 2003-10-21 18:00:55.000000000 +0200
@@ -905,9 +905,21 @@
hwif->INB(IDE_COMMAND_REG) == WIN_SPECIFY)
return ide_stopped;
} else if ((err & BAD_CRC) == BAD_CRC) {
- /* UDMA crc error, just retry the operation */
- drive->crc_count++;
+ if (drive->crc_err++ == 0x7fffffff)
+ /* we have > MAX_INT-1 errors, stop counting
+ to avoid a wrap around */
+ drive->crc_err--;
+ if (drive->fail_on_crc_err)
+ /* no retries */
+ rq->errors = ERROR_MAX;
+ else
+ /* UDMA crc error, just retry the operation */
+ drive->crc_count++;
} else if (err & (BBD_ERR | ECC_ERR)) {
+ if (drive->crc_err++ == 0x7fffffff)
+ /* we have > MAX_INT-1 errors, stop counting
+ to avoid a wrap around */
+ drive->crc_err--;
/* retries won't help these */
rq->errors = ERROR_MAX;
} else if (err & TRK0_ERR) {
@@ -1565,6 +1577,8 @@
ide_add_setting(drive, "acoustic", SETTING_RW, HDIO_GET_ACOUSTIC, HDIO_SET_ACOUSTIC, TYPE_BYTE, 0, 254, 1, 1, &drive->acoustic, set_acoustic);
ide_add_setting(drive, "failures", SETTING_RW, -1, -1, TYPE_INT, 0, 65535, 1, 1, &drive->failures, NULL);
ide_add_setting(drive, "max_failures", SETTING_RW, -1, -1, TYPE_INT, 0, 65535, 1, 1, &drive->max_failures, NULL);
+ ide_add_setting(drive, "crc_err", SETTING_READ, -1, -1, TYPE_INT, 0, 0x7fffffff, 1, 1, &drive->crc_err, NULL);
+ ide_add_setting(drive, "fail_on_crc_err", SETTING_RW, -1, -1, TYPE_BYTE, 0, 1, 1, 1, &drive->fail_on_crc_err, NULL);
}
static int idedisk_ioctl (ide_drive_t *drive, struct inode *inode,
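A side note on the counter handling in the patch above: `crc_err++ == 0x7fffffff` followed by `crc_err--` briefly steps the signed counter past its maximum, which ISO C leaves undefined. A saturating increment that checks before adding avoids that. This is only a sketch of the idea; the helper name `saturating_inc` is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper (not part of the patch): increment an error
 * counter, but stay at INT_MAX instead of overflowing.
 */
int saturating_inc(int count)
{
    return (count < INT_MAX) ? count + 1 : INT_MAX;
}

/* Usage: the counter grows normally and then sticks at INT_MAX. */
void saturating_inc_example(void)
{
    int crc_err = 0;

    crc_err = saturating_inc(crc_err);
    assert(crc_err == 1);

    crc_err = INT_MAX;
    crc_err = saturating_inc(crc_err);
    assert(crc_err == INT_MAX);
}
```

In the driver this would read something like `drive->crc_err = saturating_inc(drive->crc_err);` in both error branches, assuming the field stays a plain `int` as in the patch.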
Here's an equally butt ugly attempt for ide-scsi: