Faulty drive data recovery

* Faulty drive data recovery
@ 2010-11-24 14:04 Atila
  2010-11-24 16:03 ` Artem Bokhan
  0 siblings, 1 reply; 5+ messages in thread
From: Atila @ 2010-11-24 14:04 UTC (permalink / raw)
  To: linux-ide

Hi, I'm trying to recover data from a damaged hard disk that has
plenty of bad sectors but also many good ones. The problem is that
when a bad sector is found, the drive keeps trying to read it instead
of giving up and moving on, so the average read rate is around
5Kb/s. At that rate, it will take more than a year to finish. Since
I'm using GNU ddrescue (which logs bad sectors so one can try them
again later), my goal is not to waste time on errors, leaving the
retries to a second round.
So, my first attempt was to drastically lower the timeouts in
libata-eh.c. It seems to have improved things a little, but I'm still
not getting more than 12Kb/s.
Is there any way to minimize retries and make errors finish faster?

Atila Romero


* Re: Faulty drive data recovery
  2010-11-24 14:04 Faulty drive data recovery Atila
@ 2010-11-24 16:03 ` Artem Bokhan
       [not found]   ` <AANLkTikhpPg4XMiDs8B7Jq=YC-xLYtemsarW5=EADkQM@mail.gmail.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Artem Bokhan @ 2010-11-24 16:03 UTC (permalink / raw)
  To: Atila; +Cc: linux-ide

  find /sys/devices -name timeout

/sys/devices/pci0000:00/0000:00:1f.1/host0/target0:0:0/0:0:0:0/timeout
/sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/timeout
/sys/devices/pci0000:00/0000:00:1f.2/host3/target3:0:0/3:0:0:0/timeout

As I remember, some time ago 3 retries were hardcoded somewhere in the sources.
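
A minimal sketch of how those sysfs files could be used, assuming they are
the per-device SCSI command timeouts (in seconds) and that the script runs
as root. The helper names find_timeout_files, get_timeout and set_timeout
are hypothetical, chosen only for this illustration. Other unrelated files
named "timeout" may exist under /sys/devices, so pick the one that belongs
to the failing disk rather than writing to all of them.

```python
import os


def find_timeout_files(root="/sys/devices"):
    """Return the paths of all files named 'timeout' below root, sorted."""
    found = []
    # os.walk does not follow symlinks by default, which avoids sysfs loops.
    for dirpath, _dirnames, filenames in os.walk(root):
        if "timeout" in filenames:
            found.append(os.path.join(dirpath, "timeout"))
    return sorted(found)


def get_timeout(path):
    """Read the current timeout value from one sysfs file."""
    with open(path) as f:
        return int(f.read().strip())


def set_timeout(path, seconds):
    """Write a new timeout value to one sysfs file."""
    with open(path, "w") as f:
        f.write(f"{int(seconds)}\n")


if __name__ == "__main__":
    # Only list the candidates; changing one is left to the caller.
    for path in find_timeout_files():
        print(path, get_timeout(path))
```

Calling set_timeout() on the path of the damaged disk with a small value
makes the kernel give up on a stuck command sooner. It does not change how
long the drive itself keeps retrying internally.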





* Re: Faulty drive data recovery
       [not found]   ` <AANLkTikhpPg4XMiDs8B7Jq=YC-xLYtemsarW5=EADkQM@mail.gmail.com>
@ 2010-11-24 17:00     ` Greg Freemyer
  2010-11-26 17:29       ` Tejun Heo
  0 siblings, 1 reply; 5+ messages in thread
From: Greg Freemyer @ 2010-11-24 17:00 UTC (permalink / raw)
  To: Artem Bokhan; +Cc: Atila, linux-ide

resend

Those sysfs timeout files are kernel timeout/retry parameters. While
they are helpful to adjust, you also need to tell the drive itself not
to retry, and/or reduce the drive's internal retry timeout, which can
be as long as 2 minutes as shipped from the factory.

So getting rid of the kernel retries will still leave you with an
exceptionally slow ddrescue if you're stuck with the drive's default
retry logic.

But you may be lucky: because drives with a 2-minute retry timeout are
unacceptable in the RAID world, progress has been made in making the
timeout end-user controllable.

Western Digital first came out with TLER as a way to control that
internal drive retry logic a few years ago.

The ATA-8 spec. incorporated it and called it ERC (Error Recovery Control).

Unfortunately, not many drives actually support adjusting TLER/ERC.

And for those few that do, as far as I know Mark Lord hasn't (yet?)
incorporated setting either into hdparm.
So your only way to adjust it is a very recent version of smartctl.

I think smartctl handles both the original TLER capability and the new
ERC capability.  (They may be the same, but just with different
names.)

Here's an end-user message about it:

http://markmail.org/message/vnb2ozh3fbmxgyrt
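
As a rough sketch of the smartctl route, assuming a smartctl build that
has the "-l scterc,READTIME,WRITETIME" option (times in units of 100 ms)
and a drive that accepts the request: the helper names scterc_command and
set_scterc, the device path and the value 70 (7 seconds) are illustrative
assumptions, not recommendations.

```python
import subprocess


def scterc_command(device, read_ds, write_ds):
    """Build the smartctl argv that requests SCT ERC limits (deciseconds)."""
    return ["smartctl", "-l", f"scterc,{int(read_ds)},{int(write_ds)}", device]


def set_scterc(device, read_ds=70, write_ds=70):
    """Run smartctl and return (exit_code, combined output).

    Raises FileNotFoundError if smartctl is not installed. A drive without
    SCT ERC support is expected to report that in the output.
    """
    result = subprocess.run(
        scterc_command(device, read_ds, write_ds),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

For example, set_scterc("/dev/sdX") asks the drive to stop its own read
and write recovery after 7 seconds. Check the returned output, because
many drives simply do not support the command.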

Good Luck
Greg


* Re: Faulty drive data recovery
  2010-11-24 17:00     ` Greg Freemyer
@ 2010-11-26 17:29       ` Tejun Heo
  2010-11-26 18:45         ` Atila
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2010-11-26 17:29 UTC (permalink / raw)
  To: Greg Freemyer; +Cc: Artem Bokhan, Atila, linux-ide

Hello,

You can directly issue r/w commands using SGIO where you can control
retry and timeout explicitly.  Hmm... it might be a good idea to allow
userland to set FAILFAST bit on a block device?
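
A sketch of what issuing a read through SG_IO with an explicit timeout
could look like, assuming a 64-bit or 32-bit Linux host, a device that
accepts SCSI READ(10), and 512-byte logical blocks. The structure layout
and constants are taken from <scsi/sg.h>. The names SgIoHdr, read10_cdb and
read_blocks, and the default timeout_ms value, are hypothetical and exist
only for this illustration. No retry loop is written here, so any retry
policy is left to the caller.

```python
import ctypes
import fcntl
import os

SG_IO = 0x2285          # ioctl request number from <scsi/sg.h>
SG_DXFER_FROM_DEV = -3  # data flows from the device to the host


class SgIoHdr(ctypes.Structure):
    """Mirror of struct sg_io_hdr from <scsi/sg.h>."""
    _fields_ = [
        ("interface_id", ctypes.c_int),
        ("dxfer_direction", ctypes.c_int),
        ("cmd_len", ctypes.c_ubyte),
        ("mx_sb_len", ctypes.c_ubyte),
        ("iovec_count", ctypes.c_ushort),
        ("dxfer_len", ctypes.c_uint),
        ("dxferp", ctypes.c_void_p),
        ("cmdp", ctypes.c_void_p),
        ("sbp", ctypes.c_void_p),
        ("timeout", ctypes.c_uint),  # milliseconds
        ("flags", ctypes.c_uint),
        ("pack_id", ctypes.c_int),
        ("usr_ptr", ctypes.c_void_p),
        ("status", ctypes.c_ubyte),
        ("masked_status", ctypes.c_ubyte),
        ("msg_status", ctypes.c_ubyte),
        ("sb_len_wr", ctypes.c_ubyte),
        ("host_status", ctypes.c_ushort),
        ("driver_status", ctypes.c_ushort),
        ("resid", ctypes.c_int),
        ("duration", ctypes.c_uint),
        ("info", ctypes.c_uint),
    ]


def read10_cdb(lba, blocks):
    """Build the 10-byte READ(10) command: opcode 0x28, 32-bit LBA, 16-bit count."""
    return (bytes([0x28, 0]) + lba.to_bytes(4, "big") + bytes([0])
            + blocks.to_bytes(2, "big") + bytes([0]))


def read_blocks(device, lba, blocks, timeout_ms=2000, block_size=512):
    """Issue one READ(10) via SG_IO; return the data, or None on any error status."""
    cdb = (ctypes.c_ubyte * 10)(*read10_cdb(lba, blocks))
    data = ctypes.create_string_buffer(blocks * block_size)
    sense = ctypes.create_string_buffer(32)

    hdr = SgIoHdr()
    hdr.interface_id = ord("S")
    hdr.dxfer_direction = SG_DXFER_FROM_DEV
    hdr.cmd_len = 10
    hdr.mx_sb_len = 32
    hdr.dxfer_len = blocks * block_size
    hdr.dxferp = ctypes.addressof(data)
    hdr.cmdp = ctypes.addressof(cdb)
    hdr.sbp = ctypes.addressof(sense)
    hdr.timeout = timeout_ms

    fd = os.open(device, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, SG_IO, hdr)  # hdr is updated in place by the kernel
    finally:
        os.close(fd)

    # Treat any non-zero SCSI, host or driver status as a failed read.
    if hdr.status or hdr.host_status or hdr.driver_status:
        return None
    return data.raw
```

A recovery loop could call read_blocks() over small LBA ranges, record
every range that returns None, and revisit those ranges in a later pass.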

-- 
tejun


* Re: Faulty drive data recovery
  2010-11-26 17:29       ` Tejun Heo
@ 2010-11-26 18:45         ` Atila
  0 siblings, 0 replies; 5+ messages in thread
From: Atila @ 2010-11-26 18:45 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Greg Freemyer, Artem Bokhan, linux-ide

On 26/11/2010 15:29, Tejun Heo wrote:
> You can directly issue r/w commands using SGIO where you can control
> retry and timeout explicitly.  Hmm... it might be a good idea to allow
> userland to set FAILFAST bit on a block device?

Sounds like a great idea to me. I think it would become popular in
forensics.

I followed Artem Bokhan's tip on timeout and tried Greg Freemyer's on
smartctl. Unfortunately this particular drive is only ATA-7, so scterc
wasn't available. But it is definitely good to know.

I have never played with SGIO before, but there is plenty of time to
learn to use it before ddrescue can finish copying. :)

Thank you all,

Atila

