* Re: Problem: IDE data corruption with VIA chipsets on2.4.20-19.8+others
@ 2003-09-11 18:20 Eric Bickle
  2003-09-11 18:43 ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Bickle @ 2003-09-11 18:20 UTC
  To: linux-kernel

> > kernel: hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > kernel: hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=150637065, sector=150636992
>
> This is a physical failure from the hard disk *NOT* a Linux problem

That's exactly what I thought when I first saw the problem as well.

However, we had about 16-20 different drives show up with the problem, about
3 different brands too. I did some low-level tests on the drives that Linux
had an error on and none of my diagnostic tools could find any problems.

Any ideas?

Thanks.
-Eric Bickle



* Re: Problem: IDE data corruption with VIA chipsets on2.4.20-19.8+others
  2003-09-11 18:20 Eric Bickle
@ 2003-09-11 18:43 ` Alan Cox
  2003-09-12  8:14   ` Rogier Wolff
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Cox @ 2003-09-11 18:43 UTC
  To: Eric Bickle; +Cc: Linux Kernel Mailing List

On Iau, 2003-09-11 at 19:20, Eric Bickle wrote:
> > > kernel: hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > > kernel: hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=150637065, sector=150636992
> >
> > This is a physical failure from the hard disk *NOT* a Linux problem
> 
> That's exactly what I thought when I first saw the problem as well.
> 
> However, we had about 16-20 different drives show up with the problem, about
> 3 different brands too. I did some low-level tests on the drives that Linux
> had an error on and none of my diagnostic tools could find any problems.
> 
> Any ideas?

Other than to tell you Linux is simply reporting back what the drive
itself reported, which is a physical failure to recover a sector of
data, no.
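
For readers decoding these register values by hand, a minimal Python sketch of the bit layout follows. The names for 0x40/0x10/0x01 in the status register and for 0x40 in the error register are taken from the log lines quoted above; the remaining bit names are filled in from the ATA register layout and should be treated as assumptions. The function names are hypothetical.

```python
# Bit names follow the strings seen in the kernel log above; bits that do
# not appear in this thread are filled in from the ATA register layout and
# should be treated as assumptions.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

ERROR_BITS = [
    (0x80, "BadSector"),
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of all bits in `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


def decode_status(value):
    """Decode an IDE status register value into bit names."""
    # When Busy is set the other status bits are not meaningful.
    if value & 0x80:
        return ["Busy"]
    return decode(value, STATUS_BITS)


def decode_error(value):
    """Decode an IDE error register value into bit names."""
    return decode(value, ERROR_BITS)


if __name__ == "__main__":
    print(decode_status(0x51))
    print(decode_error(0x40))
```

Read this way, status=0x51 is just three flags OR-ed together (0x40 | 0x10 | 0x01), and error=0x40 is the single flag the drive sets when it cannot recover a sector.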

A test that rewrites such a sector will generally clear the error; it's
one of the problems of some diagnostic tools. A pure read test should
find the error again unless it's something like overheating causing the
problem. SMART data will tell you drive temperatures.
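
A pure read test of the kind described here could be sketched in Python roughly as follows. This is an illustration only, not a replacement for a proper diagnostic tool: it assumes 512-byte sectors, the names `read_scan` and `lba_to_offset` are hypothetical, and it makes no attempt to bypass the page cache (a real test would need something like O_DIRECT so that reads actually reach the drive).

```python
import os

SECTOR_SIZE = 512  # assumed logical sector size


def lba_to_offset(lba, sector_size=SECTOR_SIZE):
    """Byte offset of a logical block address."""
    return lba * sector_size


def read_scan(path, start_lba=0, count=None, sector_size=SECTOR_SIZE):
    """Read sectors one at a time without writing; return LBAs that failed."""
    bad = []
    fd = os.open(path, os.O_RDONLY)  # read-only: never rewrites a sector
    try:
        lba = start_lba
        while count is None or lba < start_lba + count:
            try:
                data = os.pread(fd, sector_size, lba_to_offset(lba, sector_size))
            except OSError:
                bad.append(lba)  # the device reported an I/O error here
                lba += 1
                continue
            if not data:  # end of device or file
                break
            lba += 1
    finally:
        os.close(fd)
    return bad
```

A hypothetical call such as `read_scan("/dev/hdc", start_lba=150636992, count=128)` would cover the region named in the log above. Because nothing is written, a sector the drive cannot recover should keep failing on each pass.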



* Re: Problem: IDE data corruption with VIA chipsets on2.4.20-19.8+others
@ 2003-09-11 19:11 Eric Bickle
  0 siblings, 0 replies; 7+ messages in thread
From: Eric Bickle @ 2003-09-11 19:11 UTC
  To: linux-kernel

> Other than to tell you Linux is simply reporting back what the drive
> itself reported, which is a physical failure to recover a sector of
> data, no.
>
> A test that rewrites such a sector will generally clear the error; it's
> one of the problems of some diagnostic tools. A pure read test should
> find the error again unless it's something like overheating causing the
> problem. SMART data will tell you drive temperatures.


Thanks for the info, I'll try to dig up some better diagnostic tools. I
definitely appreciate the quick response!

Thanks again,
-Eric Bickle



* Re: Problem: IDE data corruption with VIA chipsets on2.4.20-19.8+others
  2003-09-11 18:43 ` Alan Cox
@ 2003-09-12  8:14   ` Rogier Wolff
  2003-09-12 10:44     ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Rogier Wolff @ 2003-09-12  8:14 UTC
  To: Alan Cox; +Cc: Eric Bickle, Linux Kernel Mailing List

On Thu, Sep 11, 2003 at 07:43:33PM +0100, Alan Cox wrote:
> On Iau, 2003-09-11 at 19:20, Eric Bickle wrote:
> > > > kernel: hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> > > > kernel: hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=150637065, sector=150636992

> A test that rewrites such a sector will generally clear the error; it's
> one of the problems of some diagnostic tools. A pure read test should
> find the error again unless it's something like overheating causing the
> problem. SMART data will tell you drive temperatures.

Some drives don't have the sensor for that. :-(

Anyway, speaking about SMART, some "smartd" was interfering with
normal operation on one of our systems and we saw similar "nasty"
stuff on that system until I removed "smartd". 

Aug 10 06:54:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
Aug 10 06:54:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
Aug 10 06:54:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
Aug 10 06:54:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
Aug 10 07:24:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
Aug 10 07:24:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
Aug 10 07:24:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
Aug 10 07:24:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
Aug 10 08:24:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
Aug 10 08:24:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
Aug 10 08:24:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
Aug 10 08:24:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
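
In the excerpt above, both drives log the same error in the same second, at 06:54, 07:24 and 08:24, which looks more like a periodic poller than random media errors. A small Python sketch for grouping such lines by timestamp is given below; the regular expression is an assumption fitted to the syslog format shown here, and `group_by_time` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Aug 10 06:54:25 falbala kernel: hda: drive_cmd: status=0x51 { ... }
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"(?P<dev>hd[a-z]):\s+(?P<func>\w+):\s+"
    r"(?P<reg>status|error)=0x(?P<val>[0-9a-fA-F]+)"
)


def group_by_time(lines):
    """Map each timestamp to the sorted (device, register, value) tuples seen."""
    events = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            events[m["stamp"]].append((m["dev"], m["reg"], int(m["val"], 16)))
    # Lines that do not match the pattern are silently skipped.
    return {stamp: sorted(items) for stamp, items in events.items()}
```

Feeding the log lines to `group_by_time` yields one entry per timestamp, each listing hda and hdb together, which makes the regular spacing of the events easy to see.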

Linux version 2.4.19-rc1 (root@zurix) (gcc version 2.95.4 20011002 (Debian prerelease)) #1 Mon Jul 8 15:37:19 CEST 2002


		Roger. 

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
**** "Linux is like a wigwam -  no windows, no gates, apache inside!" ****


* Re: Problem: IDE data corruption with VIA chipsets on2.4.20-19.8+others
  2003-09-12  8:14   ` Rogier Wolff
@ 2003-09-12 10:44     ` Alan Cox
  0 siblings, 0 replies; 7+ messages in thread
From: Alan Cox @ 2003-09-12 10:44 UTC
  To: Rogier Wolff; +Cc: Eric Bickle, Linux Kernel Mailing List

On Gwe, 2003-09-12 at 09:14, Rogier Wolff wrote:
> Aug 10 06:54:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> Aug 10 06:54:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }

Drive rejecting a command. Looks like smartd asked the drive to do
something it didn't support, which it really should not be doing.




* Re: Problem: IDE data corruption with VIA chipsets on2.4.20-19.8+others
@ 2003-09-12 11:19 Roman Kagan
  0 siblings, 0 replies; 7+ messages in thread
From: Roman Kagan @ 2003-09-12 11:19 UTC
  To: linux-kernel; +Cc: Rogier Wolff

On Fri, Sep 12, 2003 at 04:14:54AM +0000, Rogier Wolff wrote:
> Anyway, speaking about SMART, some "smartd" was interfering with
> normal operation on one of our systems and we saw similar "nasty"
> stuff on that system until I removed "smartd". 
> 
> Aug 10 06:54:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> Aug 10 06:54:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
> Aug 10 06:54:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> Aug 10 06:54:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
> Aug 10 07:24:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> Aug 10 07:24:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
> Aug 10 07:24:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> Aug 10 07:24:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
> Aug 10 08:24:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> Aug 10 08:24:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
> Aug 10 08:24:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> Aug 10 08:24:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }

You probably have SMART disabled on those drives by BIOS, and smartd is
not smart enough to enable it before trying to use it so the drives
complain.  I had the same problem on my GigaByte mobo where the BIOS
setup didn't even provide an option to turn SMART on (like earlier Award
BIOSes did).

Check with smartctl -i /dev/hdX.  Enable with smartctl -e /dev/hdX
_before_ starting smartd.
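
The check-then-enable sequence could be scripted along the following lines. This is a sketch under assumptions: it targets a current smartmontools, where the enable switch is `-s on` (the `-e` above appears to be an older spelling), and it assumes that `smartctl -i` prints a line beginning "SMART support is:" that ends in "Enabled" when SMART is on. The function names are hypothetical.

```python
import subprocess


def smart_is_enabled(info_text):
    """Guess from `smartctl -i` output whether SMART is switched on."""
    for line in info_text.splitlines():
        # Assumed output format; adjust if your smartctl prints differently.
        if line.startswith("SMART support is:") and "Enabled" in line:
            return True
    return False


def ensure_smart_enabled(device, run=subprocess.run):
    """Enable SMART on `device` unless `smartctl -i` already reports it enabled.

    Returns True if an enable command was issued, False otherwise.
    """
    info = run(["smartctl", "-i", device], capture_output=True, text=True)
    if smart_is_enabled(info.stdout):
        return False  # nothing to do
    # Assumes the modern `-s on` spelling rather than the older `-e`.
    run(["smartctl", "-s", "on", device], check=True)
    return True
```

Calling `ensure_smart_enabled("/dev/hda")` for each drive before smartd starts would cover the case where the BIOS leaves SMART disabled. The `run` parameter exists only so the function can be exercised without touching real hardware.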

Sorry for OT.

  Roman.


* Re: Problem: IDE data corruption with VIA chipsets on2.4.20-19.8+others
@ 2003-09-12 11:45 John Bradford
  0 siblings, 0 replies; 7+ messages in thread
From: John Bradford @ 2003-09-12 11:45 UTC
  To: linux-kernel, Roman.Kagan; +Cc: R.E.Wolff

> > Anyway, speaking about SMART, some "smartd" was interfering with
> > normal operation on one of our systems and we saw similar "nasty"
> > stuff on that system until I removed "smartd". 
> > 
> > Aug 10 06:54:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> > Aug 10 06:54:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
> > Aug 10 06:54:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> > Aug 10 06:54:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
> > Aug 10 07:24:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> > Aug 10 07:24:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
> > Aug 10 07:24:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> > Aug 10 07:24:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
> > Aug 10 08:24:25 falbala kernel: hda: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> > Aug 10 08:24:25 falbala kernel: hda: drive_cmd: error=0x04 { DriveStatusError }
> > Aug 10 08:24:25 falbala kernel: hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
> > Aug 10 08:24:25 falbala kernel: hdb: drive_cmd: error=0x04 { DriveStatusError }
>
> You probably have SMART disabled on those drives by BIOS, and smartd is
> not smart enough to enable it before trying to use it so the drives
> complain.

Quite possible.

> I had the same problem on my GigaByte mobo where the BIOS
> setup didn't even provide an option to turn SMART on (like earlier Award
> BIOSes did).

For some reason, both of my Gigabyte GA-7VA motherboards seem to
disable SMART when I reboot.

> Check with smartctl -i /dev/hdX.  Enable with smartctl -e /dev/hdX
> _before_ starting smartd.

You may need to use smartctl -e /dev/hdX every time you boot.

John.

