regarding bug #5914 - fs corruption on SATA

linux-ide.vger.kernel.org archive mirror

* regarding bug #5914 - fs corruption on SATA
@ 2006-01-26  5:50 Tejun Heo
  2006-01-26  5:51 ` Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Tejun Heo @ 2006-01-26  5:50 UTC (permalink / raw)
  To: Nicolas.Mailhot; +Cc: Jeff Garzik, Jens Axboe, Linux-ide

Hello, Nicolas.  Hello, all.

Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
(forced unit access) thing made it into the mainline recently, and it
seems that your drive reports FUA support but doesn't really handle it
properly when asked to.

Can you try the following to verify the problem?

1. make a small partition on the affected drive and do mkfs.ext3 on it.
2. mount -o barrier new_partition /mnt/tmp
3. cd /mnt/tmp; touch asdf; sync

This should give something like the following.

======
ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
sd 2:0:0:0: SCSI error: return code = 0x8000002
sdc: Current: sense key: Aborted Command
    Additional sense: No additional sense information
end_request: I/O error, dev sdc, sector 4359
Buffer I/O error on device sdc1, logical block 537
lost page write due to I/O error on sdc1
Aborting journal on device sdc1.
journal commit I/O error
======

The ext3 fs will back off and won't use any barrier from this point.
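
For reading the log above: status=0x51 and error=0x04 are bit fields.
Below is a minimal sketch in C of how they decode, assuming the standard
ATA status/error register layout.  The helper names describe_status()
and command_was_aborted() are made up for illustration and are not
libata functions, and the labels for bits that do not appear in the log
are my own.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* ATA status register bits (standard layout). */
#define ATA_BUSY    0x80 /* device busy */
#define ATA_DRDY    0x40 /* device ready */
#define ATA_DF      0x20 /* device fault */
#define ATA_DSC     0x10 /* seek complete */
#define ATA_DRQ     0x08 /* data request */
#define ATA_ERR     0x01 /* error; details are in the error register */

/* ATA error register bit. */
#define ATA_ABORTED 0x04 /* command aborted by the device */

/* Hypothetical helper: write the names of all set status bits into buf. */
static void describe_status(uint8_t status, char *buf, size_t len)
{
    static const struct { uint8_t bit; const char *name; } bits[] = {
        { ATA_BUSY, "Busy" },        { ATA_DRDY, "DriveReady" },
        { ATA_DF,   "DeviceFault" }, { ATA_DSC,  "SeekComplete" },
        { ATA_DRQ,  "DataRequest" }, { ATA_ERR,  "Error" },
    };
    size_t used = 0;

    if (len == 0)
        return;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(status & bits[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated, stop appending */
        used += (size_t)n;
    }
}

/* Hypothetical helper: true when the device flagged an error and aborted. */
static int command_was_aborted(uint8_t status, uint8_t error)
{
    return (status & ATA_ERR) != 0 && (error & ATA_ABORTED) != 0;
}
```

With status 0x51 and error 0x04 this reports DriveReady, SeekComplete
and Error with the abort bit set, which is consistent with the
"Aborted Command" sense key in the log.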

If this is what you see, please apply the patch at the end of this
mail, which makes libata issue non-FUA commands even if FUA commands
are asked for.  After recompiling, repeat the steps above, create some
files, unmount, mount, verify the files, unmount and fsck.  All should
succeed without any complaint from the kernel.

If my guess turns out to be true, we'll need a blacklist for those
lying drives.  Damn it.

diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index 46c4cdb..6ba6ad2 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -565,7 +565,7 @@ static const u8 ata_rw_cmds[] = {
 	0,
 	0,
 	0,
-	ATA_CMD_WRITE_MULTI_FUA_EXT,
+	ATA_CMD_WRITE_MULTI_EXT,
 	/* pio */
 	ATA_CMD_PIO_READ,
 	ATA_CMD_PIO_WRITE,
@@ -583,7 +583,7 @@ static const u8 ata_rw_cmds[] = {
 	0,
 	0,
 	0,
-	ATA_CMD_WRITE_FUA_EXT
+	ATA_CMD_WRITE_EXT
 };
 
 /**

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26  5:50 regarding bug #5914 - fs corruption on SATA Tejun Heo
@ 2006-01-26  5:51 ` Tejun Heo
  2006-01-26  9:14   ` Nicolas Mailhot
  2006-01-26  9:18 ` Jens Axboe
  2006-01-26 16:41 ` David Greaves
  2 siblings, 1 reply; 33+ messages in thread
From: Tejun Heo @ 2006-01-26  5:51 UTC (permalink / raw)
  To: Nicolas.Mailhot; +Cc: Jeff Garzik, Jens Axboe, Linux-ide

Tejun Heo wrote:
> Hello, Nicolas.  Hello, all.
> 
> Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
> (forced-unit-access)thing made into the mainline lately, and it seems
> that your drive is reporting FUA support but doesn't really do it
> properly when it's asked to.
> 
> Can you try the followings to verify the problem?
> 
> 1. make a small partition on the affected drive and do mkfs.ext3 on it.
> 2. mount -o barrier new_partition /mnt/tmp

This should be 'mount -o barrier=1 new_partition /mnt/tmp'

> 3. cd /mnt/tmp; touch asdf; sync


-- 
tejun


* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26  5:51 ` Tejun Heo
@ 2006-01-26  9:14   ` Nicolas Mailhot
  2006-01-26  9:21     ` Jens Axboe
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-26  9:14 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, Jens Axboe, Linux-ide


On Thu, 26 January 2006 06:51, Tejun Heo wrote:
> Tejun Heo wrote:
>> Hello, Nicolas.  Hello, all.

Hi

>> Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
>> (forced-unit-access)thing made into the mainline lately, and it seems
>> that your drive is reporting FUA support but doesn't really do it
>> properly when it's asked to.
>>
>> Can you try the followings to verify the problem?
>>
>> 1. make a small partition on the affected drive and do mkfs.ext3 on it.
>> 2. mount -o barrier new_partition /mnt/tmp
>
> This should be 'mount -o barrier=1 new_partition /mnt/tmp'
>
>> 3. cd /mnt/tmp; touch asdf; sync

What parts can be done on a pre-breakage kernel and what parts on a
problem kernel?  (I ask because a problem kernel will corrupt basically
any file it writes to; even in single-user mode the damage is significant,
so I need to keep the corruption window to a minimum.)

Also I have plenty of space to create partitions but that will be
lvm-on-md-raid1 space (don't know if it matters, if it does I need to
learn to shrink the lvm/md)

Regards,

BTW, what's FUA in semi-layman terms?

-- 
Nicolas Mailhot



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26  5:50 regarding bug #5914 - fs corruption on SATA Tejun Heo
  2006-01-26  5:51 ` Tejun Heo
@ 2006-01-26  9:18 ` Jens Axboe
  2006-01-26 14:11   ` Bartlomiej Zolnierkiewicz
  2006-01-26 16:41 ` David Greaves
  2 siblings, 1 reply; 33+ messages in thread
From: Jens Axboe @ 2006-01-26  9:18 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Nicolas.Mailhot, Jeff Garzik, Linux-ide

On Thu, Jan 26 2006, Tejun Heo wrote:
> Hello, Nicolas.  Hello, all.
> 
> Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
> (forced-unit-access)thing made into the mainline lately, and it seems
> that your drive is reporting FUA support but doesn't really do it
> properly when it's asked to.

It's strange. I have 3 out of 4 drives in a box here reporting FUA
capability, and I have now tested all three of them both with plain FUA
writes and NCQ FUA tagged writes. I used data integrity verifying
writes, and the data is sound as well. fs likewise, I used ext3 mounted
with barriers enabled.

What exact model drive is this? It could also be a raid funny. Tejun's
proposal of testing the drive alone with ext3+barriers is a good one.

-- 
Jens Axboe


* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26  9:14   ` Nicolas Mailhot
@ 2006-01-26  9:21     ` Jens Axboe
  2006-01-26 10:01       ` Nicolas Mailhot
       [not found]       ` <5840.192.54.193.25.1138269692.squirrel@rousalka.dyndns.org>
  0 siblings, 2 replies; 33+ messages in thread
From: Jens Axboe @ 2006-01-26  9:21 UTC (permalink / raw)
  To: Nicolas Mailhot; +Cc: Tejun Heo, Jeff Garzik, Linux-ide

On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> 
> On Thu, 26 January 2006 06:51, Tejun Heo wrote:
> > Tejun Heo wrote:
> >> Hello, Nicolas.  Hello, all.
> 
> Hi
> 
> >> Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
> >> (forced-unit-access)thing made into the mainline lately, and it seems
> >> that your drive is reporting FUA support but doesn't really do it
> >> properly when it's asked to.
> >>
> >> Can you try the followings to verify the problem?
> >>
> >> 1. make a small partition on the affected drive and do mkfs.ext3 on it.
> >> 2. mount -o barrier new_partition /mnt/tmp
> >
> > This should be 'mount -o barrier=1 new_partition /mnt/tmp'
> >
> >> 3. cd /mnt/tmp; touch asdf; sync
> 
> What parts can be done one a pre-breakage kernel and what parts on a
> problem kernel (I ask this because a problem kernel will corrupt basically
> any file it writes to, even in single login mode the damage is significant
> so I need to limit the corruption window to minimum).

You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
instance.

> Also I have plenty of space to create partitions but that will be
> lvm-on-md-raid1 space (don't know if it matters, if it does I need to
> learn to shrink the lvm/md)

It would be best to exclude lvm/md for now, but I can see it might not
be so easy for you...

> BTW what's FUA in semi-layman terms ?

It stands for Forced Unit Access, basically a way to force the drive to
write through the cache directly to platter even when write back caching
is enabled. Or just bypass the cache on a read, but we use it for writes
with the barrier stuff.
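
To make that concrete, here is a simplified sketch in C of the choice
for a write that must reach the platter.  It assumes the ATA opcodes
WRITE DMA EXT (0x35), WRITE DMA FUA EXT (0x3D) and FLUSH CACHE EXT
(0xEA).  struct write_plan and plan_barrier_write() are hypothetical
names for illustration only, and the real block layer sequence has more
steps than this.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    ATA_CMD_WRITE_EXT     = 0x35, /* WRITE DMA EXT: may stay in the write cache */
    ATA_CMD_WRITE_FUA_EXT = 0x3D, /* WRITE DMA FUA EXT: must hit the media */
    ATA_CMD_FLUSH_EXT     = 0xEA, /* FLUSH CACHE EXT: drain the write cache */
};

/* Hypothetical description of how one durable write gets issued. */
struct write_plan {
    uint8_t write_cmd;   /* opcode used for the write itself */
    bool    needs_flush; /* true if a cache flush must follow the write */
};

/*
 * If the drive's FUA support can be trusted, a single FUA write is enough.
 * Otherwise fall back to an ordinary write followed by a cache flush.
 */
static struct write_plan plan_barrier_write(bool trust_fua)
{
    struct write_plan plan;

    if (trust_fua) {
        plan.write_cmd = ATA_CMD_WRITE_FUA_EXT;
        plan.needs_flush = false;
    } else {
        plan.write_cmd = ATA_CMD_WRITE_EXT;
        plan.needs_flush = true;
    }
    return plan;
}
```

A drive that advertises FUA but aborts or mishandles the FUA opcode
breaks the first branch, which is why the fallback path matters.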

-- 
Jens Axboe



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26  9:21     ` Jens Axboe
@ 2006-01-26 10:01       ` Nicolas Mailhot
       [not found]       ` <5840.192.54.193.25.1138269692.squirrel@rousalka.dyndns.org>
  1 sibling, 0 replies; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-26 10:01 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, Jeff Garzik, Linux-ide


On Thu, 26 January 2006 10:21, Jens Axboe wrote:
> On Thu, Jan 26 2006, Nicolas Mailhot wrote:

>> What parts can be done one a pre-breakage kernel and what parts on a
>> problem kernel (I ask this because a problem kernel will corrupt
>> basically
>> any file it writes to, even in single login mode the damage is
>> significant
>> so I need to limit the corruption window to minimum).
>
> You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
> instance.

Ok, I'll do the test this evening (CET) with the rawhide/davej
kernel of the day.

Regards,

-- 
Nicolas Mailhot



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26  9:18 ` Jens Axboe
@ 2006-01-26 14:11   ` Bartlomiej Zolnierkiewicz
  2006-01-26 14:27     ` Jens Axboe
  0 siblings, 1 reply; 33+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2006-01-26 14:11 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, Nicolas.Mailhot, Jeff Garzik, Linux-ide

On 1/26/06, Jens Axboe <axboe@suse.de> wrote:
> On Thu, Jan 26 2006, Tejun Heo wrote:
> > Hello, Nicolas.  Hello, all.
> >
> > Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
> > (forced-unit-access)thing made into the mainline lately, and it seems
> > that your drive is reporting FUA support but doesn't really do it
> > properly when it's asked to.
>
> It's strange. I have 3 out of 4 drives in a box here reporting FUA
> capability, and I have now tested all three of them both with plain FUA
> writes and NCQ FUA tagged writes. I used data integrity verifying
> writes, and the data is sound as well. fs likewise, I used ext3 mounted
> with barriers enabled.

You are just lucky ;-).  There are drives out there with buggy NCQ
support.  The Windows driver for Sil has them listed in its .inf file, and
you can also find discussions on various internet forums about problems
(including data corruption) with these drives and controllers supporting NCQ.

However, there is an official firmware update, so hopefully it can be fixed
(unfortunately we still need to blacklist buggy firmware revisions).

I wouldn't be surprised if there is a similar situation with FUA support.
Does anybody know if Windows uses FUA?

Bartlomiej


* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26 14:11   ` Bartlomiej Zolnierkiewicz
@ 2006-01-26 14:27     ` Jens Axboe
  0 siblings, 0 replies; 33+ messages in thread
From: Jens Axboe @ 2006-01-26 14:27 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Tejun Heo, Nicolas.Mailhot, Jeff Garzik, Linux-ide

On Thu, Jan 26 2006, Bartlomiej Zolnierkiewicz wrote:
> On 1/26/06, Jens Axboe <axboe@suse.de> wrote:
> > On Thu, Jan 26 2006, Tejun Heo wrote:
> > > Hello, Nicolas.  Hello, all.
> > >
> > > Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
> > > (forced-unit-access)thing made into the mainline lately, and it seems
> > > that your drive is reporting FUA support but doesn't really do it
> > > properly when it's asked to.
> >
> > It's strange. I have 3 out of 4 drives in a box here reporting FUA
> > capability, and I have now tested all three of them both with plain FUA
> > writes and NCQ FUA tagged writes. I used data integrity verifying
> > writes, and the data is sound as well. fs likewise, I used ext3 mounted
> > with barriers enabled.
> 
> You are just lucky ;-).  There are drives out there having buggy NCQ
> support.  Windows driver for Sil have them listed in .inf file, you can also

Oh yeah, I'm very well aware of NCQ firmware issues. Interesting about
the Sil inf file having them listed, I started a blacklist myself for
NCQ and I'll be sure to look at theirs!

> find discussions on various internet forums about problems (including
> data corruption) with these drives and controllers supporting NCQ.

But that's a separate story; it's just plain FUA that is at issue here.

> However there is official firmware update so hopefully it can be fixed
> (unfortunately we still need to blacklist buggy firmware revisions).
> 
> I wouldn't be surprised if there is similar situation with FUA support.
> Does anybody know if Windows use FUA?

Actually I'm a little surprised, honestly. It's a pretty simple feature.
It's not like NCQ, where I can understand that bugs can creep into the
firmware (although some are so buggy it's unbelievable).

-- 
Jens Axboe



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26  5:50 regarding bug #5914 - fs corruption on SATA Tejun Heo
  2006-01-26  5:51 ` Tejun Heo
  2006-01-26  9:18 ` Jens Axboe
@ 2006-01-26 16:41 ` David Greaves
  2006-01-26 16:58   ` Jeff Garzik
  2 siblings, 1 reply; 33+ messages in thread
From: David Greaves @ 2006-01-26 16:41 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Nicolas.Mailhot, Jeff Garzik, Jens Axboe, Linux-ide,
	Christopher Smith, Erik Slagter, hahn, mlaks, Soeren Sonnenburg,
	mlaks

Have you guys seen the parallel threads (in linux-ide and linux-raid)
that have been reporting very similar problems for a few days now?

Have a look for subjects such as
  Problems with multiple Promise SATA150 TX4 cards
  Possible libata/sata/Asus problem (was Re: Need to upgrade to latest
stable mdadm version?)

For me, please see:

http://marc.theaimsgroup.com/?l=linux-kernel&m=113769509617034&w=2


David
PS I'm using XFS on md5 and md1
PPS Buying a new £60+ PSU didn't fix my problem - <sigh>


Tejun Heo wrote:

>Hello, Nicolas.  Hello, all.
>
>Nicolas, I'm probably the guy who broke your filesystem.  :-p This FUA
>(forced-unit-access)thing made into the mainline lately, and it seems
>that your drive is reporting FUA support but doesn't really do it
>properly when it's asked to.
>
>Can you try the followings to verify the problem?
>
>1. make a small partition on the affected drive and do mkfs.ext3 on it.
>2. mount -o barrier new_partition /mnt/tmp
>3. cd /mnt/tmp; touch asdf; sync
>
>This should give something like the following.
>
>======
>ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
>ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
>ata2: status=0x51 { DriveReady SeekComplete Error }
>ata2: error=0x04 { DriveStatusError }
>ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
>ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
>ata2: status=0x51 { DriveReady SeekComplete Error }
>ata2: error=0x04 { DriveStatusError }
>ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
>ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
>ata2: status=0x51 { DriveReady SeekComplete Error }
>ata2: error=0x04 { DriveStatusError }
>ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
>ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
>ata2: status=0x51 { DriveReady SeekComplete Error }
>ata2: error=0x04 { DriveStatusError }
>ata2: port reset, p_is 40000001 is 2 pis 0 cmd 44017 tf 451 ss 123 se 0
>ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
>ata2: status=0x51 { DriveReady SeekComplete Error }
>ata2: error=0x04 { DriveStatusError }
>sd 2:0:0:0: SCSI error: return code = 0x8000002
>sdc: Current: sense key: Aborted Command
>    Additional sense: No additional sense information
>end_request: I/O error, dev sdc, sector 4359
>Buffer I/O error on device sdc1, logical block 537
>lost page write due to I/O error on sdc1
>Aborting journal on device sdc1.
>journal commit I/O error
>======
>
>The ext3 fs will back off and won't use any barrier from this point.
>
>If this is what you see, please apply the patch at the end of this
>mail, which makes libata issue non-FUA commmands even if FUA commands
>are asked for.  After recompiling repeat above, create some files,
>unmount, mount, verify stuff, unmount and fsck...  All should succeed
>without any complaint from the kernel.
>
>If my guess turns out to be true, we'll need a blacklist for those
>lying drives.  Damn it.
>
>diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
>index 46c4cdb..6ba6ad2 100644
>--- a/drivers/scsi/libata-core.c
>+++ b/drivers/scsi/libata-core.c
>@@ -565,7 +565,7 @@ static const u8 ata_rw_cmds[] = {
> 	0,
> 	0,
> 	0,
>-	ATA_CMD_WRITE_MULTI_FUA_EXT,
>+	ATA_CMD_WRITE_MULTI_EXT,
> 	/* pio */
> 	ATA_CMD_PIO_READ,
> 	ATA_CMD_PIO_WRITE,
>@@ -583,7 +583,7 @@ static const u8 ata_rw_cmds[] = {
> 	0,
> 	0,
> 	0,
>-	ATA_CMD_WRITE_FUA_EXT
>+	ATA_CMD_WRITE_EXT
> };
> 
> /**


-- 



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26 16:41 ` David Greaves
@ 2006-01-26 16:58   ` Jeff Garzik
  2006-01-26 17:15     ` David Greaves
  2006-01-26 17:20     ` regarding bug #5914 - fs corruption on SATA Soeren Sonnenburg
  0 siblings, 2 replies; 33+ messages in thread
From: Jeff Garzik @ 2006-01-26 16:58 UTC (permalink / raw)
  To: David Greaves
  Cc: Tejun Heo, Nicolas.Mailhot, Jens Axboe, Linux-ide,
	Christopher Smith, Erik Slagter, hahn, mlaks, Soeren Sonnenburg,
	mlaks

David Greaves wrote:
> Have a look for subjects such as
>   Problems with multiple Promise SATA150 TX4 cards

This is almost certainly either a power or PCI bus/slot issue.


>   Possible libata/sata/Asus problem (was Re: Need to upgrade to latest
> stable mdadm version?)

Highly likely to be a motherboard/BIOS issue related to properly tuning 
and timing the hardware.

HOWEVER, libata can help (via Tejun's recent patches) by properly 
handling the error when it is thrown to us by the hardware.

	Jeff




* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26 16:58   ` Jeff Garzik
@ 2006-01-26 17:15     ` David Greaves
  2006-02-07 18:35       ` SMART on SATA reporting errors? (was Re: regarding bug #5914 - fs corruption on SATA) David Greaves
  2006-01-26 17:20     ` regarding bug #5914 - fs corruption on SATA Soeren Sonnenburg
  1 sibling, 1 reply; 33+ messages in thread
From: David Greaves @ 2006-01-26 17:15 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Nicolas.Mailhot, Jens Axboe, Linux-ide,
	Christopher Smith, Erik Slagter, hahn, mlaks, Soeren Sonnenburg,
	mlaks

Jeff Garzik wrote:

> David Greaves wrote:
>
>>   Possible libata/sata/Asus problem (was Re: Need to upgrade to latest
>> stable mdadm version?)
>
> Highly likely to be a motherboard/BIOS issue related to properly
> tuning and timing the hardware.
>
> HOWEVER, libata can help (via Tejun's recent patches) by properly
> handling the error when throw to us by hardware.

OK - I thought my messages:

Jan 20 06:25:04 haze kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 20 06:25:04 haze kernel: ata2: error=0x04 { DriveStatusError }
Jan 20 06:25:10 haze kernel: ata2: no sense translation for status: 0x51
Jan 20 06:25:10 haze kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 20 06:25:18 haze kernel: ata2: no sense translation for status: 0x51
Jan 20 06:25:18 haze kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 20 06:25:18 haze kernel: ata2: no sense translation for status: 0x51
Jan 20 06:25:18 haze kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 20 06:25:20 haze kernel: ata2: no sense translation for status: 0x51
Jan 20 06:25:20 haze kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 20 06:25:22 haze kernel: ata2: no sense translation for status: 0x51
Jan 20 06:25:22 haze kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 20 06:25:52 haze kernel: ata2: no sense translation for status: 0x51
Jan 20 06:25:52 haze kernel: ata2: status=0x51 { DriveReady SeekComplete Error }
Jan 20 06:25:52 haze kernel: sd 1:0:0:0: SCSI error: return code = 0x8000002
Jan 20 06:25:52 haze kernel: sdb: Current: sense key: Medium Error
Jan 20 06:25:52 haze kernel:     Additional sense: Unrecovered read error - auto reallocate failed
Jan 20 06:25:52 haze kernel: end_request: I/O error, dev sdb, sector 390787713

bore a certain similarity to those in Tejun's/Nicolas' mail.

Different problem? As irq might ask: "does anybody care?" :)

(And yes, badblocks and SMART both report that all is well.)

David

-- 



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26 16:58   ` Jeff Garzik
  2006-01-26 17:15     ` David Greaves
@ 2006-01-26 17:20     ` Soeren Sonnenburg
  1 sibling, 0 replies; 33+ messages in thread
From: Soeren Sonnenburg @ 2006-01-26 17:20 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: David Greaves, Tejun Heo, Nicolas.Mailhot, Jens Axboe, Linux-ide,
	Christopher Smith, Erik Slagter, hahn, mlaks, mlaks

On Thu, 2006-01-26 at 11:58 -0500, Jeff Garzik wrote:
> David Greaves wrote:
> > Have a look for subjects such as
> >   Problems with multiple Promise SATA150 TX4 cards
> 
> This is almost certainly either a power or PCI bus/slot issue.

So you mean the freeze I am observing when I copy files from the SATA
disk to some ieee1394 device (they share interrupt 16)

 16:     430796   IO-APIC-level  ide2, ide3, libata, ohci1394

leading to lots of output such as

ata2: translated ATA stat/err 0x51/0c to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x0c { DriveStatusError }

could be caused by just that?

> >   Possible libata/sata/Asus problem (was Re: Need to upgrade to latest
> > stable mdadm version?)
> 
> Highly likely to be a motherboard/BIOS issue related to properly tuning 
> and timing the hardware.
> 
> HOWEVER, libata can help (via Tejun's recent patches) by properly 
> handling the error when throw to us by hardware.

So it could help in the first case, but also with this: I can freeze the
system by running hdparm -y /dev/sda ... will this patch also help in
that case?

Soeren.
-- 
Sometimes, there's a moment as you're waking, when you become aware of
the real world around you, but you're still dreaming.



* Re: regarding bug #5914 - fs corruption on SATA
       [not found]       ` <5840.192.54.193.25.1138269692.squirrel@rousalka.dyndns.org>
@ 2006-01-26 21:04         ` Nicolas Mailhot
  2006-01-27  8:13           ` Jens Axboe
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-26 21:04 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, Jeff Garzik, Linux-ide

[-- Attachment #1: Type: text/plain, Size: 911 bytes --]

On Thursday, 26 January 2006 at 11:01 +0100, Nicolas Mailhot wrote:
> On Thu, 26 January 2006 10:21, Jens Axboe wrote:
> > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> 
> >> What parts can be done one a pre-breakage kernel and what parts on a
> >> problem kernel (I ask this because a problem kernel will corrupt
> >> basically
> >> any file it writes to, even in single login mode the damage is
> >> significant
> >> so I need to limit the corruption window to minimum).
> >
> > You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
> > instance.
> 
> Ok, I'll do the test this evening (CET) with the rawhide/davej kernel-of
> the day.

I applied the FUA backout patch and the kernel booted beautifully.
Now I guess I need to see if Maxtor has released fixed firmware, right?
(Is it possible to change the firmware on a running system?)

Regards,

-- 
Nicolas Mailhot

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]


* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-26 21:04         ` Nicolas Mailhot
@ 2006-01-27  8:13           ` Jens Axboe
  2006-01-27  8:53             ` Nicolas Mailhot
  2006-01-27 12:12             ` Ric Wheeler
  0 siblings, 2 replies; 33+ messages in thread
From: Jens Axboe @ 2006-01-27  8:13 UTC (permalink / raw)
  To: Nicolas Mailhot; +Cc: Tejun Heo, Jeff Garzik, Linux-ide

On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> On Thursday, 26 January 2006 at 11:01 +0100, Nicolas Mailhot wrote:
> > On Thu, 26 January 2006 10:21, Jens Axboe wrote:
> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > 
> > >> What parts can be done one a pre-breakage kernel and what parts on a
> > >> problem kernel (I ask this because a problem kernel will corrupt
> > >> basically
> > >> any file it writes to, even in single login mode the damage is
> > >> significant
> > >> so I need to limit the corruption window to minimum).
> > >
> > > You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
> > > instance.
> > 
> > Ok, I'll do the test this evening (CET) with the rawhide/davej kernel-of
> > the day.
> 
> I applied the fua backout patch and the kernel booted beautifully.
> Now I guess I need to see if Maxtor released a fixed firmware right ?
> (is it possible to change the firmware on a running system ?)

If you can get updated firmware, it is usually installed by booting from
a DOS floppy and running a special flash utility from there. Can you send
me the hdparm -I /dev/sdX output of the problem drive? I think we should
just blacklist it for FUA. This bug is so obscure that I think blacklisting
is a better solution than adding a FUA disable module parameter at this point.
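
A blacklist along those lines could look roughly like the sketch below.
This is only an illustration: the struct, the function names and the
model/firmware strings are placeholders, not libata code.  The real
entries would use the model and firmware strings the drive reports in
its IDENTIFY data (the same ones hdparm -I prints).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical blacklist entry; fw_rev == NULL matches every revision. */
struct fua_blacklist_entry {
    const char *model;
    const char *fw_rev;
};

/* Placeholder entries only; no real drive models are listed here. */
static const struct fua_blacklist_entry fua_blacklist[] = {
    { "EXAMPLE-DRIVE-1", NULL },    /* all firmware revisions */
    { "EXAMPLE-DRIVE-2", "FW1.0" }, /* only this firmware revision */
};

/* True if the given model/firmware pair is on the blacklist. */
static bool fua_blacklisted(const char *model, const char *fw_rev)
{
    for (size_t i = 0; i < sizeof(fua_blacklist) / sizeof(fua_blacklist[0]); i++) {
        const struct fua_blacklist_entry *e = &fua_blacklist[i];

        if (strcmp(e->model, model) != 0)
            continue;
        if (e->fw_rev == NULL || strcmp(e->fw_rev, fw_rev) == 0)
            return true;
    }
    return false;
}

/* Use FUA only if the drive advertises it and is not blacklisted. */
static bool use_fua(bool reports_fua, const char *model, const char *fw_rev)
{
    return reports_fua && !fua_blacklisted(model, fw_rev);
}
```

Keying on the firmware revision as well as the model would let a fixed
firmware release come off the list without unblocking the broken one.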

-- 
Jens Axboe



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  8:13           ` Jens Axboe
@ 2006-01-27  8:53             ` Nicolas Mailhot
  2006-01-27  9:10               ` Jens Axboe
  2006-01-27 12:12             ` Ric Wheeler
  1 sibling, 1 reply; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-27  8:53 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, Jeff Garzik, Linux-ide


On Fri, 27 January 2006 09:13, Jens Axboe wrote:
> On Thu, Jan 26 2006, Nicolas Mailhot wrote:
>> On Thursday, 26 January 2006 at 11:01 +0100, Nicolas Mailhot wrote:
>> > On Thu, 26 January 2006 10:21, Jens Axboe wrote:
>> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
>> >
>> > >> What parts can be done one a pre-breakage kernel and what parts on
>> a
>> > >> problem kernel (I ask this because a problem kernel will corrupt
>> > >> basically
>> > >> any file it writes to, even in single login mode the damage is
>> > >> significant
>> > >> so I need to limit the corruption window to minimum).
>> > >
>> > > You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
>> > > instance.
>> >
>> > Ok, I'll do the test this evening (CET) with the rawhide/davej
>> kernel-of
>> > the day.
>>
>> I applied the fua backout patch and the kernel booted beautifully.
>> Now I guess I need to see if Maxtor released a fixed firmware right ?
>> (is it possible to change the firmware on a running system ?)
>
> If you can get an update firmware, it is usually done by booting from
> DOS floppy and running a special flash utility from there. Can you send
> me the hdparm -I /dev/sdX output of the problem drive? I think we should
> just blacklist it for FUA. This bug is so obscure I think it's a better
> solution than adding a FUA disable module parameter at this point.

There is already fairly complete smart info available in
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177951
(https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123604
https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123605)

I'll add hdparm info this evening if it's not sufficient

Regards,

-- 
Nicolas Mailhot



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  8:53             ` Nicolas Mailhot
@ 2006-01-27  9:10               ` Jens Axboe
  2006-01-27  9:20                 ` Jens Axboe
  0 siblings, 1 reply; 33+ messages in thread
From: Jens Axboe @ 2006-01-27  9:10 UTC (permalink / raw)
  To: Nicolas Mailhot; +Cc: Tejun Heo, Jeff Garzik, Linux-ide

On Fri, Jan 27 2006, Nicolas Mailhot wrote:
> 
> On Fri, 27 January 2006 09:13, Jens Axboe wrote:
> > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> >> On Thursday, 26 January 2006 at 11:01 +0100, Nicolas Mailhot wrote:
> >> > On Thu, 26 January 2006 10:21, Jens Axboe wrote:
> >> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> >> >
> >> > >> What parts can be done one a pre-breakage kernel and what parts on
> >> a
> >> > >> problem kernel (I ask this because a problem kernel will corrupt
> >> > >> basically
> >> > >> any file it writes to, even in single login mode the damage is
> >> > >> significant
> >> > >> so I need to limit the corruption window to minimum).
> >> > >
> >> > > You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
> >> > > instance.
> >> >
> >> > Ok, I'll do the test this evening (CET) with the rawhide/davej
> >> kernel-of
> >> > the day.
> >>
> >> I applied the fua backout patch and the kernel booted beautifully.
> >> Now I guess I need to see if Maxtor released a fixed firmware right ?
> >> (is it possible to change the firmware on a running system ?)
> >
> > If you can get an update firmware, it is usually done by booting from
> > DOS floppy and running a special flash utility from there. Can you send
> > me the hdparm -I /dev/sdX output of the problem drive? I think we should
> > just blacklist it for FUA. This bug is so obscure I think it's a better
> > solution than adding a FUA disable module parameter at this point.
> 
> There is already fairly complete smart info available in
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177951
> (https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123604
> https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123605)

I didn't notice the smart info, yes that holds enough information.
Thanks!

-- 
Jens Axboe



* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  9:10               ` Jens Axboe
@ 2006-01-27  9:20                 ` Jens Axboe
  2006-01-27  9:27                   ` Nicolas Mailhot
  2006-01-27  9:46                   ` Bartlomiej Zolnierkiewicz
  0 siblings, 2 replies; 33+ messages in thread
From: Jens Axboe @ 2006-01-27  9:20 UTC (permalink / raw)
  To: Nicolas Mailhot; +Cc: Tejun Heo, Jeff Garzik, Linux-ide

On Fri, Jan 27 2006, Jens Axboe wrote:
> On Fri, Jan 27 2006, Nicolas Mailhot wrote:
> > 
> > On Fri, January 27, 2006 09:13, Jens Axboe wrote:
> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > >> On Thursday, January 26, 2006 at 11:01 +0100, Nicolas Mailhot wrote:
> > >> > On Thu, January 26, 2006 10:21, Jens Axboe wrote:
> > >> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > >> >
> > >> > >> What parts can be done one a pre-breakage kernel and what parts on
> > >> a
> > >> > >> problem kernel (I ask this because a problem kernel will corrupt
> > >> > >> basically
> > >> > >> any file it writes to, even in single login mode the damage is
> > >> > >> significant
> > >> > >> so I need to limit the corruption window to minimum).
> > >> > >
> > >> > > You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
> > >> > > instance.
> > >> >
> > >> > Ok, I'll do the test this evening (CET) with the rawhide/davej
> > >> kernel-of
> > >> > the day.
> > >>
> > >> I applied the fua backout patch and the kernel booted beautifully.
> > >> Now I guess I need to see if Maxtor released a fixed firmware right ?
> > >> (is it possible to change the firmware on a running system ?)
> > >
> > > If you can get an update firmware, it is usually done by booting from
> > > DOS floppy and running a special flash utility from there. Can you send
> > > me the hdparm -I /dev/sdX output of the problem drive? I think we should
> > > just blacklist it for FUA. This bug is so obscure I think it's a better
> > > solution than adding a FUA disable module parameter at this point.
> > 
> > There is already fairly complete smart info available in
> > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177951
> > (https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123604
> > https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123605)
> 
> I didn't notice the smart info, yes that holds enough information.
> Thanks!

Can you try and boot a kernel with this patch applied (needs to be one
of the newer ones, of course) and see if you still see the "w/ FUA"
string next to your Maxtor drive(s)?

diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
index cfbceb5..3feda07 100644
--- a/drivers/scsi/libata-scsi.c
+++ b/drivers/scsi/libata-scsi.c
@@ -1700,6 +1700,28 @@ static unsigned int ata_msense_rw_recove
 	return sizeof(def_rw_recovery_mpage);
 }
 
+/*
+ * We can turn this into a real blacklist if it's needed, for now just
+ * blacklist any Maxtor BANC1G10 revision firmware
+ */
+static int ata_dev_supports_fua(u16 *id)
+{
+	unsigned char model[41], fw[9];
+
+	if (!ata_id_has_fua(id))
+		return 0;
+
+	ata_dev_id_string(id, model, ATA_ID_PROD_OFS, sizeof(model));
+	ata_dev_id_string(id, fw, ATA_ID_FW_REV_OFS, sizeof(fw));
+
+	if (strncmp(model, "Maxtor", 6))
+		return 1;
+	if (strncmp(model, "BANC1G10", 8))
+		return 1;
+
+	return 0; /* blacklisted */
+}
+
 /**
  *	ata_scsiop_mode_sense - Simulate MODE SENSE 6, 10 commands
  *	@args: device IDENTIFY data / SCSI command of interest.
@@ -1797,7 +1819,7 @@ unsigned int ata_scsiop_mode_sense(struc
 		return 0;
 
 	dpofua = 0;
-	if (ata_id_has_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
+	if (ata_dev_supports_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
 	    (!(dev->flags & ATA_DFLAG_PIO) || dev->multi_count))
 		dpofua = 1 << 4;
 

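For readers following the one-line change in ata_scsiop_mode_sense(), the condition can be read as a small decision function. The sketch below is a user-space illustration only, not the libata code: the flag values DFLAG_LBA48 and DFLAG_PIO, the helper names, and the explicit `blacklisted` argument are assumptions made for the example. The IDENTIFY word 84 bit 6 test and the DPOFUA bit position (bit 4 of the MODE SENSE device-specific parameter byte) follow the ATA and SCSI specifications as I understand them.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration; the kernel's values differ. */
#define DFLAG_LBA48 (1u << 0)
#define DFLAG_PIO   (1u << 1)

/* ATA IDENTIFY DEVICE word 84, bit 6: WRITE DMA FUA EXT supported. */
static int id_has_fua(const uint16_t *id)
{
    return (id[84] & (1u << 6)) != 0;
}

/*
 * Returns the DPOFUA bit (0x10) to report in the MODE SENSE header,
 * or 0 if FUA should not be advertised. Mirrors the patched condition:
 * device claims FUA, is not blacklisted, uses LBA48, and is either
 * not in PIO mode or has a non-zero multi-sector count.
 */
static uint8_t mode_sense_dpofua(const uint16_t *id, unsigned int flags,
                                 unsigned int multi_count, int blacklisted)
{
    if (blacklisted || !id_has_fua(id))
        return 0;
    if (!(flags & DFLAG_LBA48))
        return 0;
    if ((flags & DFLAG_PIO) && !multi_count)
        return 0;
    return 1 << 4;
}
```

With this shape, a blacklisted drive reports 0 even though its IDENTIFY data claims FUA, which is what should make the "w/ FUA" string disappear for that drive.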

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  9:20                 ` Jens Axboe
@ 2006-01-27  9:27                   ` Nicolas Mailhot
  2006-01-27  9:46                   ` Bartlomiej Zolnierkiewicz
  1 sibling, 0 replies; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-27  9:27 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, Jeff Garzik, Linux-ide


On Fri, January 27, 2006 10:20, Jens Axboe wrote:
> On Fri, Jan 27 2006, Jens Axboe wrote:
>> On Fri, Jan 27 2006, Nicolas Mailhot wrote:
>> >
>> > On Fri, January 27, 2006 09:13, Jens Axboe wrote:
>> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
>> > >> On Thursday, January 26, 2006 at 11:01 +0100, Nicolas Mailhot wrote:
>> > >> > On Thu, January 26, 2006 10:21, Jens Axboe wrote:
>> > >> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
>> > >> >
>> > >> > >> What parts can be done one a pre-breakage kernel and what
>> parts on
>> > >> a
>> > >> > >> problem kernel (I ask this because a problem kernel will
>> corrupt
>> > >> > >> basically
>> > >> > >> any file it writes to, even in single login mode the damage is
>> > >> > >> significant
>> > >> > >> so I need to limit the corruption window to minimum).
>> > >> > >
>> > >> > > You need a new kernel (after the barrier rework), so 2.6.16-rc1
>> for
>> > >> > > instance.
>> > >> >
>> > >> > Ok, I'll do the test this evening (CET) with the rawhide/davej
>> > >> kernel-of
>> > >> > the day.
>> > >>
>> > >> I applied the fua backout patch and the kernel booted beautifully.
>> > >> Now I guess I need to see if Maxtor released a fixed firmware right
>> ?
>> > >> (is it possible to change the firmware on a running system ?)
>> > >
>> > > If you can get an update firmware, it is usually done by booting
>> from
>> > > DOS floppy and running a special flash utility from there. Can you
>> send
>> > > me the hdparm -I /dev/sdX output of the problem drive? I think we
>> should
>> > > just blacklist it for FUA. This bug is so obscure I think it's a
>> better
>> > > solution than adding a FUA disable module parameter at this point.
>> >
>> > There is already fairly complete smart info available in
>> > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177951
>> > (https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123604
>> > https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123605)
>>
>> I didn't notice the smart info, yes that holds enough information.
>> Thanks!
>
> Can you try and boot a kernel with this patch applied (needs to be one
> of the newer ones, of course) and see if you still see the "w/ FUA"
> string next to your Maxtor drive(s)?
>
> diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
> index cfbceb5..3feda07 100644
> --- a/drivers/scsi/libata-scsi.c
> +++ b/drivers/scsi/libata-scsi.c
> @@ -1700,6 +1700,28 @@ static unsigned int ata_msense_rw_recove
>  	return sizeof(def_rw_recovery_mpage);
>  }
>
> +/*
> + * We can turn this into a real blacklist if it's needed, for now just
> + * blacklist any Maxtor BANC1G10 revision firmware
> + */
> +static int ata_dev_supports_fua(u16 *id)
> +{
> +	unsigned char model[41], fw[9];
> +
> +	if (!ata_id_has_fua(id))
> +		return 0;
> +
> +	ata_dev_id_string(id, model, ATA_ID_PROD_OFS, sizeof(model));
> +	ata_dev_id_string(id, fw, ATA_ID_FW_REV_OFS, sizeof(fw));
> +
> +	if (strncmp(model, "Maxtor", 6))
> +		return 1;
> +	if (strncmp(model, "BANC1G10", 8))
> +		return 1;
> +
> +	return 0; /* blacklisted */
> +}
> +
>  /**
>   *	ata_scsiop_mode_sense - Simulate MODE SENSE 6, 10 commands
>   *	@args: device IDENTIFY data / SCSI command of interest.
> @@ -1797,7 +1819,7 @@ unsigned int ata_scsiop_mode_sense(struc
>  		return 0;
>
>  	dpofua = 0;
> -	if (ata_id_has_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
> +	if (ata_dev_supports_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
>  	    (!(dev->flags & ATA_DFLAG_PIO) || dev->multi_count))
>  		dpofua = 1 << 4;
>

Will do this evening (CET)

-- 
Nicolas Mailhot


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  9:20                 ` Jens Axboe
  2006-01-27  9:27                   ` Nicolas Mailhot
@ 2006-01-27  9:46                   ` Bartlomiej Zolnierkiewicz
  2006-01-27  9:50                     ` Jens Axboe
  1 sibling, 1 reply; 33+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2006-01-27  9:46 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Nicolas Mailhot, Tejun Heo, Jeff Garzik, Linux-ide

On 1/27/06, Jens Axboe <axboe@suse.de> wrote:
> On Fri, Jan 27 2006, Jens Axboe wrote:
> > On Fri, Jan 27 2006, Nicolas Mailhot wrote:
> > >
> > > > On Fri, January 27, 2006 09:13, Jens Axboe wrote:
> > > > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > > > >> On Thursday, January 26, 2006 at 11:01 +0100, Nicolas Mailhot wrote:
> > > > >> > On Thu, January 26, 2006 10:21, Jens Axboe wrote:
> > > >> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > > >> >
> > > >> > >> What parts can be done one a pre-breakage kernel and what parts on
> > > >> a
> > > >> > >> problem kernel (I ask this because a problem kernel will corrupt
> > > >> > >> basically
> > > >> > >> any file it writes to, even in single login mode the damage is
> > > >> > >> significant
> > > >> > >> so I need to limit the corruption window to minimum).
> > > >> > >
> > > >> > > You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
> > > >> > > instance.
> > > >> >
> > > >> > Ok, I'll do the test this evening (CET) with the rawhide/davej
> > > >> kernel-of
> > > >> > the day.
> > > >>
> > > >> I applied the fua backout patch and the kernel booted beautifully.
> > > >> Now I guess I need to see if Maxtor released a fixed firmware right ?
> > > >> (is it possible to change the firmware on a running system ?)
> > > >
> > > > If you can get an update firmware, it is usually done by booting from
> > > > DOS floppy and running a special flash utility from there. Can you send
> > > > me the hdparm -I /dev/sdX output of the problem drive? I think we should
> > > > just blacklist it for FUA. This bug is so obscure I think it's a better
> > > > solution than adding a FUA disable module parameter at this point.
> > >
> > > There is already fairly complete smart info available in
> > > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177951
> > > (https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123604
> > > https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123605)
> >
> > I didn't notice the smart info, yes that holds enough information.
> > Thanks!
>
> Can you try and boot a kernel with this patch applied (needs to be one
> of the newer ones, of course) and see if you still see the "w/ FUA"
> string next to your Maxtor drive(s)?
>
> diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
> index cfbceb5..3feda07 100644
> --- a/drivers/scsi/libata-scsi.c
> +++ b/drivers/scsi/libata-scsi.c
> @@ -1700,6 +1700,28 @@ static unsigned int ata_msense_rw_recove
>         return sizeof(def_rw_recovery_mpage);
>  }
>
> +/*
> + * We can turn this into a real blacklist if it's needed, for now just
> + * blacklist any Maxtor BANC1G10 revision firmware
> + */
> +static int ata_dev_supports_fua(u16 *id)
> +{
> +       unsigned char model[41], fw[9];
> +
> +       if (!ata_id_has_fua(id))
> +               return 0;
> +
> +       ata_dev_id_string(id, model, ATA_ID_PROD_OFS, sizeof(model));
> +       ata_dev_id_string(id, fw, ATA_ID_FW_REV_OFS, sizeof(fw));
> +
> +       if (strncmp(model, "Maxtor", 6))
> +               return 1;
> +       if (strncmp(model, "BANC1G10", 8))
> +               return 1;

s/model/fw/

?

Bartlomiej

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  9:46                   ` Bartlomiej Zolnierkiewicz
@ 2006-01-27  9:50                     ` Jens Axboe
  2006-01-27 19:37                       ` Nicolas Mailhot
  0 siblings, 1 reply; 33+ messages in thread
From: Jens Axboe @ 2006-01-27  9:50 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Nicolas Mailhot, Tejun Heo, Jeff Garzik, Linux-ide

On Fri, Jan 27 2006, Bartlomiej Zolnierkiewicz wrote:
> On 1/27/06, Jens Axboe <axboe@suse.de> wrote:
> > On Fri, Jan 27 2006, Jens Axboe wrote:
> > > On Fri, Jan 27 2006, Nicolas Mailhot wrote:
> > > >
> > > > > On Fri, January 27, 2006 09:13, Jens Axboe wrote:
> > > > > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > > > > >> On Thursday, January 26, 2006 at 11:01 +0100, Nicolas Mailhot wrote:
> > > > > >> > On Thu, January 26, 2006 10:21, Jens Axboe wrote:
> > > > >> > > On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > > > >> >
> > > > >> > >> What parts can be done one a pre-breakage kernel and what parts on
> > > > >> a
> > > > >> > >> problem kernel (I ask this because a problem kernel will corrupt
> > > > >> > >> basically
> > > > >> > >> any file it writes to, even in single login mode the damage is
> > > > >> > >> significant
> > > > >> > >> so I need to limit the corruption window to minimum).
> > > > >> > >
> > > > >> > > You need a new kernel (after the barrier rework), so 2.6.16-rc1 for
> > > > >> > > instance.
> > > > >> >
> > > > >> > Ok, I'll do the test this evening (CET) with the rawhide/davej
> > > > >> kernel-of
> > > > >> > the day.
> > > > >>
> > > > >> I applied the fua backout patch and the kernel booted beautifully.
> > > > >> Now I guess I need to see if Maxtor released a fixed firmware right ?
> > > > >> (is it possible to change the firmware on a running system ?)
> > > > >
> > > > > If you can get an update firmware, it is usually done by booting from
> > > > > DOS floppy and running a special flash utility from there. Can you send
> > > > > me the hdparm -I /dev/sdX output of the problem drive? I think we should
> > > > > just blacklist it for FUA. This bug is so obscure I think it's a better
> > > > > solution than adding a FUA disable module parameter at this point.
> > > >
> > > > There is already fairly complete smart info available in
> > > > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=177951
> > > > (https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123604
> > > > https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123605)
> > >
> > > I didn't notice the smart info, yes that holds enough information.
> > > Thanks!
> >
> > Can you try and boot a kernel with this patch applied (needs to be one
> > of the newer ones, of course) and see if you still see the "w/ FUA"
> > string next to your Maxtor drive(s)?
> >
> > diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
> > index cfbceb5..3feda07 100644
> > --- a/drivers/scsi/libata-scsi.c
> > +++ b/drivers/scsi/libata-scsi.c
> > @@ -1700,6 +1700,28 @@ static unsigned int ata_msense_rw_recove
> >         return sizeof(def_rw_recovery_mpage);
> >  }
> >
> > +/*
> > + * We can turn this into a real blacklist if it's needed, for now just
> > + * blacklist any Maxtor BANC1G10 revision firmware
> > + */
> > +static int ata_dev_supports_fua(u16 *id)
> > +{
> > +       unsigned char model[41], fw[9];
> > +
> > +       if (!ata_id_has_fua(id))
> > +               return 0;
> > +
> > +       ata_dev_id_string(id, model, ATA_ID_PROD_OFS, sizeof(model));
> > +       ata_dev_id_string(id, fw, ATA_ID_FW_REV_OFS, sizeof(fw));
> > +
> > +       if (strncmp(model, "Maxtor", 6))
> > +               return 1;
> > +       if (strncmp(model, "BANC1G10", 8))
> > +               return 1;
> 
> s/model/fw/

Of course, silly typo! Thanks for catching that. Updated patch below.

diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
index cfbceb5..3feda07 100644
--- a/drivers/scsi/libata-scsi.c
+++ b/drivers/scsi/libata-scsi.c
@@ -1700,6 +1700,28 @@ static unsigned int ata_msense_rw_recove
 	return sizeof(def_rw_recovery_mpage);
 }
 
+/*
+ * We can turn this into a real blacklist if it's needed, for now just
+ * blacklist any Maxtor BANC1G10 revision firmware
+ */
+static int ata_dev_supports_fua(u16 *id)
+{
+	unsigned char model[41], fw[9];
+
+	if (!ata_id_has_fua(id))
+		return 0;
+
+	ata_dev_id_string(id, model, ATA_ID_PROD_OFS, sizeof(model));
+	ata_dev_id_string(id, fw, ATA_ID_FW_REV_OFS, sizeof(fw));
+
+	if (strncmp(model, "Maxtor", 6))
+		return 1;
+	if (strncmp(fw, "BANC1G10", 8))
+		return 1;
+
+	return 0; /* blacklisted */
+}
+
 /**
  *	ata_scsiop_mode_sense - Simulate MODE SENSE 6, 10 commands
  *	@args: device IDENTIFY data / SCSI command of interest.
@@ -1797,7 +1819,7 @@ unsigned int ata_scsiop_mode_sense(struc
 		return 0;
 
 	dpofua = 0;
-	if (ata_id_has_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
+	if (ata_dev_supports_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
 	    (!(dev->flags & ATA_DFLAG_PIO) || dev->multi_count))
 		dpofua = 1 << 4;
 

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  8:13           ` Jens Axboe
  2006-01-27  8:53             ` Nicolas Mailhot
@ 2006-01-27 12:12             ` Ric Wheeler
  2006-01-27 12:23               ` Jens Axboe
  1 sibling, 1 reply; 33+ messages in thread
From: Ric Wheeler @ 2006-01-27 12:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Nicolas Mailhot, Tejun Heo, Jeff Garzik, Linux-ide

Jens Axboe wrote:

>On Thu, Jan 26 2006, Nicolas Mailhot wrote:
>  
>
>>
>>I applied the fua backout patch and the kernel booted beautifully.
>>Now I guess I need to see if Maxtor released a fixed firmware right ?
>>(is it possible to change the firmware on a running system ?)
>>    
>>
>
>If you can get an update firmware, it is usually done by booting from
>DOS floppy and running a special flash utility from there. Can you send
>me the hdparm -I /dev/sdX output of the problem drive? I think we should
>just blacklist it for FUA. This bug is so obscure I think it's a better
>solution than adding a FUA disable module parameter at this point.
>
>  
>
I am not sure that drive vendors support firmware upgrades - the 
downside is that you can produce a nice paper weight if the firmware 
upgrade fails ;-)

Also, there are specific versions where I know that you cannot jump from 
firmware version X to version X + 1. 

ric


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27 12:12             ` Ric Wheeler
@ 2006-01-27 12:23               ` Jens Axboe
  0 siblings, 0 replies; 33+ messages in thread
From: Jens Axboe @ 2006-01-27 12:23 UTC (permalink / raw)
  To: Ric Wheeler; +Cc: Nicolas Mailhot, Tejun Heo, Jeff Garzik, Linux-ide

On Fri, Jan 27 2006, Ric Wheeler wrote:
> Jens Axboe wrote:
> 
> >On Thu, Jan 26 2006, Nicolas Mailhot wrote:
> > 
> >
> >>
> >>I applied the fua backout patch and the kernel booted beautifully.
> >>Now I guess I need to see if Maxtor released a fixed firmware right ?
> >>(is it possible to change the firmware on a running system ?)
> >>   
> >>
> >
> >If you can get an update firmware, it is usually done by booting from
> >DOS floppy and running a special flash utility from there. Can you send
> >me the hdparm -I /dev/sdX output of the problem drive? I think we should
> >just blacklist it for FUA. This bug is so obscure I think it's a better
> >solution than adding a FUA disable module parameter at this point.
> >
> > 
> >
> I am not sure that drive vendors support firmware upgrades - the 
> downside is that you can produce a nice paper weight if the firmware 
> upgrade fails ;-)

Yeah, hence most of them don't put it online. It would be nice to have,
though...

> Also, there are specific versions where I know that you cannot jump from 
> firmware version X to version X + 1. 

That's unfortunate, too.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27  9:50                     ` Jens Axboe
@ 2006-01-27 19:37                       ` Nicolas Mailhot
  2006-01-27 23:54                         ` Nicolas Mailhot
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-27 19:37 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

[-- Attachment #1: Type: text/plain, Size: 527 bytes --]

On Friday, January 27, 2006 at 10:50 +0100, Jens Axboe wrote:

> Of course, silly typo! Thanks for catching that. Updated patch below.
 
...

A patched kernel reboots before it finishes initializing (just before it
prints a line starting with SCSI; the rest is too fast for me to catch).

The kernel base is slightly different from yesterday's, so the bug may
be in the base rather than in the patch. I'll rebuild a kernel with the
same base and yesterday's patch to check this now.

Regards,

-- 
Nicolas Mailhot

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27 19:37                       ` Nicolas Mailhot
@ 2006-01-27 23:54                         ` Nicolas Mailhot
  2006-01-30 15:08                           ` Jens Axboe
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-27 23:54 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

[-- Attachment #1: Type: text/plain, Size: 706 bytes --]

On Friday, January 27, 2006 at 20:37 +0100, Nicolas Mailhot wrote:
> On Friday, January 27, 2006 at 10:50 +0100, Jens Axboe wrote:
> 
> > Of course, silly typo! Thanks for catching that. Updated patch below.
>  
> ...
> 
> a patched kernel reboots before finishing to initialize (Just before it
> prints a line starting with SCSI - the rest is too fast for me to catch)
> 
> Now the kernel base is slightly different from yesterday, so the bug may
> be in the base not the patch. I'll rebuild a new kernel with the same
> base and yesterday's patch to check this now

I can confirm today's patch is not OK. The same baseline with
yesterday's patch boots fine.

-- 
Nicolas Mailhot

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-27 23:54                         ` Nicolas Mailhot
@ 2006-01-30 15:08                           ` Jens Axboe
  2006-01-30 23:33                             ` Nicolas Mailhot
  0 siblings, 1 reply; 33+ messages in thread
From: Jens Axboe @ 2006-01-30 15:08 UTC (permalink / raw)
  To: Nicolas Mailhot
  Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

On Sat, Jan 28 2006, Nicolas Mailhot wrote:
> On Friday, January 27, 2006 at 20:37 +0100, Nicolas Mailhot wrote:
> > On Friday, January 27, 2006 at 10:50 +0100, Jens Axboe wrote:
> > 
> > > Of course, silly typo! Thanks for catching that. Updated patch below.
> >  
> > ...
> > 
> > a patched kernel reboots before finishing to initialize (Just before it
> > prints a line starting with SCSI - the rest is too fast for me to catch)
> > 
> > Now the kernel base is slightly different from yesterday, so the bug may
> > be in the base not the patch. I'll rebuild a new kernel with the same
> > base and yesterday's patch to check this now
> 
> I can confirm today's patch is not OK. The same baseline with
> yesterday's patch boot fine.

Is this any better?

diff --git a/drivers/scsi/libata-scsi.c b/drivers/scsi/libata-scsi.c
index cfbceb5..07b1e7c 100644
--- a/drivers/scsi/libata-scsi.c
+++ b/drivers/scsi/libata-scsi.c
@@ -1700,6 +1700,31 @@ static unsigned int ata_msense_rw_recove
 	return sizeof(def_rw_recovery_mpage);
 }
 
+/*
+ * We can turn this into a real blacklist if it's needed, for now just
+ * blacklist any Maxtor BANC1G10 revision firmware
+ */
+static int ata_dev_supports_fua(u16 *id)
+{
+	unsigned char model[41], fw[9];
+
+	if (!ata_id_has_fua(id))
+		return 0;
+
+	model[40] = '\0';
+	fw[8] = '\0';
+
+	ata_dev_id_string(id, model, ATA_ID_PROD_OFS, sizeof(model) - 1);
+	ata_dev_id_string(id, fw, ATA_ID_FW_REV_OFS, sizeof(fw) - 1);
+
+	if (strncmp(model, "Maxtor", 6))
+		return 1;
+	if (strncmp(fw, "BANC1G10", 8))
+		return 1;
+
+	return 0; /* blacklisted */
+}
+
 /**
  *	ata_scsiop_mode_sense - Simulate MODE SENSE 6, 10 commands
  *	@args: device IDENTIFY data / SCSI command of interest.
@@ -1797,7 +1822,7 @@ unsigned int ata_scsiop_mode_sense(struc
 		return 0;
 
 	dpofua = 0;
-	if (ata_id_has_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
+	if (ata_dev_supports_fua(args->id) && dev->flags & ATA_DFLAG_LBA48 &&
 	    (!(dev->flags & ATA_DFLAG_PIO) || dev->multi_count))
 		dpofua = 1 << 4;
 

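A note for readers comparing this version with the previous one: the visible difference is that the buffers are terminated up front and the lengths passed to ata_dev_id_string() are now even (40 and 8 rather than 41 and 9). One plausible reading of the earlier boot failure, assuming the helper copies two bytes per IDENTIFY word and decrements an unsigned length by two, is that an odd length never reaches zero and the copy runs off the end of the buffer. The user-space sketch below illustrates that assumption. id_string(), set_id_string() and fw_is_blacklisted() are hypothetical stand-ins, not the libata functions; the word offsets 23 (firmware revision) and 27 (model) come from the ATA IDENTIFY DEVICE layout.

```c
#include <stdint.h>
#include <string.h>

/* IDENTIFY DEVICE word offsets: firmware revision at 23, model at 27. */
#define ID_FW_REV_OFS 23
#define ID_PROD_OFS   27

/*
 * Hypothetical stand-in for the kernel helper: copies `len` bytes of an
 * IDENTIFY string, two bytes per 16-bit word, high byte first.
 * Rejects odd lengths instead of looping on them.
 */
static int id_string(const uint16_t *id, unsigned char *s,
                     unsigned int ofs, unsigned int len)
{
    if (len & 1)
        return -1;      /* an unsigned "len -= 2" would wrap past zero */

    while (len > 0) {
        *s++ = id[ofs] >> 8;
        *s++ = id[ofs] & 0xff;
        ofs++;
        len -= 2;
    }
    return 0;
}

/* Illustration helper: packs a string into IDENTIFY words, space padded. */
static void set_id_string(uint16_t *id, unsigned int ofs,
                          unsigned int len, const char *str)
{
    size_t n = strlen(str);

    for (unsigned int i = 0; i < len; i += 2) {
        unsigned char hi = i < n ? str[i] : ' ';
        unsigned char lo = i + 1 < n ? str[i + 1] : ' ';
        id[ofs + i / 2] = (uint16_t)((hi << 8) | lo);
    }
}

/* Returns 1 if the model starts with "Maxtor" and firmware is "BANC1G10". */
static int fw_is_blacklisted(const uint16_t *id)
{
    unsigned char model[41], fw[9];

    /* Terminate first; the copies below fill only 40 and 8 bytes. */
    model[40] = '\0';
    fw[8] = '\0';

    if (id_string(id, model, ID_PROD_OFS, sizeof(model) - 1) ||
        id_string(id, fw, ID_FW_REV_OFS, sizeof(fw) - 1))
        return 0;

    return !strncmp((char *)model, "Maxtor", 6) &&
           !strncmp((char *)fw, "BANC1G10", 8);
}
```

Note that the firmware comparison uses `fw`, not `model`, which is the correction Bartlomiej pointed out earlier in the thread.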

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-30 15:08                           ` Jens Axboe
@ 2006-01-30 23:33                             ` Nicolas Mailhot
  2006-01-31  7:26                               ` Jens Axboe
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-30 23:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

[-- Attachment #1: Type: text/plain, Size: 307 bytes --]

On Monday, January 30, 2006 at 16:08 +0100, Jens Axboe wrote:
> On Sat, Jan 28 2006, Nicolas Mailhot wrote:
> > I can confirm today's patch is not OK. The same baseline with
> > yesterday's patch boot fine.
> 
> Is this any better?

This one seems to work fine.

Regards,

-- 
Nicolas Mailhot

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-30 23:33                             ` Nicolas Mailhot
@ 2006-01-31  7:26                               ` Jens Axboe
  2006-01-31  8:39                                 ` Nicolas Mailhot
  0 siblings, 1 reply; 33+ messages in thread
From: Jens Axboe @ 2006-01-31  7:26 UTC (permalink / raw)
  To: Nicolas Mailhot
  Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

On Tue, Jan 31 2006, Nicolas Mailhot wrote:
> On Monday, January 30, 2006 at 16:08 +0100, Jens Axboe wrote:
> > On Sat, Jan 28 2006, Nicolas Mailhot wrote:
> > > I can confirm today's patch is not OK. The same baseline with
> > > yesterday's patch boot fine.
> > 
> > Is this any better?
> 
> This one seems to work fine.

And you don't get "w/ FUA" messages from the problematic drives - and
your data appears safe? Just checking, we cannot take these corruption
things lightly.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-31  7:26                               ` Jens Axboe
@ 2006-01-31  8:39                                 ` Nicolas Mailhot
  2006-01-31  8:47                                   ` Jens Axboe
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-31  8:39 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

On Tue, January 31, 2006 08:26, Jens Axboe wrote:
> On Tue, Jan 31 2006, Nicolas Mailhot wrote:
>> On Monday, January 30, 2006 at 16:08 +0100, Jens Axboe wrote:
>> > On Sat, Jan 28 2006, Nicolas Mailhot wrote:
>> > > I can confirm today's patch is not OK. The same baseline with
>> > > yesterday's patch boot fine.
>> >
>> > Is this any better?
>>
>> This one seems to work fine.
>
> And you don't get "w/ FUA" messages from the problematic drives - and
> your data appears safe? Just checking, we cannot take these corruption
> things lightly.

I didn't spend a lot of time on this; the build finished rather late in
the evening. What I can say is that the dramatic breakage I had before is
gone, and I don't think there was any error in dmesg (I will post it this
evening if you want). With dm+raid, when FUA broke things it was difficult
to miss (screenfuls of ATA/raid errors, fs corruption on reboot, etc.).

If you are asking whether I did heavy I/O to stress the system: no, not
yet. If problems still lurk, they are far less extensive than they were
before. I did trigger a full FS autorelabel, so at least the read path was
tested a bit.

Regards,

-- 
Nicolas Mailhot

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-31  8:39                                 ` Nicolas Mailhot
@ 2006-01-31  8:47                                   ` Jens Axboe
  2006-01-31 22:54                                     ` Nicolas Mailhot
  0 siblings, 1 reply; 33+ messages in thread
From: Jens Axboe @ 2006-01-31  8:47 UTC (permalink / raw)
  To: Nicolas Mailhot
  Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

On Tue, Jan 31 2006, Nicolas Mailhot wrote:
> 
> On Tue, January 31, 2006 08:26, Jens Axboe wrote:
> > On Tue, Jan 31 2006, Nicolas Mailhot wrote:
> >> On Monday, January 30, 2006 at 16:08 +0100, Jens Axboe wrote:
> >> > On Sat, Jan 28 2006, Nicolas Mailhot wrote:
> >> > > I can confirm today's patch is not OK. The same baseline with
> >> > > yesterday's patch boot fine.
> >> >
> >> > Is this any better?
> >>
> >> This one seems to work fine.
> >
> > And you don't get "w/ FUA" messages from the problematic drives - and
> > your data appears safe? Just checking, we cannot take these corruption
> > things lightly.
> 
> I didn't spend a lot of time on this, the build finished rather late
> in the evening/night. What I can say is the dramatic breakage I had
> before is gone and I don't think there was any error in dmesg (will
> post it this evening if you want). With dm+raid when FUA broke things
> it was difficult to miss (screenfulls of ATA/raid errors, fs
> corruption on reboot, etc)
> 
> Now if you ask me if I did some heavy I/O to stress the system no I
> didn't yet. If problems still lurk they are a lot less extensive than
> they were before. I did trigger a full FS autorelabel so at least the
> read part was tested a bit.

Sounds like it works, if you saw the errors so quickly. Just trying to
be absolutely sure, if you could check for the "w/ FUA" prints not being
there now it would confirm that the blacklist does its job.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: regarding bug #5914 - fs corruption on SATA
  2006-01-31  8:47                                   ` Jens Axboe
@ 2006-01-31 22:54                                     ` Nicolas Mailhot
  0 siblings, 0 replies; 33+ messages in thread
From: Nicolas Mailhot @ 2006-01-31 22:54 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Bartlomiej Zolnierkiewicz, Tejun Heo, Jeff Garzik, Linux-ide

Jens Axboe wrote:

> Sounds like it works, if you saw the errors so quickly.

Well, while I was testing broken kernels the challenge was to keep a finger
on the reset button at boot and act before too much damage was done, rather
than to wait for hard-to-spot symptoms. Root on md1+lvm seems to be very
good at flushing out FUA problems.

> Just trying to
> be absolutely sure, if you could check for the "w/ FUA" prints not being
> there now it would confirm that the blacklist does its job.

The patched kernel ran for a day without hiccups. I've attached its
dmesg to the Red Hat bug so you can check whether it looks OK to you (yes,
I've rebooted since, and no, it was not an ATA problem, just your
run-of-the-mill rawhide xorg freeze).

https://bugzilla.redhat.com/bugzilla/attachment.cgi?id=123941

Regards,

-- 
Nicolas Mailhot

^ permalink raw reply	[flat|nested] 33+ messages in thread

* SMART on SATA reporting errors? (was Re: regarding bug #5914 - fs corruption on SATA)
  2006-01-26 17:15     ` David Greaves
@ 2006-02-07 18:35       ` David Greaves
  2006-02-07 19:30         ` Jeff Garzik
  0 siblings, 1 reply; 33+ messages in thread
From: David Greaves @ 2006-02-07 18:35 UTC (permalink / raw)
  To: Linux-ide
  Cc: Jeff Garzik, Tejun Heo, Nicolas.Mailhot, Jens Axboe,
	Christopher Smith, Erik Slagter, hahn, mlaks, Soeren Sonnenburg,
	mlaks, smartmontools-support

This is a follow-on to the email below.

Basically, it seems some SMART commands produce unexpected errors.

My Debian smartd config has "-o on" and "-S on" for every drive so it
puts out lots of errors every time I boot.

I did a little investigation and I see that when I do:
# smartctl -o on -data /dev/sdb
smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Error SMART Disable Automatic Offline failed: Input/output error
Smartctl: SMART Disable Automatic Offline Failed.

(Which is fine if the drive doesn't support it.)

I unexpectedly get this in dmesg:

ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata2: no sense translation for status: 0x51
ata2: translated ATA stat/err 0x51/00 to SCSI SK/ASC/ASCQ 0x3/11/04
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata2: status=0x51 { DriveReady SeekComplete Error }
ata2: error=0x04 { DriveStatusError }
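
The status and error bytes in these lines can be decoded by hand. The sketch below, in Python, maps 0x51 and 0x04 onto the register bit mnemonics as commonly listed in the ATA specifications. The helper name decode is made up for illustration; the labels the kernel prints here (DriveReady, SeekComplete, Error, DriveStatusError) correspond to DRDY, DSC, ERR and ABRT.

```python
# ATA status register bits, highest bit first.
STATUS_BITS = [
    (0x80, "BSY"), (0x40, "DRDY"), (0x20, "DF"), (0x10, "DSC"),
    (0x08, "DRQ"), (0x04, "CORR"), (0x02, "IDX"), (0x01, "ERR"),
]

# ATA error register bits, highest bit first.
ERROR_BITS = [
    (0x80, "ICRC"), (0x40, "UNC"), (0x20, "MC"), (0x10, "IDNF"),
    (0x08, "MCR"), (0x04, "ABRT"), (0x02, "NM"), (0x01, "AMNF"),
]

def decode(value, table):
    """Return the mnemonics of every bit in `table` that is set in `value`."""
    return [name for mask, name in table if value & mask]

print(decode(0x51, STATUS_BITS))  # bits 0x40, 0x10 and 0x01 are set
print(decode(0x04, ERROR_BITS))   # only bit 0x04 is set
```

An error byte of 0x04 is ABRT, i.e. the drive aborted the command. That is consistent with the translated sense key 0xb, which is ABORTED COMMAND in SCSI, and with a drive rejecting a command it does not support.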


If I try with sda, it fails the first time:
# smartctl -o off -data /dev/sda
smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
Error SMART Disable Automatic Offline failed: Input/output error
Smartctl: SMART Disable Automatic Offline Failed.

and I get:
ata1: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }
ata1: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }
ata1: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata1: status=0x51 { DriveReady SeekComplete Error }
ata1: error=0x04 { DriveStatusError }

thereafter it works:
# smartctl -s on -data /dev/sda
smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Enabled.

# smartctl -s off -data /dev/sda
smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Disabled. Use option -s with argument 'on' to enable it.

(no dmesg output this time)

If I try this on sdc, it succeeds *and* I get error messages:
# smartctl -S off -data /dev/sdc
smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF ENABLE/DISABLE COMMANDS SECTION ===
SMART Disabled. Use option -s with argument 'on' to enable it.

I still get this in dmesg:

ata3: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x04 { DriveStatusError }
ata3: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x04 { DriveStatusError }
ata3: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
ata3: status=0x51 { DriveReady SeekComplete Error }
ata3: error=0x04 { DriveStatusError }



Some more boot time dmesg info
Linux version 2.6.15 (root@haze) (gcc version 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)) #4 PREEMPT Tue Jan 24 08:30:31 UTC 2006
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009d800 (usable)
 BIOS-e820: 000000000009d800 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000003fffb000 (usable)
 BIOS-e820: 000000003fffb000 - 000000003ffff000 (ACPI data)
 BIOS-e820: 000000003ffff000 - 0000000040000000 (ACPI NVS)
 BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
 BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
127MB HIGHMEM available.
896MB LOWMEM available.
On node 0 totalpages: 262139
  DMA zone: 4096 pages, LIFO batch:0
  DMA32 zone: 0 pages, LIFO batch:0
  Normal zone: 225280 pages, LIFO batch:31
  HighMem zone: 32763 pages, LIFO batch:7
DMI 2.3 present.
ACPI: RSDP (v000 ASUS                                  ) @ 0x000f5e30
ACPI: RSDT (v001 ASUS   A7V600-X 0x42302e31 MSFT 0x31313031) @ 0x3fffb000
ACPI: FADT (v001 ASUS   A7V600-X 0x42302e31 MSFT 0x31313031) @ 0x3fffb0b2
ACPI: BOOT (v001 ASUS   A7V600-X 0x42302e31 MSFT 0x31313031) @ 0x3fffb030
ACPI: MADT (v001 ASUS   A7V600-X 0x42302e31 MSFT 0x31313031) @ 0x3fffb058
ACPI: DSDT (v001   ASUS A7V600-X 0x00001000 MSFT 0x0100000b) @ 0x00000000
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:10 APIC version 16
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 50000000 (gap: 40000000:bec00000)
Built 1 zonelists
Kernel command line: root=/dev/md0 ro
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 2125.801 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 1035636k/1048556k available (2326k kernel code, 12328k reserved, 576k data, 176k init, 131052k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 4258.01 BogoMIPS (lpj=8516036)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383fbff c1c3fbff 00000000 00000000 00000000 00000000 00000000
CPU: After vendor identify, caps: 0383fbff c1c3fbff 00000000 00000000 00000000 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU: After all inits, caps: 0383fbff c1c3fbff 00000000 00000020 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
mtrr: v2.0 (20020519)
CPU: AMD Athlon(TM) XP 3000+ stepping 00
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xf1970, last bus=1
PCI: Using configuration type 1
ACPI: Subsystem revision 20050902
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 12) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs *3 4 5 6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKF] (IRQs *3 4 5 6 7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 *7 9 10 11 12)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 11 12) *15, disabled.
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
Boot video device is 0000:01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
SCSI subsystem initialized
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
PCI: Bridge: 0000:00:01.0
  IO window: d000-dfff
  MEM window: be800000-bfefffff
  PREFETCH window: c0000000-f7ffffff
PCI: Setting latency timer of device 0000:00:01.0 to 64
Simple Boot Flag at 0x3a set to 0x80
Machine check exception polling timer started.
highmem bounce pool size: 64 pages
SGI XFS with no debug enabled
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
PCI: Bypassing VIA 8237 APIC De-Assert Message
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 16
PCI: Via IRQ fixup for 0000:00:0f.1, from 14 to 0
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
    ide0: BM-DMA at 0x7800-0x7807, BIOS settings: hda:DMA, hdb:DMA
    ide1: BM-DMA at 0x7808-0x780f, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
hda: PLEXTOR DVDR PX-708A, ATAPI CD/DVD-ROM drive
hdb: TSSTcorpCD/DVDW SH-W162C, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hda: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
hdb: ATAPI 48X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
libata version 1.20 loaded.
sata_sil 0000:00:0a.0: version 0.9
ACPI: PCI Interrupt 0000:00:0a.0[A] -> GSI 16 (level, low) -> IRQ 17
ata1: SATA max UDMA/100 cmd 0xF8804080 ctl 0xF880408A bmdma 0xF8804000 irq 17
ata2: SATA max UDMA/100 cmd 0xF88040C0 ctl 0xF88040CA bmdma 0xF8804008 irq 17
ata1: dev 0 cfg 49:2f00 82:7869 83:7d09 84:4043 85:7869 86:3c01 87:4043 88:203f
ata1: dev 0 ATA-7, max UDMA/100, 390721968 sectors: LBA48
ata1: dev 0 configured for UDMA/100
scsi0 : sata_sil
ata2: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4063 85:7c69 86:3e01 87:4063 88:007f
ata2: dev 0 ATA-7, max UDMA/133, 398297088 sectors: LBA48
ata2: dev 0 configured for UDMA/100
scsi1 : sata_sil
  Vendor: ATA       Model: Maxtor 6B200M0    Rev: BANC
  Type:   Direct-Access                      ANSI SCSI revision: 05
  Vendor: ATA       Model: Maxtor 6B200M0    Rev: BANC
  Type:   Direct-Access                      ANSI SCSI revision: 05
sata_via 0000:00:0f.0: version 1.1
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 16
sata_via 0000:00:0f.0: routed to hard irq line 0
ata3: SATA max UDMA/133 cmd 0x9800 ctl 0x9402 bmdma 0x8400 irq 16
ata4: SATA max UDMA/133 cmd 0x9000 ctl 0x8802 bmdma 0x8408 irq 16
ata3: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4003 85:3469 86:3c01 87:4003 88:407f
ata3: dev 0 ATA-6, max UDMA/133, 312581808 sectors: LBA48
ata3: dev 0 configured for UDMA/133
scsi2 : sata_via
ata4: dev 0 cfg 49:2f00 82:7c6b 83:7f09 84:4063 85:7c69 86:3e01 87:4063 88:407f
ata4: dev 0 ATA-7, max UDMA/133, 398297088 sectors: LBA48
ata4: dev 0 configured for UDMA/133
scsi3 : sata_via
  Vendor: ATA       Model: ST3160023AS       Rev: 3.18
  Type:   Direct-Access                      ANSI SCSI revision: 05
  Vendor: ATA       Model: Maxtor 6B200M0    Rev: BANC
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
SCSI device sda: drive cache: write back
 sda: sda1
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 398297088 512-byte hdwr sectors (203928 MB)
SCSI device sdb: drive cache: write back
SCSI device sdb: 398297088 512-byte hdwr sectors (203928 MB)
SCSI device sdb: drive cache: write back
 sdb: sdb1 sdb2
sd 1:0:0:0: Attached scsi disk sdb
SCSI device sdc: 312581808 512-byte hdwr sectors (160042 MB)
SCSI device sdc: drive cache: write back
SCSI device sdc: 312581808 512-byte hdwr sectors (160042 MB)
SCSI device sdc: drive cache: write back
 sdc: sdc1 sdc2 sdc3 sdc4
sd 2:0:0:0: Attached scsi disk sdc
SCSI device sdd: 398297088 512-byte hdwr sectors (203928 MB)
SCSI device sdd: drive cache: write back
SCSI device sdd: 398297088 512-byte hdwr sectors (203928 MB)
SCSI device sdd: drive cache: write back
 sdd: sdd1 sdd2
sd 3:0:0:0: Attached scsi disk sdd
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 1:0:0:0: Attached scsi generic sg1 type 0
sd 2:0:0:0: Attached scsi generic sg2 type 0
sd 3:0:0:0: Attached scsi generic sg3 type 0
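
Since this thread started with FUA, the "cfg" lines above can be read for what each drive advertises. The following is a minimal sketch in Python, assuming the ATA-7 IDENTIFY DEVICE layout in which word 84 is valid when bit 14 is set and bit 15 is clear, and bit 6 of that word advertises the FUA write commands. The function name has_fua and the dictionary are illustrative; the word 84 values are copied from the "84:" fields in the dmesg output above.

```python
def has_fua(word84):
    """Report whether IDENTIFY word 84 advertises FUA write support."""
    # The word is only meaningful when bit 14 is set and bit 15 is clear.
    if (word84 & 0xC000) != 0x4000:
        return False
    # Bit 6 advertises the FUA write commands.
    return bool(word84 & (1 << 6))

# Word 84 values taken from the "cfg" lines in the dmesg output above.
word84_by_port = {"ata1": 0x4043, "ata2": 0x4063, "ata3": 0x4003, "ata4": 0x4063}

for port, word in word84_by_port.items():
    print(port, has_fua(word))
```

This only shows what the drives claim in their identify data. Whether a given kernel acts on that claim, and whether the drive then handles the command properly, is a separate question.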

David

David Greaves wrote:

>Jeff Garzik wrote:
>  
>
>>David Greaves wrote:
>>    
>>
>>>  Possible libata/sata/Asus problem (was Re: Need to upgrade to latest
>>>stable mdadm version?)
>>>      
>>>
>>Highly likely to be a motherboard/BIOS issue related to properly
>>tuning and timing the hardware.
>>
>>HOWEVER, libata can help (via Tejun's recent patches) by properly
>>handling the error when thrown to us by hardware.
>>    
>>
>OK - I thought my messages:
>
>Jan 20 06:25:04 haze kernel: ata2: status=0x51 { DriveReady SeekComplete
>Error }
>Jan 20 06:25:04 haze kernel: ata2: error=0x04 { DriveStatusError }
>Jan 20 06:25:10 haze kernel: ata2: no sense translation for status: 0x51
>Jan 20 06:25:10 haze kernel: ata2: status=0x51 { DriveReady SeekComplete
>Error }
>Jan 20 06:25:18 haze kernel: ata2: no sense translation for status: 0x51
>Jan 20 06:25:18 haze kernel: ata2: status=0x51 { DriveReady SeekComplete
>Error }
>Jan 20 06:25:18 haze kernel: ata2: no sense translation for status: 0x51
>Jan 20 06:25:18 haze kernel: ata2: status=0x51 { DriveReady SeekComplete
>Error }
>Jan 20 06:25:20 haze kernel: ata2: no sense translation for status: 0x51
>Jan 20 06:25:20 haze kernel: ata2: status=0x51 { DriveReady SeekComplete
>Error }
>Jan 20 06:25:22 haze kernel: ata2: no sense translation for status: 0x51
>Jan 20 06:25:22 haze kernel: ata2: status=0x51 { DriveReady SeekComplete
>Error }
>Jan 20 06:25:52 haze kernel: ata2: no sense translation for status: 0x51
>Jan 20 06:25:52 haze kernel: ata2: status=0x51 { DriveReady SeekComplete
>Error }
>Jan 20 06:25:52 haze kernel: sd 1:0:0:0: SCSI error: return code = 0x8000002
>Jan 20 06:25:52 haze kernel: sdb: Current: sense key: Medium Error
>Jan 20 06:25:52 haze kernel:     Additional sense: Unrecovered read
>error - auto reallocate failed
>Jan 20 06:25:52 haze kernel: end_request: I/O error, dev sdb, sector
>390787713
>
>bore a certain similarity to those in Tejun/Nicolas' mail:
>
>Different problem? as irq might ask: "does anybody care?" :)
>
>(and yes badblocks and SMART reports all is well)
>  
>


-- 



* Re: SMART on SATA reporting errors? (was Re: regarding bug #5914 - fs corruption on SATA)
  2006-02-07 18:35       ` SMART on SATA reporting errors? (was Re: regarding bug #5914 - fs corruption on SATA) David Greaves
@ 2006-02-07 19:30         ` Jeff Garzik
  2006-02-08  7:21           ` David Greaves
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff Garzik @ 2006-02-07 19:30 UTC (permalink / raw)
  To: David Greaves
  Cc: Linux-ide, Tejun Heo, Nicolas.Mailhot, Jens Axboe,
	Christopher Smith, Erik Slagter, hahn, mlaks, Soeren Sonnenburg,
	mlaks, smartmontools-support

David Greaves wrote:
> This is a follow-on to the email below.
> 
> Basically, it seems some SMART commands produce unexpected errors.
> 
> My Debian smartd config has "-o on" and "-S on" for every drive so it
> puts out lots of errors every time I boot.
> 
> I did a little investigation and I see that when I do:
> # smartctl -o on -data /dev/sdb
> smartctl version 5.34 [i686-pc-linux-gnu] Copyright (C) 2002-5 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> === START OF ENABLE/DISABLE COMMANDS SECTION ===
> Error SMART Disable Automatic Offline failed: Input/output error
> Smartctl: SMART Disable Automatic Offline Failed.
> 
> (Which is fine if the drive doesn't support it.)
> 
> I unexpectedly get this in dmesg:
> 
> ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> ata2: status=0x51 { DriveReady SeekComplete Error }
> ata2: error=0x04 { DriveStatusError }
> ata2: no sense translation for status: 0x51
> ata2: translated ATA stat/err 0x51/00 to SCSI SK/ASC/ASCQ 0x3/11/04
> ata2: status=0x51 { DriveReady SeekComplete Error }
> ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> ata2: status=0x51 { DriveReady SeekComplete Error }
> ata2: error=0x04 { DriveStatusError }
> ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> ata2: status=0x51 { DriveReady SeekComplete Error }
> ata2: error=0x04 { DriveStatusError }
> ata2: translated ATA stat/err 0x51/04 to SCSI SK/ASC/ASCQ 0xb/00/00
> ata2: status=0x51 { DriveReady SeekComplete Error }
> ata2: error=0x04 { DriveStatusError }


All of your commands are missing "-d ata"

	Jeff




* Re: SMART on SATA reporting errors? (was Re: regarding bug #5914 - fs corruption on SATA)
  2006-02-07 19:30         ` Jeff Garzik
@ 2006-02-08  7:21           ` David Greaves
  0 siblings, 0 replies; 33+ messages in thread
From: David Greaves @ 2006-02-08  7:21 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linux-ide, Tejun Heo, Nicolas.Mailhot, Jens Axboe,
	Christopher Smith, Erik Slagter, hahn, mlaks, Soeren Sonnenburg,
	mlaks, smartmontools-support

Jeff Garzik wrote:

> David Greaves wrote:
>
>> I did a little investigation and I see that when I do:
>> # smartctl -o on -data /dev/sdb
>
<snip>

> All of your commands are missing "-d ata"

Well, technically yes, but I used -data in all of them. Is the space or the option order important?


David Greaves wrote:
# smartctl -o on -data /dev/sdb
# smartctl -o off -data /dev/sda
# smartctl -s on -data /dev/sda
# smartctl -s off -data /dev/sda
# smartctl -S off -data /dev/sdc
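
On the space question, here is a minimal sketch in Python of how getopt-style parsing treats the two spellings. It assumes smartctl follows the usual getopt convention for short options that take an argument; the option string below is a hypothetical stand-in for illustration, not smartctl's real one.

```python
import getopt

# Hypothetical option string: -o, -d, -s and -S each take an argument.
OPTSTRING = "o:d:s:S:"

def parse(argv):
    """Split argv into (option, value) pairs and leftover arguments."""
    opts, rest = getopt.getopt(argv, OPTSTRING)
    return opts, rest

# "-data" is read as option -d with the attached argument "ata".
attached = parse(["-o", "on", "-data", "/dev/sdb"])
spaced = parse(["-o", "on", "-d", "ata", "/dev/sdb"])
print(attached)
print(spaced)
print(attached == spaced)
```

Under that convention the space is optional, so "-data" and "-d ata" should select the same device type.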

David


-- 



end of thread, other threads:[~2006-02-08  7:21 UTC | newest]

Thread overview: 33+ messages
2006-01-26  5:50 regarding bug #5914 - fs corruption on SATA Tejun Heo
2006-01-26  5:51 ` Tejun Heo
2006-01-26  9:14   ` Nicolas Mailhot
2006-01-26  9:21     ` Jens Axboe
2006-01-26 10:01       ` Nicolas Mailhot
     [not found]       ` <5840.192.54.193.25.1138269692.squirrel@rousalka.dyndns.org>
2006-01-26 21:04         ` Nicolas Mailhot
2006-01-27  8:13           ` Jens Axboe
2006-01-27  8:53             ` Nicolas Mailhot
2006-01-27  9:10               ` Jens Axboe
2006-01-27  9:20                 ` Jens Axboe
2006-01-27  9:27                   ` Nicolas Mailhot
2006-01-27  9:46                   ` Bartlomiej Zolnierkiewicz
2006-01-27  9:50                     ` Jens Axboe
2006-01-27 19:37                       ` Nicolas Mailhot
2006-01-27 23:54                         ` Nicolas Mailhot
2006-01-30 15:08                           ` Jens Axboe
2006-01-30 23:33                             ` Nicolas Mailhot
2006-01-31  7:26                               ` Jens Axboe
2006-01-31  8:39                                 ` Nicolas Mailhot
2006-01-31  8:47                                   ` Jens Axboe
2006-01-31 22:54                                     ` Nicolas Mailhot
2006-01-27 12:12             ` Ric Wheeler
2006-01-27 12:23               ` Jens Axboe
2006-01-26  9:18 ` Jens Axboe
2006-01-26 14:11   ` Bartlomiej Zolnierkiewicz
2006-01-26 14:27     ` Jens Axboe
2006-01-26 16:41 ` David Greaves
2006-01-26 16:58   ` Jeff Garzik
2006-01-26 17:15     ` David Greaves
2006-02-07 18:35       ` SMART on SATA reporting errors? (was Re: regarding bug #5914 - fs corruption on SATA) David Greaves
2006-02-07 19:30         ` Jeff Garzik
2006-02-08  7:21           ` David Greaves
2006-01-26 17:20     ` regarding bug #5914 - fs corruption on SATA Soeren Sonnenburg
