[git patches] libata fixes

linux-ide.vger.kernel.org archive mirror

* [git patches] libata fixes
@ 2006-11-14 15:04 Jeff Garzik
  2006-11-14 16:32 ` Mark Lord
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Jeff Garzik @ 2006-11-14 15:04 UTC
  To: Andrew Morton, Linus Torvalds; +Cc: linux-ide, LKML


Please pull from 'upstream-linus' branch of
master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev.git upstream-linus

to receive the following updates:

 drivers/ata/libata-scsi.c |    2 +-
 drivers/ata/pata_artop.c  |    2 +-
 drivers/ata/pata_hpt37x.c |   19 ++++++++++++++++---
 3 files changed, 18 insertions(+), 5 deletions(-)

Alan Cox:
      hpt37x: Check the enablebits

Alexey Dobriyan:
      pata_artop: fix "& (1 >>" typo

Darrick J. Wong:
      libata: fix double-completion on error

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7af2a4b..5c1fc46 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1612,9 +1612,9 @@ early_finish:
 
 err_did:
 	ata_qc_free(qc);
-err_mem:
 	cmd->result = (DID_ERROR << 16);
 	done(cmd);
+err_mem:
 	DPRINTK("EXIT - internal\n");
 	return 0;
 
diff --git a/drivers/ata/pata_artop.c b/drivers/ata/pata_artop.c
index 690828e..96a0980 100644
--- a/drivers/ata/pata_artop.c
+++ b/drivers/ata/pata_artop.c
@@ -92,7 +92,7 @@ static int artop6260_pre_reset(struct at
 		return -ENOENT;
 
 	pci_read_config_byte(pdev, 0x49, &tmp);
-	if (tmp & (1 >> ap->port_no))
+	if (tmp & (1 << ap->port_no))
 		ap->cbl = ATA_CBL_PATA40;
 	else
 		ap->cbl = ATA_CBL_PATA80;
diff --git a/drivers/ata/pata_hpt37x.c b/drivers/ata/pata_hpt37x.c
index 7350443..fce3fcd 100644
--- a/drivers/ata/pata_hpt37x.c
+++ b/drivers/ata/pata_hpt37x.c
@@ -25,7 +25,7 @@ #include <scsi/scsi_host.h>
 #include <linux/libata.h>
 
 #define DRV_NAME	"pata_hpt37x"
-#define DRV_VERSION	"0.5"
+#define DRV_VERSION	"0.5.1"
 
 struct hpt_clock {
 	u8	xfer_speed;
@@ -453,7 +453,13 @@ static int hpt37x_pre_reset(struct ata_p
 {
 	u8 scr2, ata66;
 	struct pci_dev *pdev = to_pci_dev(ap->host->dev);
-
+	static const struct pci_bits hpt37x_enable_bits[] = {
+		{ 0x50, 1, 0x04, 0x04 },
+		{ 0x54, 1, 0x04, 0x04 }
+	};
+	if (!pci_test_config_bits(pdev, &hpt37x_enable_bits[ap->port_no]))
+		return -ENOENT;
+		
 	pci_read_config_byte(pdev, 0x5B, &scr2);
 	pci_write_config_byte(pdev, 0x5B, scr2 & ~0x01);
 	/* Cable register now active */
@@ -488,10 +494,17 @@ static void hpt37x_error_handler(struct 
 
 static int hpt374_pre_reset(struct ata_port *ap)
 {
+	static const struct pci_bits hpt37x_enable_bits[] = {
+		{ 0x50, 1, 0x04, 0x04 },
+		{ 0x54, 1, 0x04, 0x04 }
+	};
 	u16 mcr3, mcr6;
 	u8 ata66;
-
 	struct pci_dev *pdev = to_pci_dev(ap->host->dev);
+
+	if (!pci_test_config_bits(pdev, &hpt37x_enable_bits[ap->port_no]))
+		return -ENOENT;
+		
 	/* Do the extra channel work */
 	pci_read_config_word(pdev, 0x52, &mcr3);
 	pci_read_config_word(pdev, 0x56, &mcr6);


* Re: [git patches] libata fixes
  2006-11-14 15:04 [git patches] libata fixes Jeff Garzik
@ 2006-11-14 16:32 ` Mark Lord
  2006-11-14 16:41   ` Jeff Garzik
  2006-11-28 16:56 ` Scary Intel SATA errors Linus Torvalds
  2006-11-28 17:31 ` Scary Intel SATA problem: "frozen" Linus Torvalds
  2 siblings, 1 reply; 19+ messages in thread
From: Mark Lord @ 2006-11-14 16:32 UTC
  To: Jeff Garzik; +Cc: Andrew Morton, Linus Torvalds, linux-ide, LKML

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7af2a4b..5c1fc46 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1612,9 +1612,9 @@ early_finish:
 
 err_did:
 	ata_qc_free(qc);
-err_mem:
 	cmd->result = (DID_ERROR << 16);
 	done(cmd);
+err_mem:
 	DPRINTK("EXIT - internal\n");
 	return 0;

This doesn't look correct to me, but I did miss out on the original discussion(?).

Any time we return 0 from queuecommand, the SCSI mid-layer expects us
to also take care of invoking the done() function.  Where does this now
happen for this case (err_mem) ???

Cheers


* Re: [git patches] libata fixes
  2006-11-14 16:32 ` Mark Lord
@ 2006-11-14 16:41   ` Jeff Garzik
  2006-11-14 18:11     ` Mark Lord
  0 siblings, 1 reply; 19+ messages in thread
From: Jeff Garzik @ 2006-11-14 16:41 UTC
  To: Mark Lord; +Cc: Andrew Morton, Linus Torvalds, linux-ide, LKML

Mark Lord wrote:
> Any time we return 0 from queuecommand, the SCSI mid-layer expects us
> to also take care of invoking the done() function.  Where does this now
> happen for this case (err_mem) ???

It _already_ happened in the error path of ata_scsi_qc_new(), which is 
why this is a double-completion bug fix.

	Jeff




* Re: [git patches] libata fixes
  2006-11-14 16:41   ` Jeff Garzik
@ 2006-11-14 18:11     ` Mark Lord
  0 siblings, 0 replies; 19+ messages in thread
From: Mark Lord @ 2006-11-14 18:11 UTC
  To: Jeff Garzik; +Cc: Andrew Morton, Linus Torvalds, linux-ide, LKML

Jeff Garzik wrote:
> Mark Lord wrote:
>> Any time we return 0 from queuecommand, the SCSI mid-layer expects us
>> to also take care of invoking the done() function.  Where does this now
>> happen for this case (err_mem) ???
> 
> It _already_ happened in the error path of ata_scsi_qc_new(), which is 
> why this is a double-completion bug fix.

Ack. 


* Scary Intel SATA errors..
  2006-11-14 15:04 [git patches] libata fixes Jeff Garzik
  2006-11-14 16:32 ` Mark Lord
@ 2006-11-28 16:56 ` Linus Torvalds
  2006-11-29 18:25   ` Mark Lord
  2006-11-28 17:31 ` Scary Intel SATA problem: "frozen" Linus Torvalds
  2 siblings, 1 reply; 19+ messages in thread
From: Linus Torvalds @ 2006-11-28 16:56 UTC
  To: Jeff Garzik; +Cc: Andrew Morton, linux-ide


Jeff, what does this mean:

	ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
	ata1.00: (BMDMA stat 0x21)
	ata1.00: tag 0 cmd 0xca Emask 0x4 stat 0x40 err 0x0 (timeout)
	ata1: port is slow to respond, please be patient (Status 0xd0)
	ata1: port failed to respond (30 secs, Status 0xd0)
	ata1: soft resetting port
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ata1.00: qc timeout (cmd 0xec)
	ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
	ata1.00: revalidation failed (errno=-5)
	ata1: failed to recover some devices, retrying in 5 secs
	ata1: port is slow to respond, please be patient (Status 0xd0)
	ata1: port failed to respond (30 secs, Status 0xd0)
	ata1: soft resetting port
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ata1.00: qc timeout (cmd 0xec)
	ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
	ata1.00: revalidation failed (errno=-5)
	ata1: failed to recover some devices, retrying in 5 secs
	ata1: port is slow to respond, please be patient (Status 0xd0)
	ata1: port failed to respond (30 secs, Status 0xd0)
	ata1: soft resetting port
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ata1.00: qc timeout (cmd 0xec)
	ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
	ata1.00: revalidation failed (errno=-5)
	ata1.00: disabled
	ata1: EH complete

followed by various IO errors:

	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 335093
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217445
	Buffer I/O error on device dm-0, logical block 112001027
	lost page write due to I/O error on dm-0
	..

nasty, nasty, nasty.

This is with ata_piix on an Intel i965 motherboard (everything Intel).

		Linus


* Scary Intel SATA problem: "frozen"
  2006-11-14 15:04 [git patches] libata fixes Jeff Garzik
  2006-11-14 16:32 ` Mark Lord
  2006-11-28 16:56 ` Scary Intel SATA errors Linus Torvalds
@ 2006-11-28 17:31 ` Linus Torvalds
  2006-11-28 17:37   ` Mark Lord
                     ` (3 more replies)
  2 siblings, 4 replies; 19+ messages in thread
From: Linus Torvalds @ 2006-11-28 17:31 UTC
  To: Jeff Garzik; +Cc: Andrew Morton, linux-ide


[ You may or may not have gotten my previous email. The kernel stayed 
  working, but due to the IO errors the filesystem got re-mounted 
  read-only, and I'm not sure that the email I sent out in that state 
  actually ever made it out. I suspect it didn't. ]

Jeff,
 I just had a scary thing on my nice new Intel i965 box (all Intel 
chipsets apart from some strange Marvell IDE interface that I'm not using 
and that no driver even detected, and a TI firewire thing that I'm 
similarly not using).

The machine basically froze for about a minute or so (well, things worked 
surprisingly well, considering that apparently no disk IO happened - I 
initially thought it was just firefox that had frozen up, since my mail 
session seemed to be fine), and after it came back the filesystem was 
mounted read-only and nothing really worked any more..

I have no idea what status 0xD0 means: it looks like ATA_BUSY + ATA_DRDY + 
"bit#4", but what is bit#4?

And clearly, the soft-reset isn't doing squat.

Ideas?

		Linus

----
Boot-time messages:

	libata version 2.00 loaded.
	..
	Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
	ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
	Probing IDE interface ide0...
	Probing IDE interface ide1...
	ide-floppy driver 0.99.newide
	ata_piix 0000:00:1f.2: version 2.00ac6
	ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
	ACPI: PCI Interrupt 0000:00:1f.2[A] -> GSI 19 (level, low) -> IRQ 19
	PCI: Setting latency timer of device 0000:00:1f.2 to 64
	ata1: SATA max UDMA/133 cmd 0x2148 ctl 0x217E bmdma 0x2110 irq 19
	ata2: SATA max UDMA/133 cmd 0x2140 ctl 0x217A bmdma 0x2118 irq 19
	scsi0 : ata_piix
	ata1.00: ATA-7, max UDMA/133, 976773168 sectors: LBA48 NCQ (depth 0/32)
	ata1.00: ata1: dev 0 multi count 16
	ata1.00: configured for UDMA/133
	scsi1 : ata_piix
	ata2.00: ATAPI, max UDMA/66
	ata2.00: configured for UDMA/66
	scsi 0:0:0:0: Direct-Access     ATA      WDC WD5000YS-01M 07.0 PQ: 0 ANSI: 5
	SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
	sda: Write Protect is off
	sda: Mode Sense: 00 3a 00 00
	SCSI device sda: drive cache: write back
	SCSI device sda: 976773168 512-byte hdwr sectors (500108 MB)
	sda: Write Protect is off
	sda: Mode Sense: 00 3a 00 00
	SCSI device sda: drive cache: write back
	 sda: sda1 sda2
	sd 0:0:0:0: Attached scsi disk sda
	sd 0:0:0:0: Attached scsi generic sg0 type 0
	scsi 1:0:0:0: CD-ROM            PLEXTOR  DVDR   PX-755A   1.04 PQ: 0 ANSI: 5
	sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
	Uniform CD-ROM driver Revision: 3.20
	sr 1:0:0:0: Attached scsi CD-ROM sr0
	sr 1:0:0:0: Attached scsi generic sg1 type 5
	ata_piix 0000:00:1f.5: MAP [ P0 P2 P1 P3 ]
	ACPI: PCI Interrupt 0000:00:1f.5[A] -> GSI 19 (level, low) -> IRQ 19
	PCI: Setting latency timer of device 0000:00:1f.5 to 64
	ata3: SATA max UDMA/133 cmd 0x2138 ctl 0x2176 bmdma 0x20F0 irq 19
	ata4: SATA max UDMA/133 cmd 0x2130 ctl 0x2172 bmdma 0x20F8 irq 19
	scsi2 : ata_piix
	ATA: abnormal status 0x7F on port 0x213F
	scsi3 : ata_piix
	ATA: abnormal status 0x7F on port 0x2137

Problem starts:

	ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
	ata1.00: (BMDMA stat 0x21)
	ata1.00: tag 0 cmd 0xca Emask 0x4 stat 0x40 err 0x0 (timeout)
	ata1: port is slow to respond, please be patient (Status 0xd0)
	ata1: port failed to respond (30 secs, Status 0xd0)
	ata1: soft resetting port
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ata1.00: qc timeout (cmd 0xec)
	ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
	ata1.00: revalidation failed (errno=-5)
	ata1: failed to recover some devices, retrying in 5 secs
	ata1: port is slow to respond, please be patient (Status 0xd0)
	ata1: port failed to respond (30 secs, Status 0xd0)
	ata1: soft resetting port
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ata1.00: qc timeout (cmd 0xec)
	ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
	ata1.00: revalidation failed (errno=-5)
	ata1: failed to recover some devices, retrying in 5 secs
	ata1: port is slow to respond, please be patient (Status 0xd0)
	ata1: port failed to respond (30 secs, Status 0xd0)
	ata1: soft resetting port
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ATA: abnormal status 0xD0 on port 0x214F
	ata1.00: qc timeout (cmd 0xec)
	ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
	ata1.00: revalidation failed (errno=-5)
	ata1.00: disabled
	ata1: EH complete

And then it goes all downhill from there - the machine is toast:

	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 335093
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217445
	Buffer I/O error on device dm-0, logical block 112001027
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221400325
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221400325
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221400325
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 421212525
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52625475, block=52625412
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217485
	Buffer I/O error on device dm-0, logical block 112001032
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217501
	Buffer I/O error on device dm-0, logical block 112001034
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 209229
	Buffer I/O error on device dm-0, logical block 0
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 335117
	Buffer I/O error on device dm-0, logical block 15736
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437255077
	Buffer I/O error on device dm-0, logical block 54630731
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437255085
	Buffer I/O error on device dm-0, logical block 54630732
	lost page write due to I/O error on dm-0
	Buffer I/O error on device dm-0, logical block 54630733
	lost page write due to I/O error on dm-0
	Buffer I/O error on device dm-0, logical block 54630734
	lost page write due to I/O error on dm-0
	Buffer I/O error on device dm-0, logical block 54630735
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437255157
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437255221
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437255381
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437269949
	Aborting journal on device dm-0.
	EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
	ext3_abort called.
	EXT3-fs error (device dm-0): ext3_journal_start_sb: Detected aborted journal
	Remounting filesystem read-only
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437270109
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437270229
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437270245
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437496517
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438170837
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438170941
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438235885
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438235981
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438236021
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438236045
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438236077
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438236141
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438242893
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438246285
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438249165
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438250517
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438286605
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438286861
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438286885
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438286917
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438286933
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438289381
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438290421
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438290653
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438290685
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438290717
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438290765
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438291717
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438291773
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438291797
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438296973
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438709949
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 640245117
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 209229
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 247935325
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=30965782, block=30965762
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 208613733
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=26050605, block=26050563
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 422785389
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52822081, block=52822020
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 424358245
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=53018686, block=53018627
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 421212525
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52625477, block=52625412
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 247935325
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=30965784, block=30965762
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 208613733
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=26050607, block=26050563
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223968533
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 422785389
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52822083, block=52822020
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 424358245
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=53018688, block=53018627
	EXT3-fs error (device dm-0) in ext3_reserve_inode_write: Journal has aborted
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 435077853
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438236021
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438249165
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438296973
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223968533
	__journal_remove_journal_head: freeing b_committed_data
	__journal_remove_journal_head: freeing b_committed_data
	__journal_remove_journal_head: freeing b_committed_data
	__journal_remove_journal_head: freeing b_committed_data
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223968533
	journal commit I/O error
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 421212525
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52625479, block=52625412
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 208613733
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=26050608, block=26050563
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 422785389
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52822085, block=52822020
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 424358253
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=53018690, block=53018628
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 232730981
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=29065261, block=29065219
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 193409381
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=24150055, block=24150019
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 421212525
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52625481, block=52625412
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 247935325
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=30965785, block=30965762
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 208613733
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=26050610, block=26050563
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 422785389
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52822087, block=52822020
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 424358253
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=53018692, block=53018628
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221082957
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221082957
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221082957
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 421212525
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52625482, block=52625412
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 247935325
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=30965786, block=30965762
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 208613733
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=26050611, block=26050563
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 422785389
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52822088, block=52822020
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 424358253
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=53018693, block=53018628
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 232730981
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=29065262, block=29065219
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 193409381
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=24150056, block=24150019
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 421212525
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52625484, block=52625412
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 247935325
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=30965787, block=30965762
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 208613733
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=26050613, block=26050563
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 422785389
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52822090, block=52822020
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 424358253
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=53018695, block=53018628
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 421212525
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52625485, block=52625412
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 208613733
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=26050614, block=26050563
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 422785389
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=52822091, block=52822020
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 424358253
	EXT3-fs error (device dm-0): ext3_get_inode_loc: unable to read inode block - inode=53018696, block=53018628
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 222476325
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 222476325
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 222476325
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220006069
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220006069
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220006069
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218847341
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218847341
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223939293
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223939293
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223939293
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223968533
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223968533
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223968533
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223968533
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 209229
	printk: 141 messages suppressed.
	Buffer I/O error on device dm-0, logical block 0
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 209341
	Buffer I/O error on device dm-0, logical block 14
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 209429
	Buffer I/O error on device dm-0, logical block 25
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218838829
	Buffer I/O error on device dm-0, logical block 27328700
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218838957
	Buffer I/O error on device dm-0, logical block 27328716
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839021
	Buffer I/O error on device dm-0, logical block 27328724
	lost page write due to I/O error on dm-0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839133
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839173
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839229
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839261
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839317
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839341
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218839365
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218840301
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218840349
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218840453
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218840501
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218841173
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218842141
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218842877
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218842901
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218843021
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218843045
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218844933
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218844949
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218844965
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218844997
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218845045
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 218845141
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 219624301
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 219624381
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 219625901
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 219625957
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 219626005
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 219626421
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 219626485
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220149909
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220673461
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220673485
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220673525
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220673733
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221196965
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221722813
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221722845
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 222245285
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 222247365
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223818765
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 223818965
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 434844005
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 434844397
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 436680901
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437203277
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437203437
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437203477
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437236053
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437255213
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437270237
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437727581
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 437989709
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438170933
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438235669
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438251925
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 438939997
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 454766925
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 553070941
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 553070973
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 553071005
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 640102773
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 640102965
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 640102997
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 830681421
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 830927181
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217437
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217597
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217645
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 896217885
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220252173
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220252253
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220252277
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220252173
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 220252173
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 830927181
	EXT3-fs error (device dm-0): ext3_find_entry: reading directory #103809025 offset 0
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221816517
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221816589
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221816629
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221816653
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221816677
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221816629
	sd 0:0:0:0: SCSI error: return code = 0x00040000
	end_request: I/O error, dev sda, sector 221816629
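
The return code that repeats in the log above can be unpacked by hand. What follows is a minimal sketch, assuming the conventional SCSI midlayer result layout (status byte in bits 0-7, message byte in bits 8-15, host byte in bits 16-23, driver byte in bits 24-31) and the DID_* host codes as the kernel's SCSI headers define them. The helper names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors for a 32-bit SCSI result word (hypothetical helpers). */
static uint8_t scsi_status_byte(uint32_t result) { return result & 0xff; }
static uint8_t scsi_msg_byte(uint32_t result)    { return (result >> 8) & 0xff; }
static uint8_t scsi_host_byte(uint32_t result)   { return (result >> 16) & 0xff; }
static uint8_t scsi_driver_byte(uint32_t result) { return (result >> 24) & 0xff; }

/* Map a host byte to its DID_* name; codes 0x00..0x09 only. */
static const char *scsi_host_byte_name(uint8_t host)
{
    static const char *const names[] = {
        "DID_OK", "DID_NO_CONNECT", "DID_BUS_BUSY", "DID_TIME_OUT",
        "DID_BAD_TARGET", "DID_ABORT", "DID_PARITY", "DID_ERROR",
        "DID_RESET", "DID_BAD_INTR",
    };

    if (host < sizeof(names) / sizeof(names[0]))
        return names[host];
    return "unknown";
}

/* Worked example for the value seen in the log. */
static void scsi_result_example(void)
{
    uint32_t result = 0x00040000;

    assert(scsi_driver_byte(result) == 0x00);
    assert(scsi_host_byte(result) == 0x04);   /* DID_BAD_TARGET */
    assert(scsi_msg_byte(result) == 0x00);
    assert(scsi_status_byte(result) == 0x00);
}
```

Under those assumptions, 0x00040000 carries only a host byte of 0x04 (DID_BAD_TARGET), with no device status attached, which fits requests being failed by the host side rather than by the drive itself.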

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 17:31 ` Scary Intel SATA problem: "frozen" Linus Torvalds
@ 2006-11-28 17:37   ` Mark Lord
  2006-11-28 17:55     ` Sergei Shtylyov
  2006-11-29  1:12     ` Tejun Heo
  2006-11-28 18:05   ` Alan
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 19+ messages in thread
From: Mark Lord @ 2006-11-28 17:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Andrew Morton, linux-ide, Tejun Heo

Linus Torvalds wrote:
> [ You may or may not have gotten my previous email. The kernel stayed 
>   working, but due to the IO errors the filesystem got re-mounted 
>   read-only, and I'm not sure that the email I sent out in that state 
>   actually ever made it out. I suspect it didn't. ]
> 
> Jeff,
>  I just had a scary thing on my nice new Intel i965 box (all Intel 
> chipsets apart from some strange Marvell IDE interface that I'm not using 
> and that no driver even detected, and a TI firewire thing that I'm 
> similarly not using).
> 
> The machine basically froze for about a minute or so (well, things worked 
> surprisingly well, considering that apparently no disk IO happened - I 
> initially thought it was just firefox that had frozen up, since my mail 
> session seemed to be fine), and after it came back the filesystem was 
> mounted read-only and nothing really worked any more..
> 
> I have no idea what status 0xD0 means: it looks like ATA_BUSY + ATA_DRDY + 
> "bit#4", but what is bit#4?

Bit #4, when actually implemented, is a rotational seek indicator,
which can be used for timing purposes.

But when BUSY (bit #7) is set, the rest are generally nonsense.
 
> And clearly, the soft-reset isn't doing squat.

Tejun ?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 17:37   ` Mark Lord
@ 2006-11-28 17:55     ` Sergei Shtylyov
  2006-11-28 20:12       ` Eric D. Mudama
  2006-11-29  1:12     ` Tejun Heo
  1 sibling, 1 reply; 19+ messages in thread
From: Sergei Shtylyov @ 2006-11-28 17:55 UTC (permalink / raw)
  To: Mark Lord; +Cc: Jeff Garzik, linux-ide, Tejun Heo

Hello.

Mark Lord wrote:

> Bit #4, when actually implemented, is a rotational seek indicator,
> which can be used for timing purposes.

    Hm, I thought it was DSC (drive seek complete) set by the SEEK command 
completion, and it's always implemented. Didn't you mean IDX (bit 1, IIRC)?

> But when BUSY (bit #7) is set, the rest are generally nonsense.

    Indeed...

WBR, Sergei

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 17:31 ` Scary Intel SATA problem: "frozen" Linus Torvalds
  2006-11-28 17:37   ` Mark Lord
@ 2006-11-28 18:05   ` Alan
  2006-11-28 18:33     ` Linus Torvalds
  2006-11-28 21:03   ` Jeff Garzik
  2006-11-28 22:18   ` Jeff Garzik
  3 siblings, 1 reply; 19+ messages in thread
From: Alan @ 2006-11-28 18:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff Garzik, Andrew Morton, linux-ide

On Tue, 28 Nov 2006 09:31:51 -0800 (PST)
Linus Torvalds <torvalds@osdl.org> wrote:

>  I just had a scary thing on my nice new Intel i965 box (all Intel 
> chipsets apart from some strange Marvell IDE interface that I'm not using 
> and that no driver even detected, and a TI firewire thing that I'm 

Mr Morton has the Marvell libata driver in his tree waiting to head your
way.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 18:05   ` Alan
@ 2006-11-28 18:33     ` Linus Torvalds
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2006-11-28 18:33 UTC (permalink / raw)
  To: Alan; +Cc: Jeff Garzik, Andrew Morton, linux-ide

On Tue, 28 Nov 2006, Alan wrote:
>
> On Tue, 28 Nov 2006 09:31:51 -0800 (PST)
> Linus Torvalds <torvalds@osdl.org> wrote:
> 
> >  I just had a scary thing on my nice new Intel i965 box (all Intel 
> > chipsets apart from some strange Marvell IDE interface that I'm not using 
> > and that no driver even detected, and a TI firewire thing that I'm 
> 
> Mr Morton has the Marvell libata driver in his tree waiting to head your
> way.

Well, I don't actually personally want it (I have nothing connected to it, 
nor any intention of connecting anything in the future), I just want my 
bog-standard PIIX driver to not do the scary things to me.

	"Mommy, mommy, the IDE messages/behaviour is scaring me!"

I just mentioned the Marvell chip because apart from those two (unused) 
chips, the box is absolutely and utterly bog-standard Intel-everything. 
The i965 may still be somewhat unusual right now, but that's going to 
change, and if there's something strange going on, we should try to fix it 
asap.

It could be a one-off thing (knock wood), but on the other hand, I've only 
been using this machine for a couple of weeks now, and I can't remember 
seeing anything even remotely similar on my other machines (including the 
earlier-generation i945 SATA setup that I've had a lot longer). So I worry 
that it's something i965-specific, and that will be a _very_ common 
chipset soon enough.

One data-point that may or may not be relevant: the afore-mentioned i945 
machine that I've had longer is otherwise reasonably similar, but the DVD 
drive on that one is in legacy mode. Not that I see why it should matter 
(the problem happened on the harddisk, not the DVD)...

		Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 17:55     ` Sergei Shtylyov
@ 2006-11-28 20:12       ` Eric D. Mudama
  2006-11-28 20:36         ` Sergei Shtylyov
  0 siblings, 1 reply; 19+ messages in thread
From: Eric D. Mudama @ 2006-11-28 20:12 UTC (permalink / raw)
  To: Sergei Shtylyov; +Cc: Mark Lord, Jeff Garzik, linux-ide, Tejun Heo

On 11/28/06, Sergei Shtylyov <sshtylyov@ru.mvista.com> wrote:
> Hello.
>
> Mark Lord wrote:
>
> > Bit #4, when actually implemented, is a rotational seek indicator,
> > which can be used for timing purposes.
>
>     Hm, I thought it was DSC (drive seek complete) set by the SEEK command
> completion, and it's always implemented. Didn't you mean IDX (bit 1, IIRC)?

0x50 is the standard, non queueing "device is ready" status.  It used
to have those special meanings, but they're pretty obsolete today as I
understand it.

0x40 is used for queueing, because bit 4 was the service bit for PATA TCQ.

> > But when BUSY (bit #7) is set, the rest are generally nonsense.
>
>     Indeed...
>
> WBR, Sergei

Typically, 0x80 as the busy state indicates the device is in POR
reset.  Once the firmware is up and running in the device, it often
switches from 0x80 to 0xD0 during POR.

0xD0 is the busy state you'd get to if you were 0x50 and received a
command, so this is reported typically after the device is up and
running.

0x7F usually is hardware indicating nothing is attached to the port,
and isn't supposed to imply a non-busy state.

You're right, while not meaningful according to spec, you can derive
some information from the reported status even when you're only
supposed to look at one bit.
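
The status values discussed in this subthread can be broken into individual bits. The following is a rough sketch, assuming the classic ATA status register layout (BSY bit 7, DRDY bit 6, DF bit 5, DSC or SERVICE bit 4, DRQ bit 3, CORR bit 2, IDX bit 1, ERR bit 0). The macro and function names are made up for illustration and are not the libata definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit masks for the ATA status register. */
#define ST_BSY  0x80
#define ST_DRDY 0x40
#define ST_DF   0x20
#define ST_DSC  0x10 /* DSC on older drives, SERVICE for TCQ/ATAPI */
#define ST_DRQ  0x08
#define ST_CORR 0x04
#define ST_IDX  0x02
#define ST_ERR  0x01

/* Per spec, only BSY is meaningful while it is set. */
static int ata_status_is_busy(uint8_t status)
{
    return (status & ST_BSY) != 0;
}

/*
 * Write the names of all set bits into buf, separated by spaces.
 * Returns the number of characters written, excluding the NUL.
 */
static size_t ata_status_describe(uint8_t status, char *buf, size_t len)
{
    static const struct { uint8_t mask; const char *name; } bits[] = {
        { ST_BSY, "BSY" },   { ST_DRDY, "DRDY" }, { ST_DF, "DF" },
        { ST_DSC, "DSC/SERV" }, { ST_DRQ, "DRQ" }, { ST_CORR, "CORR" },
        { ST_IDX, "IDX" },   { ST_ERR, "ERR" },
    };
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(status & bits[i].mask))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated; stop here */
        used += (size_t)n;
    }
    return used;
}
```

With this sketch, 0xD0 decodes to BSY, DRDY and bit 4, while 0x50 decodes to DRDY and bit 4 only. Since BSY is set in 0xD0, the other two bits are informative at best, as noted above.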

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 20:12       ` Eric D. Mudama
@ 2006-11-28 20:36         ` Sergei Shtylyov
  0 siblings, 0 replies; 19+ messages in thread
From: Sergei Shtylyov @ 2006-11-28 20:36 UTC (permalink / raw)
  To: Eric D. Mudama; +Cc: Mark Lord, Jeff Garzik, linux-ide, Tejun Heo

Hello.

Eric D. Mudama wrote:
>> > Bit #4, when actually implemented, is a rotational seek indicator,
>> > which can be used for timing purposes.

>>     Hm, I thought it was DSC (drive seek complete) set by the SEEK 
>> command
>> completion, and it's always implemented. Didn't you mean IDX (bit 1, 
>> IIRC)?

> 0x50 is the standard, non queueing "device is ready" status.  It used
> to have those special meanings, but they're pretty obsolete today as I
> understand it.

    Erm, some status bits may be obsolete but I've never heard that the status 
*values* were specified to mean anything special anywhere...

> 0x40 is used for queueing, because bit 4 was the service bit for PATA TCQ.

   I know. This meaning (SERVICE) actually came from ATAPI.

>> > But when BUSY (bit #7) is set, the rest are generally nonsense.

>>     Indeed...

>> WBR, Sergei

> Typically, 0x80 as the busy state indicates the device is in POR
> reset.  Once the firmware is up and running in the device, it often
> switches from 0x80 to 0xD0 during POR.

    Oh, I guess it's completely up to the disk makers what other status to 
show with BSY=1.

> 0xD0 is the busy state you'd get to if you were 0x50 and received a
> command, so this is reported typically after the device is up and
> running.

> 0x7F usually is hardware indicating nothing is attached to the port,
> and isn't supposed to imply a non-busy state.

    Ha, *never* seen that one. It has always been 0xFF since PC people 
didn't ever bother themselves with silly pulldowns. :-)

> You're right, while not meaningful according to spec, you can derive
> some information from the reported status even when you're only
> supposed to look at one bit.

   Well, to some extent...

WBR, Sergei

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 17:31 ` Scary Intel SATA problem: "frozen" Linus Torvalds
  2006-11-28 17:37   ` Mark Lord
  2006-11-28 18:05   ` Alan
@ 2006-11-28 21:03   ` Jeff Garzik
  2006-11-28 21:45     ` Linus Torvalds
  2006-11-28 22:18   ` Jeff Garzik
  3 siblings, 1 reply; 19+ messages in thread
From: Jeff Garzik @ 2006-11-28 21:03 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, linux-ide

Linus Torvalds wrote:
> [ You may or may not have gotten my previous email. The kernel stayed 
>   working, but due to the IO errors the filesystem got re-mounted 
>   read-only, and I'm not sure that the email I sent out in that state 
>   actually ever made it out. I suspect it didn't. ]
> 
> Jeff,
>  I just had a scary thing on my nice new Intel i965 box (all Intel 
> chipsets apart from some strange Marvell IDE interface that I'm not using 
> and that no driver even detected, and a TI firewire thing that I'm 
> similarly not using).

Does jgarzik/libata-dev.git#upstream (don't pull, just test) work for you?

Or -mm, which includes #upstream?

I'm pretty sure this is already fixed, by the polling IDENTIFY for 
ata_piix patchset.

	Jeff



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 21:03   ` Jeff Garzik
@ 2006-11-28 21:45     ` Linus Torvalds
  0 siblings, 0 replies; 19+ messages in thread
From: Linus Torvalds @ 2006-11-28 21:45 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Andrew Morton, linux-ide

On Tue, 28 Nov 2006, Jeff Garzik wrote:
> 
> Does jgarzik/libata-dev.git#upstream (don't pull, just test) work for you?

Well, since I can't really test, I don't know. This problem has happened 
just once in the couple of weeks I've used that machine, and I wasn't even 
doing anything strange when it triggered (no heavy IO, no special 
programs, no nothing - I was literally just reading email and I think 
trying to browse over to news.com or something..)

So I was more hoping that you'd say that it's a known issue, and already 
fixed, or that the status bits would give you some clue and make you say 
"Ahh, we don't handle that case". I have nothing to "test". The thing 
seems to work, and I have no known way to trigger the problem...

> I'm pretty sure this is already fixed, by the polling IDENTIFY for ata_piix
> patchset.

Hmm. That sounds like it should just affect the bootup identification, 
which has always worked fine for me. Would it fix the softreset too?

Anyway, I can certainly try your current "upstream" branch, but as 
mentioned, the standard kernel works fine for me generally, so I don't 
really know what I can offer (except if "upstream" simply doesn't work at 
all, in which case I'll certainly let you know ;)

		Linus

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 17:31 ` Scary Intel SATA problem: "frozen" Linus Torvalds
                     ` (2 preceding siblings ...)
  2006-11-28 21:03   ` Jeff Garzik
@ 2006-11-28 22:18   ` Jeff Garzik
  3 siblings, 0 replies; 19+ messages in thread
From: Jeff Garzik @ 2006-11-28 22:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Andrew Morton, linux-ide

And FWIW, "frozen" in this context means that Tejun's libata error 
handling code has taken ownership of the ATA port, after stopping all 
outstanding I/O transactions.

So when you see that, that's libata EH kicking in.

	Jeff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA problem: "frozen"
  2006-11-28 17:37   ` Mark Lord
  2006-11-28 17:55     ` Sergei Shtylyov
@ 2006-11-29  1:12     ` Tejun Heo
  1 sibling, 0 replies; 19+ messages in thread
From: Tejun Heo @ 2006-11-29  1:12 UTC (permalink / raw)
  To: Mark Lord; +Cc: Linus Torvalds, Jeff Garzik, Andrew Morton, linux-ide

Mark Lord wrote:
> Linus Torvalds wrote:
>> [ You may or may not have gotten my previous email. The kernel stayed 
>>   working, but due to the IO errors the filesystem got re-mounted   
>> read-only, and I'm not sure that the email I sent out in that state   
>> actually ever made it out. I suspect it didn't. ]
>>
>> Jeff,
>>  I just had a scary thing on my nice new Intel i965 box (all Intel 
>> chipsets apart from some strange Marvell IDE interface that I'm not 
>> using and that no driver even detected, and a TI firewire thing that 
>> I'm similarly not using).
>>
>> The machine basically froze for about a minute or so (well, things 
>> worked surprisingly well, considering that apparently no disk IO 
>> happened - I initially thought it was just firefox that had frozen up, 
>> since my mail session seemed to be fine), and after it came back the 
>> filesystem was mounted read-only and nothing really worked any more..
>>
>> I have no idea what status 0xD0 means: it looks like ATA_BUSY + 
>> ATA_DRDY + "bit#4", but what is bit#4?
> 
> Bit #4, when actually implemented, is a rotational seek indicator,
> which can be used for timing purposes.
> 
> But when BUSY (bit #7) is set, the rest are generally nonsense.
> 
>> And clearly, the soft-reset isn't doing squat.

I dunno.  My first suspect is a transient transmission error, and yeah, they 
do occur from time to time even on an otherwise stable setup.  For example, 
my machine is nvidia ck804 which has pretty weak error handling (at 
least used to) and stays up 24/7 and I've seen such an unrecovered 
transmission error just once during the last 6+ months.

My experience is that if something is weird (say, power fluctuation or 
electro-magnetic interference), SATA is the first thing to give out and 
that's why we need good EH w/ SATA much more than we do with PATA.

Drives (controllers too) sometimes fall into a weird state after such 
errors and softreset is often not enough, so we need hardreset.  ICH8 
can do hardreset even in ata_piix mode.  I'll work on it.

Linus, I'll follow up with Jonas as his problem seems reproducible but 
I'm a bit skeptical about it being a driver issue.  Even w/ all its 
kinks, ata_piix is just a sff IDE controller and libata has been doing 
it for a long time.  I would be really surprised if the driver or 
controller has any such issue in the usual r/w path.  AHCI should be 
able to recover from most error conditions unless drive firmware is 
completely stuck requiring physical power off.

-- 
tejun

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA errors..
  2006-11-28 16:56 ` Scary Intel SATA errors Linus Torvalds
@ 2006-11-29 18:25   ` Mark Lord
  2006-11-29 18:42     ` Alan
  2006-12-01 19:42     ` Alan
  0 siblings, 2 replies; 19+ messages in thread
From: Mark Lord @ 2006-11-29 18:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Linus Torvalds, Jeff Garzik, Andrew Morton, linux-ide

Mmmm.. Tejun, here's a clue for what Linus saw on his system:

Right now I'm implementing support for READ/WRITE LONG commands via libata.

And the ata_piix driver gets into a non-recoverable state after successfully
doing a READ LONG command for me, a very similar state to what Linus reported.

But the ahci driver does NOT have this problem.  I haven't tried others.

The thing about R/W LONG, is that they transfer a single PIO sector of data,
PLUS an extra 4 (or more) words at the end.

Funny thing about ATAPI DVD/RW drives, is that they also transfer odd amounts
of data, non-multiples of 512.
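
The size mismatch described above comes down to simple arithmetic. Here is a small sketch, taking the figure of 4 extra 16-bit words at face value; the function names are hypothetical and the real extra length is whatever the drive reports.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 512u
#define WORD_BYTES   2u

/* Bytes moved by one R/W LONG: one sector plus the trailing extra words. */
static size_t rw_long_xfer_bytes(size_t extra_words)
{
    /* Guard against overflow in the multiplication below. */
    assert(extra_words <= (SIZE_MAX - SECTOR_BYTES) / WORD_BYTES);
    return SECTOR_BYTES + extra_words * WORD_BYTES;
}

/* True when a transfer length is a whole number of 512-byte sectors. */
static int is_sector_multiple(size_t nbytes)
{
    return nbytes % SECTOR_BYTES == 0;
}
```

With 4 extra words the transfer is 520 bytes, so is_sector_multiple() returns false for it, which is the case suspected of upsetting the controller.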

I'm betting that the ata_piix hardware has some kind of internal pipeline that
gets confused *sometimes* when a non-512 multiple passes through.  Rarely, though.

I wonder if there's something on that device that we could bit-bang to reset 
its internal pipelines?

Cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA errors..
  2006-11-29 18:25   ` Mark Lord
@ 2006-11-29 18:42     ` Alan
  2006-12-01 19:42     ` Alan
  1 sibling, 0 replies; 19+ messages in thread
From: Alan @ 2006-11-29 18:42 UTC (permalink / raw)
  To: Mark Lord
  Cc: Tejun Heo, Linus Torvalds, Jeff Garzik, Andrew Morton, linux-ide

On Wed, 29 Nov 2006 13:25:18 -0500
Mark Lord <liml@rtr.ca> wrote:

> I'm betting that the ata_piix hardware has some kind of internal pipeline that
> gets confused *sometimes* when a non-512 multiple passes through.  Rarely, though.

It will do this if the FIFO setup is misconfigured.
 
> I wonder if there's something on that device that we could bit-bang to reset 
> its internal pipelines?

That would cripple performance and just ask for more bizarre bugs.
Firstly I think it makes sense to verify/play with the fifo and prefetch
setup then verify the problem case and see what Intel think.

Alan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Scary Intel SATA errors..
  2006-11-29 18:25   ` Mark Lord
  2006-11-29 18:42     ` Alan
@ 2006-12-01 19:42     ` Alan
  1 sibling, 0 replies; 19+ messages in thread
From: Alan @ 2006-12-01 19:42 UTC (permalink / raw)
  To: Mark Lord
  Cc: Tejun Heo, Linus Torvalds, Jeff Garzik, Andrew Morton, linux-ide

On Wed, 29 Nov 2006 13:25:18 -0500
Mark Lord <liml@rtr.ca> wrote:

> Mmmm.. Tejun, here's a clue for what Linus saw on his system:
> 
> Right now I'm implementing support for READ/WRITE LONG commands via libata.
> 
> And the ata_piix driver gets into a non-recoverable state after successfully
> doing a READ LONG command for me, a very similar state to what Linus reported.

Looks like it may in fact be a chip erratum. Turn off PPE unconditionally
and try to repeat it (if you look at ata_piix right now, it's turned off
for ATAPI only).

Alan

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2006-12-01 19:35 UTC | newest]

Thread overview: 19+ messages
2006-11-14 15:04 [git patches] libata fixes Jeff Garzik
2006-11-14 16:32 ` Mark Lord
2006-11-14 16:41   ` Jeff Garzik
2006-11-14 18:11     ` Mark Lord
2006-11-28 16:56 ` Scary Intel SATA errors Linus Torvalds
2006-11-29 18:25   ` Mark Lord
2006-11-29 18:42     ` Alan
2006-12-01 19:42     ` Alan
2006-11-28 17:31 ` Scary Intel SATA problem: "frozen" Linus Torvalds
2006-11-28 17:37   ` Mark Lord
2006-11-28 17:55     ` Sergei Shtylyov
2006-11-28 20:12       ` Eric D. Mudama
2006-11-28 20:36         ` Sergei Shtylyov
2006-11-29  1:12     ` Tejun Heo
2006-11-28 18:05   ` Alan
2006-11-28 18:33     ` Linus Torvalds
2006-11-28 21:03   ` Jeff Garzik
2006-11-28 21:45     ` Linus Torvalds
2006-11-28 22:18   ` Jeff Garzik
