linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Linux v2.6.22-rc3
       [not found] <alpine.LFD.0.98.0705252008210.26602@woody.linux-foundation.org>
@ 2007-05-27 15:01 ` Gregor Jasny
  2007-05-27 15:06   ` Jeff Garzik
  2007-06-02 16:11   ` [PATCH] " Jeff Garzik
  0 siblings, 2 replies; 53+ messages in thread
From: Gregor Jasny @ 2007-05-27 15:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Linus Torvalds, Jeff Garzik, linux-ide, Alan Cox

Hi,

2007/5/26, Linus Torvalds <torvalds@linux-foundation.org>:
> What more could you possibly want? Some ATA updates? USB suspend problem

22-rc3 broke the CDROM in my Dell notebook.  After I've switched to
libata som time ago, I've got some delays/timeouts during boot [1].
But the drive works as expected. With 2.6.22-rc3 I've got the
following messages during bootup:

[   19.343211] ata_piix 0000:00:07.1: version 2.11
[   19.343598] scsi0 : ata_piix
[   19.344068] scsi1 : ata_piix
[   19.344426] ata1: PATA max UDMA/33 cmd 0x000101f0 ctl 0x000103f6
bmdma 0x00010860 irq 14
[   19.345135] ata2: PATA max UDMA/33 cmd 0x00010170 ctl 0x00010376
bmdma 0x00010868 irq 15
[    0.653333] Marking TSC unstable due to: possible TSC halt in C2.
[    0.656666] Time: acpi_pm clocksource has been installed.
[    0.656666] Switched to NOHz mode on CPU #0
[    0.779999] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.779999] ata1.00: ATA-7: SAMSUNG MP0402H, UC100-14, max UDMA/100
[    0.796666] ata1.00: 78242976 sectors, multi 8: LBA48
[    0.816666] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.829999] ata1.00: configured for UDMA/33
[    1.376666] Clocksource tsc unstable (delta = -496149826 ns)
[   10.866665] ata2: SRST failed (errno=-16)
[   20.899997] ata2: SRST failed (errno=-16)
[   55.946661] ata2: SRST failed (errno=-16)
[   60.966660] ata2: SRST failed (errno=-16)
[   60.979993] ata2: reset failed, giving up
[   60.993327] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG
MP0402H  UC10 PQ: 0 ANSI: 5
[   61.009993] sd 0:0:0:0: [sda] 78242976 512-byte hardware sectors (40060 MB)
[   61.023327] sd 0:0:0:0: [sda] Write Protect is off
[   61.039993] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   61.039993] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
[   61.056660] sd 0:0:0:0: [sda] 78242976 512-byte hardware sectors (40060 MB)
[   61.069993] sd 0:0:0:0: [sda] Write Protect is off
[   61.086660] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   61.086660] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
[   61.103327]  sda: sda1 sda2 sda3 sda4
[   61.546660] sd 0:0:0:0: [sda] Attached SCSI disk
[   61.563327] sd 0:0:0:0: Attached scsi generic sg0 type 0

AFAIR the drive, a  SAMSUNG CD-ROM SN-124, is somehow misbehaving and
listed with ATA_HORKAGE_NODMA in the dma blacklist.

Is there a patch I can apply?

Thanks,
Gregor

[1] http://lkml.org/lkml/2007/3/1/243

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-27 15:01 ` Linux v2.6.22-rc3 Gregor Jasny
@ 2007-05-27 15:06   ` Jeff Garzik
  2007-05-27 16:07     ` Gregor Jasny
  2007-05-28 21:50     ` Bill Davidsen
  2007-06-02 16:11   ` [PATCH] " Jeff Garzik
  1 sibling, 2 replies; 53+ messages in thread
From: Jeff Garzik @ 2007-05-27 15:06 UTC (permalink / raw)
  To: Gregor Jasny
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ide, Alan Cox

[-- Attachment #1: Type: text/plain, Size: 517 bytes --]

Gregor Jasny wrote:
> Hi,
> 
> 2007/5/26, Linus Torvalds <torvalds@linux-foundation.org>:
>> What more could you possibly want? Some ATA updates? USB suspend problem
> 
> 22-rc3 broke the CDROM in my Dell notebook.  After I've switched to
> libata som time ago, I've got some delays/timeouts during boot [1].
> But the drive works as expected. With 2.6.22-rc3 I've got the
> following messages during bootup:

Does the attached patch fix things?

Tejun just posted the upstream-ready version earlier today.

	Jeff




[-- Attachment #2: Attached Message --]
[-- Type: message/rfc822, Size: 8582 bytes --]

From: Tejun Heo <htejun@gmail.com>
To: Jeff Garzik <jeff@garzik.org>, IDE/ATA development list <linux-ide@vger.kernel.org>, Alan Cox <alan@lxorguk.ukuu.org.uk>
Subject: [PATCH] libata: always use polling SETXFER
Date: Sun, 27 May 2007 15:10:40 +0200
Message-ID: <46598350.6030403@gmail.com>

Several people have reported LITE-ON LTR-48246S detection failed
because SETXFER fails.  It seems the device raises IRQ too early after
SETXFER.  This is controller independent.  The same problem has been
reported for different controllers.

So, now we have pata_via where the controller raises IRQ before it's
ready after SETXFER and a device which does similar thing.  This patch
makes libata always execute SETXFER via polling.  As this only happens
during EH, performance impact is nil.  Setting ATA_TFLAG_POLLING is
also moved from issue hot path to ata_dev_set_xfermode() - the only
place where SETXFER can be issued.

Note that ATA_TFLAG_POLLING applies only to drivers which implement
SFF TF interface and use libata HSM.  More advanced controllers ignore
the flag.  This doesn't matter for this fix as SFF TF controllers are
the problematic ones.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
 drivers/ata/libata-core.c |   13 ++++---------
 drivers/ata/pata_via.c    |   12 ++++++------
 include/linux/libata.h    |    1 -
 3 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 3ca9c61..4d6de65 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3932,10 +3932,13 @@ static unsigned int ata_dev_set_xfermode
 	/* set up set-features taskfile */
 	DPRINTK("set features - xfer mode\n");
 
+	/* Some controllers and ATAPI devices show flaky interrupt
+	 * behavior after setting xfer mode.  Use polling instead.
+	 */
 	ata_tf_init(dev, &tf);
 	tf.command = ATA_CMD_SET_FEATURES;
 	tf.feature = SETFEATURES_XFER;
-	tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
+	tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE | ATA_TFLAG_POLLING;
 	tf.protocol = ATA_PROT_NODATA;
 	tf.nsect = dev->xfer_mode;
 
@@ -5413,14 +5416,6 @@ unsigned int ata_qc_issue_prot(struct at
 		}
 	}
 
-	/* Some controllers show flaky interrupt behavior after
-	 * setting xfer mode.  Use polling instead.
-	 */
-	if (unlikely(qc->tf.command == ATA_CMD_SET_FEATURES &&
-		     qc->tf.feature == SETFEATURES_XFER) &&
-	    (ap->flags & ATA_FLAG_SETXFER_POLLING))
-		qc->tf.flags |= ATA_TFLAG_POLLING;
-
 	/* select the device */
 	ata_dev_select(ap, qc->dev->devno, 1, 0);
 
diff --git a/drivers/ata/pata_via.c b/drivers/ata/pata_via.c
index a8462f1..63eca29 100644
--- a/drivers/ata/pata_via.c
+++ b/drivers/ata/pata_via.c
@@ -452,7 +452,7 @@ static int via_init_one(struct pci_dev *
 	/* Early VIA without UDMA support */
 	static const struct ata_port_info via_mwdma_info = {
 		.sht = &via_sht,
-		.flags = ATA_FLAG_SLAVE_POSS | ATA_FLAG_SETXFER_POLLING,
+		.flags = ATA_FLAG_SLAVE_POSS,
 		.pio_mask = 0x1f,
 		.mwdma_mask = 0x07,
 		.port_ops = &via_port_ops
@@ -460,7 +460,7 @@ static int via_init_one(struct pci_dev *
 	/* Ditto with IRQ masking required */
 	static const struct ata_port_info via_mwdma_info_borked = {
 		.sht = &via_sht,
-		.flags = ATA_FLAG_SLAVE_POSS | ATA_FLAG_SETXFER_POLLING,
+		.flags = ATA_FLAG_SLAVE_POSS,
 		.pio_mask = 0x1f,
 		.mwdma_mask = 0x07,
 		.port_ops = &via_port_ops_noirq,
@@ -468,7 +468,7 @@ static int via_init_one(struct pci_dev *
 	/* VIA UDMA 33 devices (and borked 66) */
 	static const struct ata_port_info via_udma33_info = {
 		.sht = &via_sht,
-		.flags = ATA_FLAG_SLAVE_POSS | ATA_FLAG_SETXFER_POLLING,
+		.flags = ATA_FLAG_SLAVE_POSS,
 		.pio_mask = 0x1f,
 		.mwdma_mask = 0x07,
 		.udma_mask = 0x7,
@@ -477,7 +477,7 @@ static int via_init_one(struct pci_dev *
 	/* VIA UDMA 66 devices */
 	static const struct ata_port_info via_udma66_info = {
 		.sht = &via_sht,
-		.flags = ATA_FLAG_SLAVE_POSS | ATA_FLAG_SETXFER_POLLING,
+		.flags = ATA_FLAG_SLAVE_POSS,
 		.pio_mask = 0x1f,
 		.mwdma_mask = 0x07,
 		.udma_mask = 0x1f,
@@ -486,7 +486,7 @@ static int via_init_one(struct pci_dev *
 	/* VIA UDMA 100 devices */
 	static const struct ata_port_info via_udma100_info = {
 		.sht = &via_sht,
-		.flags = ATA_FLAG_SLAVE_POSS | ATA_FLAG_SETXFER_POLLING,
+		.flags = ATA_FLAG_SLAVE_POSS,
 		.pio_mask = 0x1f,
 		.mwdma_mask = 0x07,
 		.udma_mask = 0x3f,
@@ -495,7 +495,7 @@ static int via_init_one(struct pci_dev *
 	/* UDMA133 with bad AST (All current 133) */
 	static const struct ata_port_info via_udma133_info = {
 		.sht = &via_sht,
-		.flags = ATA_FLAG_SLAVE_POSS | ATA_FLAG_SETXFER_POLLING,
+		.flags = ATA_FLAG_SLAVE_POSS,
 		.pio_mask = 0x1f,
 		.mwdma_mask = 0x07,
 		.udma_mask = 0x7f,	/* FIXME: should check north bridge */
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 85f7b1b..a6a3113 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -171,7 +171,6 @@ enum {
 	ATA_FLAG_SKIP_D2H_BSY	= (1 << 12), /* can't wait for the first D2H
 					      * Register FIS clearing BSY */
 	ATA_FLAG_DEBUGMSG	= (1 << 13),
-	ATA_FLAG_SETXFER_POLLING= (1 << 14), /* use polling for SETXFER */
 	ATA_FLAG_IGN_SIMPLEX	= (1 << 15), /* ignore SIMPLEX */
 	ATA_FLAG_NO_IORDY	= (1 << 16), /* controller lacks iordy */
 	ATA_FLAG_ACPI_SATA	= (1 << 17), /* need native SATA ACPI layout */

-
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-27 15:06   ` Jeff Garzik
@ 2007-05-27 16:07     ` Gregor Jasny
  2007-05-27 16:24       ` Linus Torvalds
  2007-05-28 21:50     ` Bill Davidsen
  1 sibling, 1 reply; 53+ messages in thread
From: Gregor Jasny @ 2007-05-27 16:07 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ide, Alan Cox

2007/5/27, Jeff Garzik <jgarzik@pobox.com>:
> Does the attached patch fix things?

No it doesn't. Note that I'm using ata_piix.

Gregor

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-27 16:07     ` Gregor Jasny
@ 2007-05-27 16:24       ` Linus Torvalds
  2007-05-27 20:15         ` Gregor Jasny
  0 siblings, 1 reply; 53+ messages in thread
From: Linus Torvalds @ 2007-05-27 16:24 UTC (permalink / raw)
  To: Gregor Jasny; +Cc: Jeff Garzik, Linux Kernel Mailing List, linux-ide, Alan Cox



On Sun, 27 May 2007, Gregor Jasny wrote:

> 2007/5/27, Jeff Garzik <jgarzik@pobox.com>:
> > Does the attached patch fix things?
> 
> No it doesn't. Note that I'm using ata_piix.

The commit message for that thing is badly worded. It definitely hits 
ata_piix too, and it concerns a lot more than just the LITE-ON CD-ROM 
drives (ie that patch fixes a bug for me on a PIIX system I have access 
to, with a TDK drive).

But if it doesn't help for you, you have some other issue (which is not 
surprising - yours wasn't a SETFXSR error, and I don't think it would have 
worked very well before either).

So since it apparently _did_ work for you before, can you bisect it?

		Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-27 16:24       ` Linus Torvalds
@ 2007-05-27 20:15         ` Gregor Jasny
  2007-05-28  9:47           ` Tejun Heo
  0 siblings, 1 reply; 53+ messages in thread
From: Gregor Jasny @ 2007-05-27 20:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, Linux Kernel Mailing List, linux-ide, Alan Cox,
	Tejun Heo

2007/5/27, Linus Torvalds <torvalds@linux-foundation.org>:
> But if it doesn't help for you, you have some other issue (which is not
> surprising - yours wasn't a SETFXSR error, and I don't think it would have
> worked very well before either).
>
> So since it apparently _did_ work for you before, can you bisect it?

git bisect told me that this is the first bad commit:
commit d4b2bab4f26345ea1803feb23ea92fbe3f6b77bc
Author: Tejun Heo <htejun@gmail.com>
Date:   Fri Feb 2 16:50:52 2007 +0900

libata: add deadline support to prereset and reset methods

I suspected my unstable timing source and booted with acpi=off. Now
the printk timestamps are in sequence but the bug is still there.

The drive worked flawlessly with the old ide-cd. So what's the
difference between libata and ide-cd?

Is there a kernel parameter to completely disable probing the cdrom?

Thanks,
Gregor

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-27 20:15         ` Gregor Jasny
@ 2007-05-28  9:47           ` Tejun Heo
  2007-05-28 14:07             ` Gregor Jasny
  0 siblings, 1 reply; 53+ messages in thread
From: Tejun Heo @ 2007-05-28  9:47 UTC (permalink / raw)
  To: Gregor Jasny
  Cc: Linus Torvalds, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox

Gregor Jasny wrote:
> 2007/5/27, Linus Torvalds <torvalds@linux-foundation.org>:
>> But if it doesn't help for you, you have some other issue (which is not
>> surprising - yours wasn't a SETFXSR error, and I don't think it would
>> have
>> worked very well before either).
>>
>> So since it apparently _did_ work for you before, can you bisect it?
> 
> git bisect told me that this is the first bad commit:
> commit d4b2bab4f26345ea1803feb23ea92fbe3f6b77bc
> Author: Tejun Heo <htejun@gmail.com>
> Date:   Fri Feb 2 16:50:52 2007 +0900
> 
> libata: add deadline support to prereset and reset methods
> 
> I suspected my unstable timing source and booted with acpi=off. Now
> the printk timestamps are in sequence but the bug is still there.
> 
> The drive worked flawlessly with the old ide-cd. So what's the
> difference between libata and ide-cd?

I can't find the rest of this thread.  Can you post the result of
'dmesg' and 'lspci -nn'?

> Is there a kernel parameter to completely disable probing the cdrom?

Nope.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-28  9:47           ` Tejun Heo
@ 2007-05-28 14:07             ` Gregor Jasny
  2007-05-29  9:28               ` Tejun Heo
  0 siblings, 1 reply; 53+ messages in thread
From: Gregor Jasny @ 2007-05-28 14:07 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox

Hi Tejun,

2007/5/28, Tejun Heo <htejun@gmail.com>:
> I can't find the rest of this thread.
http://lkml.org/lkml/2007/5/27/63
Old report:
http://lkml.org/lkml/2007/3/1/243

>  Can you post the result of 'dmesg' and 'lspci -nn'?

2.6.22-rc3 dmesg:
[    0.000000] On node 0 totalpages: 65499
[    0.000000]   DMA zone: 32 pages used for memmap
[    0.000000]   DMA zone: 0 pages reserved
[    0.000000]   DMA zone: 4064 pages, LIFO batch:0
[    0.000000]   Normal zone: 479 pages used for memmap
[    0.000000]   Normal zone: 60924 pages, LIFO batch:15
[    0.000000] DMI 2.3 present.
[    0.000000] ACPI: RSDP 000F4040, 0014 (r0 DELL  )
[    0.000000] ACPI: RSDT 0FFF0000, 0028 (r1 DELL    CPi R   27D2051C
ASL        61)
[    0.000000] ACPI: FACP 0FFF0400, 0074 (r1 DELL    CPi R   27D2051C
ASL        61)
[    0.000000] ACPI: DSDT 0FFF0800, 29A6 (r1 INT430 SYSFexxx     1001
MSFT  100000D)
[    0.000000] ACPI: FACS 0FFFF800, 0040
[    0.000000] ACPI: PM-Timer IO Port: 0x808
[    0.000000] Allocating PCI resources starting at 20000000 (gap:
10100000:efd00000)
[    0.000000] Built 1 zonelists.  Total pages: 64988
[    0.000000] Kernel command line: root=/dev/sda1 ro vga=0x317 resume=/dev/sda3
[    0.000000] Enabling fast FPU save and restore... done.
[    0.000000] Enabling unmasked SIMD FPU exception support... done.
[    0.000000] Initializing CPU#0
[    0.000000] PID hash table entries: 1024 (order: 10, 4096 bytes)
[    0.000000] Detected 497.852 MHz processor.
[   17.951694] Console: colour dummy device 80x25
[   17.952386] Dentry cache hash table entries: 32768 (order: 5, 131072 bytes)
[   17.953014] Inode-cache hash table entries: 16384 (order: 4, 65536 bytes)
[   17.986017] Memory: 256764k/261996k available (1555k kernel code,
4684k reserved, 564k data, 124k init, 0k highmem)
[   17.986078] virtual kernel memory layout:
[   17.986083]     fixmap  : 0xffff8000 - 0xfffff000   (  28 kB)
[   17.986089]     vmalloc : 0xd0800000 - 0xffff6000   ( 759 MB)
[   17.986096]     lowmem  : 0xc0000000 - 0xcffdb000   ( 255 MB)
[   17.986102]       .init : 0xc0314000 - 0xc0333000   ( 124 kB)
[   17.986107]       .data : 0xc0284d5d - 0xc0311e94   ( 564 kB)
[   17.986113]       .text : 0xc0100000 - 0xc0284d5d   (1555 kB)
[   17.986177] Checking if this processor honours the WP bit even in
supervisor mode... Ok.
[   17.986342] SLUB: Genslabs=23, HWalign=32, Order=0-1, MinObjects=4,
Processors=1, Nodes=1
[   18.069679] Calibrating delay using timer specific routine.. 996.72
BogoMIPS (lpj=1660367)
[   18.069779] Mount-cache hash table entries: 512
[   18.070072] CPU: After generic identify, caps: 0383f9ff 00000000
00000000 00000000 00000000 00000000 00000000
[   18.070110] CPU: L1 I cache: 16K, L1 D cache: 16K
[   18.070130] CPU: L2 cache: 256K
[   18.070145] CPU: After all inits, caps: 0383f9ff 00000000 00000000
00000040 00000000 00000000 00000000
[   18.070168] Intel machine check architecture supported.
[   18.070189] Intel machine check reporting enabled on CPU#0.
[   18.070215] Compat vDSO mapped to ffffe000.
[   18.070260] CPU: Intel Pentium III (Coppermine) stepping 01
[   18.070287] Checking 'hlt' instruction... OK.
[   18.083143] ACPI: Core revision 20070126
[   18.087856] ACPI: setting ELCR to 0200 (from 0820)
[   18.090228] NET: Registered protocol family 16
[   18.090500] ACPI: bus type pci registered
[   18.094673] PCI: PCI BIOS revision 2.10 entry at 0xfc0ae, last bus=1
[   18.094693] PCI: Using configuration type 1
[   18.094708] Setting up standard PCI resources
[   18.130663] ACPI: Interpreter enabled
[   18.130700] ACPI: (supports S0 S1 S3 S4 S5)
[   18.130775] ACPI: Using PIC for interrupt routing
[   18.168609] ACPI: PCI Root Bridge [PCI0] (0000:00)
[   18.168698] PCI: Probing PCI hardware (bus 00)
[   18.169221] PCI quirk: region 0800-083f claimed by PIIX4 ACPI
[   18.169246] PCI quirk: region 0840-084f claimed by PIIX4 SMB
[   18.169267] PIIX4 devres B PIO at 00e0-00e7
[   18.169284] PIIX4 devres C PIO at 0850-085f
[   18.169888] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[   18.170186] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGP_._PRT]
[   18.253152] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10
*11 12 14 15)
[   18.253575] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 6 7 9 10
11 12 14 15)
[   18.253923] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 11
12 14 15) *0, disabled.
[   18.254333] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10
*11 12 14 15)
[   18.254760] ACPI: Power Resource [PADA] (on)
[   18.254800] Linux Plug and Play Support v0.97 (c) Adam Belay
[   18.254842] pnp: PnP ACPI init
[   18.254880] ACPI: bus type pnp registered
[   18.324625] pnp: PnP ACPI: found 13 devices
[   18.324657] ACPI: ACPI bus type pnp unregistered
[   18.325044] SCSI subsystem initialized
[   18.325126] libata version 2.21 loaded.
[   18.325204] PCI: Using ACPI for IRQ routing
[   18.325225] PCI: If a device doesn't work, try "pci=routeirq".  If
it helps, post a report
[   18.328448] pnp: 00:00: iomem range 0x0-0x9fbff could not be reserved
[   18.328475] pnp: 00:00: iomem range 0x9fc00-0x9ffff could not be reserved
[   18.328497] pnp: 00:00: iomem range 0xc0000-0xcffff could not be reserved
[   18.328519] pnp: 00:00: iomem range 0xf0000-0xfffff could not be reserved
[   18.328551] pnp: 00:02: ioport range 0x4d0-0x4d1 has been reserved
[   18.328572] pnp: 00:02: ioport range 0x800-0x805 has been reserved
[   18.328592] pnp: 00:02: ioport range 0x808-0x80f has been reserved
[   18.328636] pnp: 00:03: ioport range 0x806-0x807 has been reserved
[   18.328656] pnp: 00:03: ioport range 0x850-0x853 has been reserved
[   18.328676] pnp: 00:03: ioport range 0x856-0x85f has been reserved
[   18.328696] pnp: 00:03: ioport range 0x810-0x83f has been reserved
[   18.328716] pnp: 00:03: ioport range 0x840-0x84f has been reserved
[   18.328736] pnp: 00:03: ioport range 0x600-0x67f has been reserved
[   18.328756] pnp: 00:03: ioport range 0x680-0x6ff has been reserved
[   18.328786] pnp: 00:04: ioport range 0xf000-0xf0fe has been reserved
[   18.328807] pnp: 00:04: ioport range 0xf100-0xf1fe has been reserved
[   18.328828] pnp: 00:04: ioport range 0xf200-0xf2fe has been reserved
[   18.328848] pnp: 00:04: ioport range 0xf400-0xf4fe has been reserved
[   18.328868] pnp: 00:04: ioport range 0xf500-0xf5fe has been reserved
[   18.328889] pnp: 00:04: ioport range 0xf600-0xf6fe has been reserved
[   18.328909] pnp: 00:04: ioport range 0xf800-0xf8fe has been reserved
[   18.328929] pnp: 00:04: ioport range 0xf900-0xf9fe has been reserved
[   18.328951] pnp: 00:04: iomem range 0xed000000-0xedffffff has been reserved
[   18.328988] pnp: 00:09: ioport range 0x3f0-0x3f1 has been reserved
[   18.329549] Time: tsc clocksource has been installed.
[   18.360100] PCI: Bridge: 0000:00:01.0
[   18.360122]   IO window: e000-efff
[   18.360140]   MEM window: fc000000-feffffff
[   18.360159]   PREFETCH window: 30000000-300fffff
[   18.360178] PCI: Bus 2, cardbus bridge: 0000:00:03.0
[   18.360195]   IO window: 00001000-000010ff
[   18.360212]   IO window: 00001400-000014ff
[   18.360229]   PREFETCH window: 20000000-23ffffff
[   18.360247]   MEM window: 24000000-27ffffff
[   18.360265] PCI: Bus 6, cardbus bridge: 0000:00:03.1
[   18.360281]   IO window: 00001800-000018ff
[   18.360298]   IO window: 00001c00-00001cff
[   18.360315]   PREFETCH window: 28000000-2bffffff
[   18.360334]   MEM window: 2c000000-2fffffff
[   18.360380] PCI: Enabling device 0000:00:03.0 (0000 -> 0003)
[   18.361012] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 11
[   18.361034] PCI: setting IRQ 11 as level-triggered
[   18.361045] ACPI: PCI Interrupt 0000:00:03.0[A] -> Link [LNKD] ->
GSI 11 (level, low) -> IRQ 11
[   18.361093] PCI: Enabling device 0000:00:03.1 (0000 -> 0003)
[   18.361113] ACPI: PCI Interrupt 0000:00:03.1[A] -> Link [LNKD] ->
GSI 11 (level, low) -> IRQ 11
[   18.361179] NET: Registered protocol family 2
[   18.392920] IP route cache hash table entries: 2048 (order: 1, 8192 bytes)
[   18.393035] TCP established hash table entries: 8192 (order: 4, 65536 bytes)
[   18.393248] TCP bind hash table entries: 8192 (order: 3, 32768 bytes)
[   18.393361] TCP: Hash tables configured (established 8192 bind 8192)
[   18.393381] TCP reno registered
[   18.413562] io scheduler noop registered
[   18.413602] io scheduler anticipatory registered (default)
[   18.413619] io scheduler deadline registered
[   18.413871] io scheduler cfq registered
[   18.413896] Limiting direct PCI/PCI transfers.
[   18.413959] Boot video device is 0000:01:00.0
[   18.414312] vesafb: framebuffer at 0xfd000000, mapped to
0xd0880000, using 3072k, total 8128k
[   18.414346] vesafb: mode is 1024x768x16, linelength=2048, pages=4
[   18.414365] vesafb: protected mode interface info at c000:5044
[   18.414385] vesafb: pmi: set display start = c00c50aa, set palette = c00c5104
[   18.414403] vesafb: pmi: ports = ec85 ec1f ecb4 ecb8 ec18 ec14 ecc0 ecc3 ecc1
[   18.414441] vesafb: scrolling: redraw
[   18.414460] vesafb: Truecolor: size=0:5:6:5, shift=0:11:5:0
[   18.453475] Console: switching to colour frame buffer device 128x48
[   18.492137] fb0: VESA VGA frame buffer device
[   18.493594] ACPI: AC Adapter [AC] (on-line)
[   18.578801] ACPI: Battery Slot [BAT0] (battery present)
[   18.579831] ACPI: Battery Slot [BAT1] (battery absent)
[   18.580614] input: Lid Switch as /class/input/input0
[   18.580999] ACPI: Lid Switch [LID]
[   18.581473] input: Power Button (CM) as /class/input/input1
[   18.581896] ACPI: Power Button (CM) [PBTN]
[   18.582436] input: Sleep Button (CM) as /class/input/input2
[   18.582872] ACPI: Sleep Button (CM) [SBTN]
[   18.583593] ACPI: CPU0 (power states: C1[C1] C2[C2])
[   18.628870] ACPI: Thermal Zone [THM] (25 C)
[   18.637506] i8k: unable to get SMM BIOS version
[   18.637887] Dell laptop SMM driver v1.14 21/02/2005 Massimo Dal
Zotto (dz@debian.org)
[   18.639188] ata_piix 0000:00:07.1: version 2.11
[   18.639718] scsi0 : ata_piix
[   18.640189] scsi1 : ata_piix
[   18.640541] ata1: PATA max UDMA/33 cmd 0x000101f0 ctl 0x000103f6
bmdma 0x00010860 irq 14
[   18.641148] ata2: PATA max UDMA/33 cmd 0x00010170 ctl 0x00010376
bmdma 0x00010868 irq 15
[    0.653333] Marking TSC unstable due to: possible TSC halt in C2.
[    0.656666] Time: acpi_pm clocksource has been installed.
[    0.656666] Switched to NOHz mode on CPU #0
[    0.779999] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.779999] ata1.00: ATA-7: SAMSUNG MP0402H, UC100-14, max UDMA/100
[    0.796666] ata1.00: 78242976 sectors, multi 8: LBA48
[    0.816666] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.829999] ata1.00: configured for UDMA/33
[    1.376666] Clocksource tsc unstable (delta = -496294688 ns)
[   10.866665] ata2: SRST failed (errno=-16)
[   20.899997] ata2: SRST failed (errno=-16)
[   55.946661] ata2: SRST failed (errno=-16)
[   60.966660] ata2: SRST failed (errno=-16)
[   60.979993] ata2: reset failed, giving up
[   60.993327] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG
MP0402H  UC10 PQ: 0 ANSI: 5
[   61.009993] sd 0:0:0:0: [sda] 78242976 512-byte hardware sectors (40060 MB)
[   61.023327] sd 0:0:0:0: [sda] Write Protect is off
[   61.039993] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   61.039993] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
[   61.056660] sd 0:0:0:0: [sda] 78242976 512-byte hardware sectors (40060 MB)
[   61.069993] sd 0:0:0:0: [sda] Write Protect is off
[   61.086660] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   61.086660] sd 0:0:0:0: [sda] Write cache: enabled, read cache:
enabled, doesn't support DPO or FUA
[   61.103327]  sda: sda1 sda2 sda3 sda4
[   61.559993] sd 0:0:0:0: [sda] Attached SCSI disk
[   61.576660] sd 0:0:0:0: Attached scsi generic sg0 type 0
[   61.589993] PNP: PS/2 Controller [PNP0303:KBC,PNP0f13:PS2M] at
0x60,0x64 irq 1,12
[   61.616660] serio: i8042 KBD port at 0x60,0x64 irq 1
[   61.629993] serio: i8042 AUX port at 0x60,0x64 irq 12
[   61.646660] mice: PS/2 mouse device common for all mice
[   61.666660] input: AT Translated Set 2 keyboard as /class/input/input3
[   61.683327] TCP cubic registered
[   61.699993] NET: Registered protocol family 1
[   61.713327] NET: Registered protocol family 17
[   61.729993] Using IPI Shortcut mode
[   62.333327] input: DualPoint Stick as /class/input/input4
[   62.363327] input: AlpsPS/2 ALPS DualPoint TouchPad as /class/input/input5
[   62.496660] kjournald starting.  Commit interval 5 seconds
[   62.513327] EXT3-fs: mounted filesystem with ordered data mode.
[   62.526660] VFS: Mounted root (ext3 filesystem) readonly.
[   62.543327] Freeing unused kernel memory: 124k freed
[   65.916660] Yenta: CardBus bridge found at 0000:00:03.0 [1028:00aa]
[   65.933326] Yenta: Using CSCINT to route CSC interrupts to PCI
[   65.949993] Yenta: Routing CardBus interrupts to PCI
[   65.966660] Yenta TI: socket 0000:00:03.0, mfunc 0x01261222, devctl 0x66
[   66.163326] Linux agpgart interface v0.102 (c) Dave Jones
[   66.213326] Yenta: ISA IRQ mask 0x04b8, PCI irq 11
[   66.229993] Socket status: 30000010
[   66.499993] Yenta: CardBus bridge found at 0000:00:03.1 [1028:00aa]
[   66.513326] Yenta: Using CSCINT to route CSC interrupts to PCI
[   66.529993] Yenta: Routing CardBus interrupts to PCI
[   66.546660] Yenta TI: socket 0000:00:03.1, mfunc 0x01261222, devctl 0x66
[   66.589993] usbcore: registered new interface driver usbfs
[   66.606660] usbcore: registered new interface driver hub
[   66.623326] usbcore: registered new device driver usb
[   66.763326] USB Universal Host Controller Interface driver v3.0
[   66.793326] Yenta: ISA IRQ mask 0x04b8, PCI irq 11
[   66.809993] Socket status: 30000006
[   67.079993] agpgart: Detected an Intel 440BX Chipset.
[   67.099993] agpgart: AGP aperture is 64M @ 0xf4000000
[   67.146659] pccard: PCMCIA card inserted into slot 0
[   67.183326] ACPI: PCI Interrupt 0000:00:07.2[D] -> Link [LNKD] ->
GSI 11 (level, low) -> IRQ 11
[   67.199993] uhci_hcd 0000:00:07.2: UHCI Host Controller
[   67.216659] uhci_hcd 0000:00:07.2: new USB bus registered, assigned
bus number 1
[   67.233326] uhci_hcd 0000:00:07.2: irq 11, io base 0x0000dce0
[   67.249993] usb usb1: configuration #1 chosen from 1 choice
[   67.266659] hub 1-0:1.0: USB hub found
[   67.286659] hub 1-0:1.0: 2 ports detected
[   67.759993] Real Time Clock Driver v1.12ac
[   68.056659] Floppy drive(s): fd0 is 1.44M
[   68.103326] FDC 0 is a post-1991 82077
[   68.533326] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 5
[   68.549993] PCI: setting IRQ 5 as level-triggered
[   68.549993] ACPI: PCI Interrupt 0000:00:08.0[A] -> Link [LNKB] ->
GSI 5 (level, low) -> IRQ 5
[   69.166659] es1968: clocking to 48000
[   69.379993] cs: IO port probe 0x100-0x4ff: clean.
[   69.396659] cs: IO port probe 0x800-0x8ff: clean.
[   69.413326] cs: IO port probe 0xc00-0xcff: clean.
[   69.429993] cs: IO port probe 0xa00-0xaff: clean.
[   69.446659] cs: memory probe 0xa0000000-0xa0ffffff: clean.
[   69.466659] pcmcia: registering new device pcmcia0.0
[   69.543326] cs: IO port probe 0x100-0x4ff: clean.
[   69.559993] cs: IO port probe 0x800-0x8ff: clean.
[   69.576659] cs: IO port probe 0xc00-0xcff: clean.
[   69.589993] cs: IO port probe 0xa00-0xaff: clean.
[   70.016659] eth0: NE2000 Compatible: io 0x300, irq 3, hw_addr
00:E0:98:8B:B5:17
[   71.556659] Adding 506036k swap on /dev/sda3.  Priority:-1
extents:1 across:506036k
[   71.843326] EXT3 FS on sda1, internal journal
[   73.346659] device-mapper: ioctl: 4.11.0-ioctl (2006-10-12)
initialised: dm-devel@redhat.com
[  220.306644] fuse init (API version 7.8)
[  220.469977] kjournald starting.  Commit interval 5 seconds
[  220.486644] EXT3 FS on dm-0, internal journal
[  220.503311] EXT3-fs: mounted filesystem with ordered data mode.

00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX
Host bridge (rev 03)
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort+ >SERR- <PERR-
        Latency: 32
        Region 0: Memory at f4000000 (32-bit, prefetchable) [size=64M]
        Capabilities: [a0] AGP version 1.0
                Status: RQ=32 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64-
HTrans- 64bit- FW- AGP3- Rate=x1,x2
                Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit-
FW- Rate=<none>

00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP
bridge (rev 03) (prog-if 00 [Normal decode])
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV+ VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
        Status: Cap- 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: fc000000-feffffff
        Prefetchable memory behind bridge: 30000000-300fffff
        Secondary status: 66MHz+ FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort+ <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA+ VGA+ MAbort- >Reset- FastB2B+

00:03.0 CardBus bridge: Texas Instruments PCI1225 (rev 01)
        Subsystem: Dell Unknown device 00aa
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 168, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at 30100000 (32-bit, non-prefetchable) [size=4K]
        Bus: primary=00, secondary=02, subordinate=05, sec-latency=176
        Memory window 0: 20000000-23fff000 (prefetchable)
        Memory window 1: 24000000-27fff000
        I/O window 0: 00001000-000010ff
        I/O window 1: 00001400-000014ff
        BridgeCtl: Parity- SERR- ISA- VGA- MAbort- >Reset+ 16bInt+ PostWrite+
        16-bit legacy interface ports at 0001

00:03.1 CardBus bridge: Texas Instruments PCI1225 (rev 01)
        Subsystem: Dell Unknown device 00aa
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 168, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at 30101000 (32-bit, non-prefetchable) [size=4K]
        Bus: primary=00, secondary=06, subordinate=09, sec-latency=176
        Memory window 0: 28000000-2bfff000 (prefetchable)
        Memory window 1: 2c000000-2ffff000
        I/O window 0: 00001800-000018ff
        I/O window 1: 00001c00-00001cff
        BridgeCtl: Parity- SERR- ISA- VGA- MAbort- >Reset+ 16bInt+ PostWrite+
        16-bit legacy interface ports at 0001

00:07.0 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
        Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0

00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev
01) (prog-if 80 [Master])
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32
        Region 0: [virtual] Memory at 000001f0 (32-bit,
non-prefetchable) [disabled] [size=8]
        Region 1: [virtual] Memory at 000003f0 (type 3,
non-prefetchable) [disabled] [size=1]
        Region 2: [virtual] Memory at 00000170 (32-bit,
non-prefetchable) [disabled] [size=8]
        Region 3: [virtual] Memory at 00000370 (type 3,
non-prefetchable) [disabled] [size=1]
        Region 4: I/O ports at 0860 [size=16]

00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev
01) (prog-if 00 [UHCI])
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32
        Interrupt: pin D routed to IRQ 11
        Region 4: I/O ports at dce0 [size=32]

00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Interrupt: pin ? routed to IRQ 9

00:08.0 Multimedia audio controller: ESS Technology ES1978 Maestro 2E (rev 10)
        Subsystem: Dell Unknown device 00aa
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (500ns min, 6000ns max)
        Interrupt: pin A routed to IRQ 5
        Region 0: I/O ports at d800 [size=256]
        Capabilities: [c0] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
PME(D0-,D1+,D2+,D3hot+,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

01:00.0 VGA compatible controller: ATI Technologies Inc Rage Mobility
P/M AGP 2x (rev 64) (prog-if 00 [VGA])
        Subsystem: Dell Latitude CPt
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping+ SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium
>TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (2000ns min), Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: I/O ports at ec00 [size=256]
        Region 2: Memory at fcfff000 (32-bit, non-prefetchable) [size=4K]
        [virtual] Expansion ROM at 30000000 [disabled] [size=128K]
        Capabilities: [50] AGP version 1.0
                Status: RQ=256 Iso- ArqSz=0 Cal=0 SBA+ ITACoh- GART64-
HTrans- 64bit- FW- AGP3- Rate=x1,x2
                Command: RQ=1 ArqSz=0 Cal=0 SBA- AGP- GART64- 64bit-
FW- Rate=<none>
        Capabilities: [5c] Power Management version 1
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=0 PME-

Thanks!
Gregor

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-27 15:06   ` Jeff Garzik
  2007-05-27 16:07     ` Gregor Jasny
@ 2007-05-28 21:50     ` Bill Davidsen
  1 sibling, 0 replies; 53+ messages in thread
From: Bill Davidsen @ 2007-05-28 21:50 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Gregor Jasny, Linux Kernel Mailing List, Linus Torvalds,
	linux-ide, Alan Cox

Jeff Garzik wrote:

> Several people have reported LITE-ON LTR-48246S detection failed
> because SETXFER fails.  It seems the device raises IRQ too early after
> SETXFER.  This is controller independent.  The same problem has been
> reported for different controllers.
> 
> So, now we have pata_via where the controller raises IRQ before it's
> ready after SETXFER and a device which does similar thing.  This patch
> makes libata always execute SETXFER via polling.  As this only happens
> during EH, performance impact is nil.  Setting ATA_TFLAG_POLLING is
> also moved from issue hot path to ata_dev_set_xfermode() - the only
> place where SETXFER can be issued.
> 
> Note that ATA_TFLAG_POLLING applies only to drivers which implement
> SFF TF interface and use libata HSM.  More advanced controllers ignore
> the flag.  This doesn't matter for this fix as SFF TF controllers are
> the problematic ones.
> 
Not only kills two birds with a single store, but will avoid having to 
re-solve the problem at sometime in the future. That's good software!

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-28 14:07             ` Gregor Jasny
@ 2007-05-29  9:28               ` Tejun Heo
  2007-05-29 15:19                 ` Gregor Jasny
  2007-05-29 16:44                 ` Linus Torvalds
  0 siblings, 2 replies; 53+ messages in thread
From: Tejun Heo @ 2007-05-29  9:28 UTC (permalink / raw)
  To: Gregor Jasny
  Cc: Linus Torvalds, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox

Hello, Gregor.

Gregor Jasny wrote:
> [   18.639188] ata_piix 0000:00:07.1: version 2.11
> [   18.639718] scsi0 : ata_piix
> [   18.640189] scsi1 : ata_piix
> [   18.640541] ata1: PATA max UDMA/33 cmd 0x000101f0 ctl 0x000103f6
> bmdma 0x00010860 irq 14
> [   18.641148] ata2: PATA max UDMA/33 cmd 0x00010170 ctl 0x00010376
> bmdma 0x00010868 irq 15
> [    0.653333] Marking TSC unstable due to: possible TSC halt in C2.
> [    0.656666] Time: acpi_pm clocksource has been installed.
> [    0.656666] Switched to NOHz mode on CPU #0
> [    0.779999] ata1.00: ata_hpa_resize 1: sectors = 78242976,
> hpa_sectors = 78242976
> [    0.779999] ata1.00: ATA-7: SAMSUNG MP0402H, UC100-14, max UDMA/100
> [    0.796666] ata1.00: 78242976 sectors, multi 8: LBA48
> [    0.816666] ata1.00: ata_hpa_resize 1: sectors = 78242976,
> hpa_sectors = 78242976
> [    0.829999] ata1.00: configured for UDMA/33
> [    1.376666] Clocksource tsc unstable (delta = -496294688 ns)
> [   10.866665] ata2: SRST failed (errno=-16)
> [   20.899997] ata2: SRST failed (errno=-16)
> [   55.946661] ata2: SRST failed (errno=-16)
> [   60.966660] ata2: SRST failed (errno=-16)
> [   60.979993] ata2: reset failed, giving up

Aieee, so the drive doesn't like the new SRST sequence.  Indan pointed
me to your original posting and before the reset sequence update your
drive was reset successfully but only after 30sec delay, right?  If you
change the first entry in ata_eh_reset_timeouts[] table at the top of
libata-eh.c to 35 * HZ, does it work?

Thanks.

-- 
tejun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-29  9:28               ` Tejun Heo
@ 2007-05-29 15:19                 ` Gregor Jasny
  2007-05-29 16:44                 ` Linus Torvalds
  1 sibling, 0 replies; 53+ messages in thread
From: Gregor Jasny @ 2007-05-29 15:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox

2007/5/29, Tejun Heo <htejun@gmail.com>:
> Aieee, so the drive doesn't like the new SRST sequence.  Indan pointed
> me to your original posting and before the reset sequence update your
> drive was reset successfully but only after 30sec delay, right?  If you
> change the first entry in ata_eh_reset_timeouts[] table at the top of
> libata-eh.c to 35 * HZ, does it work?

No, it even times out after 35 seconds. I've raised the first timeout
to 60 seconds but still no success.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-29  9:28               ` Tejun Heo
  2007-05-29 15:19                 ` Gregor Jasny
@ 2007-05-29 16:44                 ` Linus Torvalds
  2007-06-01  0:58                   ` Tejun Heo
  1 sibling, 1 reply; 53+ messages in thread
From: Linus Torvalds @ 2007-05-29 16:44 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Gregor Jasny, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox



On Tue, 29 May 2007, Tejun Heo wrote:
> 
> Aieee, so the drive doesn't like the new SRST sequence.

It would appear that the old code largely ignored the SRST error entirely, 
no?

If we *used* to do (in ata_bus_post_reset()):

	if (dev0)
		ata_busy_sleep(ap, ATA_TMOUT_BOOT_QUICK, ATA_TMOUT_BOOT);

and you changed that to actually care about the return value:

	if (dev0) {
		rc = ata_wait_ready(ap, deadline);
		if (rc && rc != -ENODEV)
			return rc;
	}

(in _two_ places). That change also changed the same "post_reset" handling 
in a totally _different_ way: it used to do ata_busy_sleep() twice, now it 
still does it twice, but it does it with the same "timeout" value, so if 
the first one times out, then the second one won't be given any timeout AT 
ALL!

And to make matters worse: the first timeout seems to be for ANOTHER PORT 
ENTIRELY! So you seem to break port 1 even if the timeout happened on port 
0, as far as I can read that sequence. 

So I think your ata_bus_post_reset() changes are rather suspect. The fact 
that you don't change the timeout, and use the same deadline for two 
different ports (and for multiple commands to the same port, afaik), seems 
rather suspect. The old code also didn't care about failures in certain 
phases of the reset sequence, and it appears that it did so for good 
reason.

		Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-05-29 16:44                 ` Linus Torvalds
@ 2007-06-01  0:58                   ` Tejun Heo
  2007-06-01  1:37                     ` Linus Torvalds
  0 siblings, 1 reply; 53+ messages in thread
From: Tejun Heo @ 2007-06-01  0:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Gregor Jasny, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox

Hello, Linus.  Sorry about late response.  I'm still recovering from jet
lag.

Linus Torvalds wrote:
> On Tue, 29 May 2007, Tejun Heo wrote:
>> Aieee, so the drive doesn't like the new SRST sequence.
> 
> It would appear that the old code largely ignored the SRST error entirely, 
> no?

Yes, we used to ignore some error conditions, which sometimes led to
very long timeout sequence after hotplugging under certain circumstances
- a few 30sec timeouts for the reset and another 30sec for the following
IDENTIFY (because reset didn't fail after the timeouts).

> If we *used* to do (in ata_bus_post_reset()):
> 
> 	if (dev0)
> 		ata_busy_sleep(ap, ATA_TMOUT_BOOT_QUICK, ATA_TMOUT_BOOT);
> 
> and you changed that to actually care about the return value:
> 
> 	if (dev0) {
> 		rc = ata_wait_ready(ap, deadline);
> 		if (rc && rc != -ENODEV)
> 			return rc;
> 	}
> 
> (in _two_ places). That change also changed the same "post_reset" handling 
> in a totally _different_ way: it used to do ata_busy_sleep() twice, now it 
> still does it twice, but it does it with the same "timeout" value, so if 
> the first one times out, then the second one won't be given any timeout AT 
> ALL!
> 
> And to make matters worse: the first timeout seems to be for ANOTHER PORT 
> ENTIRELY! So you seem to break port 1 even if the timeout happened on port 
> 0, as far as I can read that sequence. 

Yes, the whole operation uses single timeout.  If the given time runs
out, it fails for both devices on the channel.  This is because reset
timeout is channel wide.  After SRST, both devices are required to
respond in certain time (30secs max per spec).  If anyone of the device
doesn't respond, it isn't clear what we can or can't do with the bus -
reset protocol itself requires responding master, cable detection needs
both devices working, etc...

> So I think your ata_bus_post_reset() changes are rather suspect. The fact 
> that you don't change the timeout, and use the same deadline for two 
> different ports (and for multiple commands to the same port, afaik), seems 
> rather suspect. The old code also didn't care about failures in certain 
> phases of the reset sequence, and it appears that it did so for good 
> reason.

Gregor's cdrom has broken SRST support which is extremely rare and
broken.  The reason why it works flawlessly with the ide driver is
because the ide driver doesn't issue SRST during probing and libata
worked after 30sec timeout before reset-seq change because libata didn't
use to check reset failures in some cases and the device just worked
when issued commands even when it didn't report ready status after reset.

I'll try to add some code in the reset path and see where the device
fails reset protocol.  Hopefully, we can work around it.  If it doesn't,
maybe we'll need to issue IDENTIFY and see whether that works after
reset timeout.  :-(

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01  0:58                   ` Tejun Heo
@ 2007-06-01  1:37                     ` Linus Torvalds
  2007-06-01  2:19                       ` Tejun Heo
  0 siblings, 1 reply; 53+ messages in thread
From: Linus Torvalds @ 2007-06-01  1:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Gregor Jasny, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox



On Fri, 1 Jun 2007, Tejun Heo wrote:
>
> Gregor's cdrom has broken SRST support which is extremely rare and
> broken.

Well, the concept is neither rare nor arguably all that broken. 

The paper standards floating around in the industry are so much toilet 
paper. The only standard that seems to really matter is what Windows has 
traditionally done. We may not like it, but there it is...

This bites us more when it comes to the real el-cheapo stuff, notably when 
it comes to various USB mass storage things (which have some random 
USB->flash controller cobbled together by a senile llama on crack), and is 
almost unheard of for anythign that is "server-grade", but when it comes 
to no-brand random devices, it really does tend to be the case that the 
only testing they ever had was using Windows.

And hardware is seldom any different from software: if it wasn't tested, 
it probably doesn't work.

So it would be good if somebody knew what the Windows ID/startup sequence 
was/is. I think we figured it out by trial-and-error for the USB mass 
storage stuff. But it tends to boil down to: don't do things that aren't 
absolutely required (for SCSI, it was things like not asking for mode 
pages that weren't absolutely required, because some devices wouldn't 
support it, and would simply lock up if you did so!)

			Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01  1:37                     ` Linus Torvalds
@ 2007-06-01  2:19                       ` Tejun Heo
  2007-06-01 16:50                         ` Jeff Garzik
  0 siblings, 1 reply; 53+ messages in thread
From: Tejun Heo @ 2007-06-01  2:19 UTC (permalink / raw)
  To: Linus Torvalds, Jeff Garzik
  Cc: Gregor Jasny, Linux Kernel Mailing List, linux-ide, Alan Cox

Hello,

Linus Torvalds wrote:
> On Fri, 1 Jun 2007, Tejun Heo wrote:
>> Gregor's cdrom has broken SRST support which is extremely rare and
>> broken.
> 
> Well, the concept is neither rare nor arguably all that broken. 
> 
> The paper standards floating around in the industry are so much toilet 
> paper. The only standard that seems to really matter is what Windows has 
> traditionally done. We may not like it, but there it is...
> 
> This bites us more when it comes to the real el-cheapo stuff, notably when 
> it comes to various USB mass storage things (which have some random 
> USB->flash controller cobbled together by a senile llama on crack), and is 
> almost unheard of for anythign that is "server-grade", but when it comes 
> to no-brand random devices, it really does tend to be the case that the 
> only testing they ever had was using Windows.
> 
> And hardware is seldom any different from software: if it wasn't tested, 
> it probably doesn't work.

Yeap, agreed.  SRST on PATA works on most devices tho.  This device is
the first reported one which genuinely doesn't like SRST itself but we
also had a case where a device doesn't report proper diagnostic code and
another one which reports incorrect device class code.

Hardreset on SATA works better because it's much more integral part of
the protocol.  SRST also seems to work better but there is an emulated
PMP device which craps itself if SRST is issued to it.

> So it would be good if somebody knew what the Windows ID/startup sequence 
> was/is. I think we figured it out by trial-and-error for the USB mass 
> storage stuff. But it tends to boil down to: don't do things that aren't 
> absolutely required (for SCSI, it was things like not asking for mode 
> pages that weren't absolutely required, because some devices wouldn't 
> support it, and would simply lock up if you did so!)

Most BIOSen, Windows and old IDE driver don't reset at all during
probing.  They first issue IDENTIFY unconditionally, if that fails,
IDENTIFY_PACKET.  From the beginning, libata has issued reset during
probing.  We've had a few problems regarding this but they have been
worked around successfully till now.  One of the upsides of doing reset
during probing is that the reset code path gets tested a lot and libata
is more likely to recover properly after error conditions.  With SATA,
this is getting more important because errors are much more common and
happen occasionally even on perfectly healthy machines.

As libata gets much wider userbase including old/weird PATA devices, we
seem to be facing more of these broken devices.  We may be reaching the
point where the benefit of doing reset during probing isn't worth the
price we have to pay, at least on PATA.

Jeff, what do you think?

-- 
tejun

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01  2:19                       ` Tejun Heo
@ 2007-06-01 16:50                         ` Jeff Garzik
  2007-06-01 17:04                           ` Linus Torvalds
  0 siblings, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-01 16:50 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox

Tejun Heo wrote:
> Most BIOSen, Windows and old IDE driver don't reset at all during
> probing.  They first issue IDENTIFY unconditionally, if that fails,
> IDENTIFY_PACKET.  From the beginning, libata has issued reset during

Not true for BIOS.  A large sub-section of BIOS (Phoenix and/or 
Award-based BIOSen) do SRST along with the Hale Landis device detection 
(ata_devchk in libata-core.c).  Ditto for several ATA vendor BIOS found 
on the card.

I'm about to dive into some heads-down RHEL backporting (whee), so I 
cannot look at the code in depth this weekend, but here are my basic 
thoughts:

* We knew there would be fallout from the new reset-sequence code, and 
this is clearly in that category.

* It worked before #reset-seq merge AFAICT, which implies the old method 
of probing -- which included SRST -- worked.

* If this was a major problem, I would think there would be a flood of 
bug reports for Fedora 7 (just released, and in testing w/ #reset-seq 
for a little while), since it is using libata for PATA as well as SATA. 
  So this, just this one bug report right?


I would go back and look at the differences in the low-level register 
bitbanging, and what specifically changed there.  If the old stuff 
worked, that tends to imply a problem with the new stuff...

	Jeff



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 16:50                         ` Jeff Garzik
@ 2007-06-01 17:04                           ` Linus Torvalds
  2007-06-01 17:35                             ` Jeff Garzik
  0 siblings, 1 reply; 53+ messages in thread
From: Linus Torvalds @ 2007-06-01 17:04 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Gregor Jasny, Linux Kernel Mailing List, linux-ide,
	Alan Cox



On Fri, 1 Jun 2007, Jeff Garzik wrote:
> 
> I'm about to dive into some heads-down RHEL backporting (whee), so I cannot
> look at the code in depth this weekend, but here are my basic thoughts:
> 
> * We knew there would be fallout from the new reset-sequence code, and this is
> clearly in that category.
> 
> * It worked before #reset-seq merge AFAICT, which implies the old method of
> probing -- which included SRST -- worked.

Well, I don't think it really "worked" before. It apparently always had a 
bad 30-second timeout (probably because the reset just didn't work at 
all). It's just that the old code didn't care, and since the identify then 
worked, it was all good.

		Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 17:04                           ` Linus Torvalds
@ 2007-06-01 17:35                             ` Jeff Garzik
  2007-06-01 17:59                               ` Linus Torvalds
  0 siblings, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-01 17:35 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tejun Heo, Gregor Jasny, Linux Kernel Mailing List, linux-ide,
	Alan Cox

Linus Torvalds wrote:
> 
> On Fri, 1 Jun 2007, Jeff Garzik wrote:
>> I'm about to dive into some heads-down RHEL backporting (whee), so I cannot
>> look at the code in depth this weekend, but here are my basic thoughts:
>>
>> * We knew there would be fallout from the new reset-sequence code, and this is
>> clearly in that category.
>>
>> * It worked before #reset-seq merge AFAICT, which implies the old method of
>> probing -- which included SRST -- worked.
> 
> Well, I don't think it really "worked" before. It apparently always had a 
> bad 30-second timeout (probably because the reset just didn't work at 
> all). It's just that the old code didn't care, and since the identify then 
> worked, it was all good.

<reviews thread again>  Ah, indeed.  I certainly prefer the old result 
to the new one.

With these old PATA devices, device reset is "six of one, half-dozen of 
the other."  Using SRST is the only way to kick some ATAPI devices into 
working: http://suif.stanford.edu/~csapuntz/blackmagic.html#reset

I'm mainly interested in hearing feedback from Fedora 7 damage, before 
making a major decision about the probing code.  If this is a single 
dain bramaged device, we should avoid punishing the majority.  But if 
this is a trend, it warrants careful reconsideration.

The current code already has the IDENTIFY retry stuff, so it sounds like 
restoring the "don't care" part should be enough to restore the older 
behavior.

	Jeff



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 17:35                             ` Jeff Garzik
@ 2007-06-01 17:59                               ` Linus Torvalds
  2007-06-01 18:20                                 ` Dave Jones
  2007-06-01 18:41                                 ` Jeff Garzik
  0 siblings, 2 replies; 53+ messages in thread
From: Linus Torvalds @ 2007-06-01 17:59 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Gregor Jasny, Linux Kernel Mailing List, linux-ide,
	Alan Cox



On Fri, 1 Jun 2007, Jeff Garzik wrote:
> 
> With these old PATA devices, device reset is "six of one, half-dozen of the
> other."  Using SRST is the only way to kick some ATAPI devices into working:
> http://suif.stanford.edu/~csapuntz/blackmagic.html#reset

Well, wouldn't it be a good thing to
 1) if BUSY/DRQ is set even before you try the problem, obviously skip the 
    two polite cases, and go to #4
 2) try to just do an IDENTIFY 
 3) if that doesn't work, do a HOST RESET and then try again
 4) if that doesn't work, do the full SRST

(or some variation of the above).

That at least has the potential to get you six working devices, and half a 
dozen other working devices, for a total of 12. With no unnecessary resets 
or long timeouts!

(Btw, the 150ms wait after reset is really nasty. A few of those, and 
we're wasting seconds during bootup. Why the hell does it do that, when 
the old driver - and the spec - said 2ms?)

> I'm mainly interested in hearing feedback from Fedora 7 damage, before making
> a major decision about the probing code.  If this is a single dain bramaged
> device, we should avoid punishing the majority.  But if this is a trend, it
> warrants careful reconsideration.

I thought the default in Fedora was to use the PATA driver first? If so, 
you're going to miss a lot of cases that "just work", because they use the 
old driver.

At least that was what my situation was on the P4 that I recently 
complained about the "set transfer mode" problem: it used to use the old 
PATA drivers that didn't have the issue, so I never even _knew_ the new 
libata-based code had problems.

		Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 17:59                               ` Linus Torvalds
@ 2007-06-01 18:20                                 ` Dave Jones
  2007-06-01 18:30                                   ` Linus Torvalds
  2007-06-01 18:41                                 ` Jeff Garzik
  1 sibling, 1 reply; 53+ messages in thread
From: Dave Jones @ 2007-06-01 18:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, Tejun Heo, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox

On Fri, Jun 01, 2007 at 10:59:51AM -0700, Linus Torvalds wrote:

 > > I'm mainly interested in hearing feedback from Fedora 7 damage, before making
 > > a major decision about the probing code.  If this is a single dain bramaged
 > > device, we should avoid punishing the majority.  But if this is a trend, it
 > > warrants careful reconsideration.
 > 
 > I thought the default in Fedora was to use the PATA driver first? If so, 
 > you're going to miss a lot of cases that "just work", because they use the 
 > old driver.

There are no old drivers in F7 and beyond.

# CONFIG_IDE is not set

	Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 18:20                                 ` Dave Jones
@ 2007-06-01 18:30                                   ` Linus Torvalds
  2007-06-01 18:46                                     ` Dave Jones
  0 siblings, 1 reply; 53+ messages in thread
From: Linus Torvalds @ 2007-06-01 18:30 UTC (permalink / raw)
  To: Dave Jones
  Cc: Jeff Garzik, Tejun Heo, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox



On Fri, 1 Jun 2007, Dave Jones wrote:
> 
> There are no old drivers in F7 and beyond.
> 
> # CONFIG_IDE is not set

Ahh, that's certainly going to root out the issues. Now let's hope that 
people install it..

(Fedora seems to make it hard on purpose to upgrade between major 
revisions, but maybe that's just me not reading the docs as usual ;)

		Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 17:59                               ` Linus Torvalds
  2007-06-01 18:20                                 ` Dave Jones
@ 2007-06-01 18:41                                 ` Jeff Garzik
  2007-06-01 18:48                                   ` Jeff Garzik
  2007-06-02  7:50                                   ` Tejun Heo
  1 sibling, 2 replies; 53+ messages in thread
From: Jeff Garzik @ 2007-06-01 18:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tejun Heo, Gregor Jasny, Linux Kernel Mailing List, linux-ide,
	Alan Cox

[-- Attachment #1: Type: text/plain, Size: 1949 bytes --]

Linus Torvalds wrote:
> 
> On Fri, 1 Jun 2007, Jeff Garzik wrote:
>> With these old PATA devices, device reset is "six of one, half-dozen of the
>> other."  Using SRST is the only way to kick some ATAPI devices into working:
>> http://suif.stanford.edu/~csapuntz/blackmagic.html#reset
> 
> Well, wouldn't it be a good thing to
>  1) if BUSY/DRQ is set even before you try the problem, obviously skip the 
>     two polite cases, and go to #4
>  2) try to just do an IDENTIFY 
>  3) if that doesn't work, do a HOST RESET and then try again
>  4) if that doesn't work, do the full SRST
> 
> (or some variation of the above).

Skipping reset means it doesn't get the device away from a state that 
the previous boot may have configured itself to... standard "I didn't do 
reset" problems you see with any hardware.  Transfer modes and 
removeable media status notification are the most notable that are left 
in a semi-random state, but there are many other minor feature bits that 
fall into this category as well.


> (Btw, the 150ms wait after reset is really nasty. A few of those, and 
> we're wasting seconds during bootup. Why the hell does it do that, when 
> the old driver - and the spec - said 2ms?)

Upon quick review, it appears it should be 110ms.  And actually the spec 
says 3ms.  But to answer your question anyway... :)

The basic libata probe came from the code procedure that is found in a 
lot of PC BIOSen, even (IMO) more exposure than Linux:  Hale Landis's 
ATADRVR.  See the attached code sample for the relevant procedure, or 
the entire driver source code (PC DOS-style) at 
http://www.ata-atapi.com/atadrvr.zip

Upon further review, it looks like the 110ms delay is only per-ATAPI 
command, not during reset, as the sub_atapi_delay() calls don't actually 
delay before the ATAPI signature has been detected.


> I thought the default in Fedora was to use the PATA driver first? If so, 

<grin>  (see davej's reply)

	Jeff



[-- Attachment #2: example.txt --]
[-- Type: text/plain, Size: 7156 bytes --]


//*************************************************************
//
// reg_config() - Check the host adapter and determine the
//                number and type of drives attached.
//
// This process is not documented by any of the ATA standards.
//
// Infomation is returned by this function is in
// reg_config_info[] -- see ATAIO.H.
//
//*************************************************************

int reg_config( void )

{
   int numDev = 0;
   unsigned char sc;
   unsigned char sn;
   unsigned char cl;
   unsigned char ch;
   unsigned char st;
   unsigned char devCtrl;

   // setup register values

   devCtrl = CB_DC_HD15 | ( int_use_intr_flag ? 0 : CB_DC_NIEN );

   // mark start of config in low level trace

   trc_llt( 0, 0, TRC_LLT_S_CFG );

   // assume there are no devices

   reg_config_info[0] = REG_CONFIG_TYPE_NONE;
   reg_config_info[1] = REG_CONFIG_TYPE_NONE;

   // set up Device Control register

   pio_outbyte( CB_DC, devCtrl );

   // lets see if there is a device 0

   pio_outbyte( CB_DH, CB_DH_DEV0 );
   DELAY400NS;
   pio_outbyte( CB_SC, 0x55 );
   pio_outbyte( CB_SN, 0xaa );
   pio_outbyte( CB_SC, 0xaa );
   pio_outbyte( CB_SN, 0x55 );
   pio_outbyte( CB_SC, 0x55 );
   pio_outbyte( CB_SN, 0xaa );
   sc = pio_inbyte( CB_SC );
   sn = pio_inbyte( CB_SN );
   if ( ( sc == 0x55 ) && ( sn == 0xaa ) )
      reg_config_info[0] = REG_CONFIG_TYPE_UNKN;

   // lets see if there is a device 1

   pio_outbyte( CB_DH, CB_DH_DEV1 );
   DELAY400NS;
   pio_outbyte( CB_SC, 0x55 );
   pio_outbyte( CB_SN, 0xaa );
   pio_outbyte( CB_SC, 0xaa );
   pio_outbyte( CB_SN, 0x55 );
   pio_outbyte( CB_SC, 0x55 );
   pio_outbyte( CB_SN, 0xaa );
   sc = pio_inbyte( CB_SC );
   sn = pio_inbyte( CB_SN );
   if ( ( sc == 0x55 ) && ( sn == 0xaa ) )
      reg_config_info[1] = REG_CONFIG_TYPE_UNKN;

   // now we think we know which devices, if any are there,
   // so lets try a soft reset (ignoring any errors).

   pio_outbyte( CB_DH, CB_DH_DEV0 );
   DELAY400NS;
   reg_reset( 0, 0 );

   // lets check device 0 again, is device 0 really there?
   // is it ATA or ATAPI?

   pio_outbyte( CB_DH, CB_DH_DEV0 );
   DELAY400NS;
   sc = pio_inbyte( CB_SC );
   sn = pio_inbyte( CB_SN );
   if ( ( sc == 0x01 ) && ( sn == 0x01 ) )
   {
      reg_config_info[0] = REG_CONFIG_TYPE_UNKN;
      cl = pio_inbyte( CB_CL );
      ch = pio_inbyte( CB_CH );
      st = pio_inbyte( CB_STAT );
      if ( ( cl == 0x14 ) && ( ch == 0xeb ) )
         reg_config_info[0] = REG_CONFIG_TYPE_ATAPI;
      else
         if ( ( cl == 0x00 ) && ( ch == 0x00 ) && ( st != 0x00 ) )
            reg_config_info[0] = REG_CONFIG_TYPE_ATA;
   }

   // lets check device 1 again, is device 1 really there?
   // is it ATA or ATAPI?

   pio_outbyte( CB_DH, CB_DH_DEV1 );
   DELAY400NS;
   sc = pio_inbyte( CB_SC );
   sn = pio_inbyte( CB_SN );
   if ( ( sc == 0x01 ) && ( sn == 0x01 ) )
   {
      reg_config_info[1] = REG_CONFIG_TYPE_UNKN;
      cl = pio_inbyte( CB_CL );
      ch = pio_inbyte( CB_CH );
      st = pio_inbyte( CB_STAT );
      if ( ( cl == 0x14 ) && ( ch == 0xeb ) )
         reg_config_info[1] = REG_CONFIG_TYPE_ATAPI;
      else
         if ( ( cl == 0x00 ) && ( ch == 0x00 ) && ( st != 0x00 ) )
            reg_config_info[1] = REG_CONFIG_TYPE_ATA;
   }

   // If possible, select a device that exists, try device 0 first.

   if ( reg_config_info[1] != REG_CONFIG_TYPE_NONE )
   {
      pio_outbyte( CB_DH, CB_DH_DEV1 );
      DELAY400NS;
      numDev ++ ;
   }
   if ( reg_config_info[0] != REG_CONFIG_TYPE_NONE )
   {
      pio_outbyte( CB_DH, CB_DH_DEV0 );
      DELAY400NS;
      numDev ++ ;
   }

   // mark end of config in low level trace

   trc_llt( 0, 0, TRC_LLT_E_CFG );

   // return the number of devices found

   return numDev;
}

//*************************************************************
//
// reg_reset() - Execute a Software Reset.
//
// See ATA-2 Section 9.2, ATA-3 Section 9.2, ATA-4 Section 8.3.
//
//*************************************************************

int reg_reset( int skipFlag, int devRtrn )

{
   unsigned char sc;
   unsigned char sn;
   unsigned char status;
   unsigned char devCtrl;

   // setup register values

   devCtrl = CB_DC_HD15 | ( int_use_intr_flag ? 0 : CB_DC_NIEN );

   // mark start of reset in low level trace

   trc_llt( 0, 0, TRC_LLT_S_RST );

   // Reset error return data.

   sub_zero_return_data();
   reg_cmd_info.flg = TRC_FLAG_SRST;
   reg_cmd_info.ct  = TRC_TYPE_ASR;

   // initialize the command timeout counter

   tmr_set_timeout();

   // Set and then reset the soft reset bit in the Device Control
   // register.  This causes device 0 be selected.

   if ( ! skipFlag )
   {
      pio_outbyte( CB_DC, devCtrl | CB_DC_SRST );
      DELAY400NS;
      pio_outbyte( CB_DC, devCtrl );
      DELAY400NS;
   }

   // If there is a device 0, wait for device 0 to set BSY=0.

   if ( reg_config_info[0] != REG_CONFIG_TYPE_NONE )
   {
      sub_atapi_delay( 0 );
      trc_llt( 0, 0, TRC_LLT_PNBSY );
      while ( 1 )
      {
         status = pio_inbyte( CB_STAT );
         if ( ( status & CB_STAT_BSY ) == 0 )
            break;
         if ( tmr_chk_timeout() )
         {
            trc_llt( 0, 0, TRC_LLT_TOUT );
            reg_cmd_info.to = 1;
            reg_cmd_info.ec = 1;
            trc_llt( 0, reg_cmd_info.ec, TRC_LLT_ERROR );
            break;
         }
      }
   }

   // If there is a device 1, wait until device 1 allows
   // register access.

   if ( reg_config_info[1] != REG_CONFIG_TYPE_NONE )
   {
      sub_atapi_delay( 1 );
      trc_llt( 0, 0, TRC_LLT_PNBSY );
      while ( 1 )
      {
         pio_outbyte( CB_DH, CB_DH_DEV1 );
         DELAY400NS;
         sc = pio_inbyte( CB_SC );
         sn = pio_inbyte( CB_SN );
         if ( ( sc == 0x01 ) && ( sn == 0x01 ) )
            break;
         if ( tmr_chk_timeout() )
         {
            trc_llt( 0, 0, TRC_LLT_TOUT );
            reg_cmd_info.to = 1;
            reg_cmd_info.ec = 2;
            trc_llt( 0, reg_cmd_info.ec, TRC_LLT_ERROR );
            break;
         }
      }

      // Now check if drive 1 set BSY=0.

      if ( reg_cmd_info.ec == 0 )
      {
         if ( pio_inbyte( CB_STAT ) & CB_STAT_BSY )
         {
            reg_cmd_info.ec = 3;
            trc_llt( 0, reg_cmd_info.ec, TRC_LLT_ERROR );
         }
      }
   }

   // RESET_DONE:

   // We are done but now we must select the device the caller
   // requested before we trace the command.  This will cause
   // the correct data to be returned in reg_cmd_info.

   pio_outbyte( CB_DH, devRtrn ? CB_DH_DEV1 : CB_DH_DEV0 );
   DELAY400NS;
   sub_trace_command();

   // If possible, select a device that exists,
   // try device 0 first.

   if ( reg_config_info[1] != REG_CONFIG_TYPE_NONE )
   {
      pio_outbyte( CB_DH, CB_DH_DEV1 );
      DELAY400NS;
   }
   if ( reg_config_info[0] != REG_CONFIG_TYPE_NONE )
   {
      pio_outbyte( CB_DH, CB_DH_DEV0 );
      DELAY400NS;
   }

   // mark end of reset in low level trace

   trc_llt( 0, 0, TRC_LLT_E_RST );

   // All done.  The return values of this function are described in
   // ATAIO.H.

   if ( reg_cmd_info.ec )
      return 1;
   return 0;
}


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 18:30                                   ` Linus Torvalds
@ 2007-06-01 18:46                                     ` Dave Jones
  0 siblings, 0 replies; 53+ messages in thread
From: Dave Jones @ 2007-06-01 18:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff Garzik, Tejun Heo, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox

On Fri, Jun 01, 2007 at 11:30:49AM -0700, Linus Torvalds wrote:

 > > There are no old drivers in F7 and beyond.
 > > # CONFIG_IDE is not set
 > 
 > Ahh, that's certainly going to root out the issues. Now let's hope that 
 > people install it..

Whilst it's too early to tell yet, there are a few really nasty bugs
that made it into our final kernel (based on 2.6.21.3).
For eg, a whole class of Dell laptops won't boot unless you boot
with maxcpus=1  I root-caused this to one of tglx's patches that went
into 2.6.21.2, but it's a head-scratcher as to why it's responsible.
I'm praying that this is the worst of the 'cant install' bugs we get,
and then I can just push out a 2.6.22 some time that will magically
make all these problems go away. (Because .22 is going to be flawless right?)

 > (Fedora seems to make it hard on purpose to upgrade between major 
 > revisions, but maybe that's just me not reading the docs as usual ;)

Probably :-)   I've done two successful upgrades yesterday, one with
anaconda, and one using just rpm -U of the fedora-release rpm from F7
and then 'yum update'.  (The former did go a little smoother though tbh).

	Dave

-- 
http://www.codemonkey.org.uk

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 18:41                                 ` Jeff Garzik
@ 2007-06-01 18:48                                   ` Jeff Garzik
  2007-06-02  7:50                                   ` Tejun Heo
  1 sibling, 0 replies; 53+ messages in thread
From: Jeff Garzik @ 2007-06-01 18:48 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linus Torvalds, Tejun Heo, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Alan Cox

Jeff Garzik wrote:
> Upon quick review, it appears it should be 110ms.  And actually the spec 
> says 3ms.  But to answer your question anyway... :)


Actually the 3ms is the DRQ assertion delay.

	Jeff



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Linux v2.6.22-rc3
  2007-06-01 18:41                                 ` Jeff Garzik
  2007-06-01 18:48                                   ` Jeff Garzik
@ 2007-06-02  7:50                                   ` Tejun Heo
  1 sibling, 0 replies; 53+ messages in thread
From: Tejun Heo @ 2007-06-02  7:50 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linus Torvalds, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox

Hello, Jeff, Linus.

Jeff Garzik wrote:
> Linus Torvalds wrote:
>>
>> On Fri, 1 Jun 2007, Jeff Garzik wrote:
>>> With these old PATA devices, device reset is "six of one, half-dozen
>>> of the
>>> other."  Using SRST is the only way to kick some ATAPI devices into
>>> working:
>>> http://suif.stanford.edu/~csapuntz/blackmagic.html#reset
>>
>> Well, wouldn't it be a good thing to
>>  1) if BUSY/DRQ is set even before you try the problem, obviously skip
>> the     two polite cases, and go to #4
>>  2) try to just do an IDENTIFY  3) if that doesn't work, do a HOST
>> RESET and then try again
>>  4) if that doesn't work, do the full SRST
>>
>> (or some variation of the above).
> 
> Skipping reset means it doesn't get the device away from a state that
> the previous boot may have configured itself to... standard "I didn't do
> reset" problems you see with any hardware.  Transfer modes and
> removeable media status notification are the most notable that are left
> in a semi-random state, but there are many other minor feature bits that
> fall into this category as well.

libata configures most of the stuff, so I don't think we'll see big
surprises even if we skip SRST during probing but I agree that it's nice
to give good kicks in the devices' asses during probing.

We can try IDENTIFY/IDENTIFY_PACKET with short timeout first and then
issue reset if the device isn't in reset blacklist, but it would make
probe sequence....

 IDENTIFY -> reset -> IDENTIFY -> configure -> IDENTIFY for reval

Which doesn't seem too attractive.  We can use the result from the first
IDENTIFY for configuration but it's probably a good idea to re-read
IDENTIFY page after reset.

It would be best if we can handle these braindead SRST-impaired devices
in the common code, if that's not feasible, we should at least provide
some option to allow correct (without timeout) detection of these devices.

Thanks.

-- 
tejun


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH] Re: Linux v2.6.22-rc3
  2007-05-27 15:01 ` Linux v2.6.22-rc3 Gregor Jasny
  2007-05-27 15:06   ` Jeff Garzik
@ 2007-06-02 16:11   ` Jeff Garzik
  2007-06-03 17:46     ` Gregor Jasny
  1 sibling, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-02 16:11 UTC (permalink / raw)
  To: Gregor Jasny
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ide, Alan Cox

[-- Attachment #1: Type: text/plain, Size: 464 bytes --]

Gregor Jasny wrote:
> Hi,
> 
> 2007/5/26, Linus Torvalds <torvalds@linux-foundation.org>:
>> What more could you possibly want? Some ATA updates? USB suspend problem
> 
> 22-rc3 broke the CDROM in my Dell notebook.  After I've switched to
> libata som time ago, I've got some delays/timeouts during boot [1].
> But the drive works as expected. With 2.6.22-rc3 I've got the
> following messages during bootup:

Does this patch change the behavior at all?

	Jeff




[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 11463 bytes --]

diff --git a/drivers/ata/ata_piix.c b/drivers/ata/ata_piix.c
index 9c07b88..8b58597 100644
--- a/drivers/ata/ata_piix.c
+++ b/drivers/ata/ata_piix.c
@@ -288,6 +288,7 @@ static const struct ata_port_operations piix_pata_ops = {
 	.exec_command		= ata_exec_command,
 	.dev_select		= ata_std_dev_select,
 
+	.cable_detect		= ata_cable_40wire,
 	.bmdma_setup		= ata_bmdma_setup,
 	.bmdma_start		= ata_bmdma_start,
 	.bmdma_stop		= ata_bmdma_stop,
@@ -300,7 +301,6 @@ static const struct ata_port_operations piix_pata_ops = {
 	.thaw			= ata_bmdma_thaw,
 	.error_handler		= piix_pata_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
-	.cable_detect		= ata_cable_40wire,
 
 	.irq_handler		= ata_interrupt,
 	.irq_clear		= ata_bmdma_irq_clear,
@@ -322,6 +322,7 @@ static const struct ata_port_operations ich_pata_ops = {
 	.exec_command		= ata_exec_command,
 	.dev_select		= ata_std_dev_select,
 
+	.cable_detect		= ich_pata_cable_detect,
 	.bmdma_setup		= ata_bmdma_setup,
 	.bmdma_start		= ata_bmdma_start,
 	.bmdma_stop		= ata_bmdma_stop,
@@ -334,7 +335,6 @@ static const struct ata_port_operations ich_pata_ops = {
 	.thaw			= ata_bmdma_thaw,
 	.error_handler		= piix_pata_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
-	.cable_detect		= ich_pata_cable_detect,
 
 	.irq_handler		= ata_interrupt,
 	.irq_clear		= ata_bmdma_irq_clear,
@@ -353,6 +353,7 @@ static const struct ata_port_operations piix_sata_ops = {
 	.exec_command		= ata_exec_command,
 	.dev_select		= ata_std_dev_select,
 
+	.cable_detect		= ata_cable_sata,
 	.bmdma_setup		= ata_bmdma_setup,
 	.bmdma_start		= ata_bmdma_start,
 	.bmdma_stop		= ata_bmdma_stop,
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 3ca9c61..dbd590a 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3035,7 +3035,7 @@ int ata_wait_ready(struct ata_port *ap, unsigned long deadline)
 			warned = 1;
 		}
 
-		msleep(50);
+		msleep(5);
 	}
 }
 
@@ -3072,7 +3072,7 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask,
 			break;
 		if (time_after(jiffies, deadline))
 			return -EBUSY;
-		msleep(50);	/* give drive a breather */
+		msleep(5);	/* give drive a breather */
 	}
 	if (dev1) {
 		rc = ata_wait_ready(ap, deadline);
@@ -3101,23 +3101,13 @@ static int ata_bus_softreset(struct ata_port *ap, unsigned int devmask,
 	DPRINTK("ata%u: bus reset via SRST\n", ap->print_id);
 
 	/* software reset.  causes dev0 to be selected */
-	iowrite8(ap->ctl, ioaddr->ctl_addr);
-	udelay(20);	/* FIXME: flush */
 	iowrite8(ap->ctl | ATA_SRST, ioaddr->ctl_addr);
-	udelay(20);	/* FIXME: flush */
+	ata_pause(ap);
 	iowrite8(ap->ctl, ioaddr->ctl_addr);
+	ata_pause(ap);
 
-	/* spec mandates ">= 2ms" before checking status.
-	 * We wait 150ms, because that was the magic delay used for
-	 * ATAPI devices in Hale Landis's ATADRVR, for the period of time
-	 * between when the ATA command register is written, and then
-	 * status is checked.  Because waiting for "a while" before
-	 * checking status is fine, post SRST, we perform this magic
-	 * delay here as well.
-	 *
-	 * Old drivers/ide uses the 2mS rule and then waits for ready
-	 */
-	msleep(150);
+	/* spec mandates ">= 2ms" before checking status */
+	msleep(2);
 
 	/* Before we perform post reset processing we want to see if
 	 * the bus shows 0xFF because the odd clown forgets the D7
@@ -3363,6 +3353,35 @@ int ata_std_prereset(struct ata_port *ap, unsigned long deadline)
 					"link for reset (errno=%d)\n", rc);
 	}
 
+	return 0;
+}
+
+/**
+ *	sata_std_prereset - prepare for reset
+ *	@ap: ATA port to be reset
+ *	@deadline: deadline jiffies for the operation
+ *
+ *	@ap is about to be reset.  Initialize it.  Failure from
+ *	prereset makes libata abort whole reset sequence and give up
+ *	that port, so prereset should be best-effort.  It does its
+ *	best to prepare for reset sequence but if things go wrong, it
+ *	should just whine, not fail.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep)
+ *
+ *	RETURNS:
+ *	0 on success, -errno otherwise.
+ */
+int sata_std_prereset(struct ata_port *ap, unsigned long deadline)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+	int rc;
+
+	rc = ata_std_prereset(ap, deadline);
+	if (rc)
+		return rc;
+
 	/* Wait for !BSY if the controller can wait for the first D2H
 	 * Reg FIS and we don't know that no device is attached.
 	 */
@@ -3404,9 +3423,13 @@ int ata_std_softreset(struct ata_port *ap, unsigned int *classes,
 
 	if (ata_port_offline(ap)) {
 		classes[0] = ATA_DEV_NONE;
+		classes[1] = ATA_DEV_NONE;
 		goto out;
 	}
 
+	/* set up Device Control */
+	iowrite8(ap->ctl, ap->ioaddr.ctl_addr);
+
 	/* determine if device 0/1 are present */
 	if (ata_devchk(ap, 0))
 		devmask |= (1 << 0);
@@ -3419,6 +3442,7 @@ int ata_std_softreset(struct ata_port *ap, unsigned int *classes,
 	/* issue bus reset */
 	DPRINTK("about to softreset, devmask=%x\n", devmask);
 	rc = ata_bus_softreset(ap, devmask, deadline);
+
 	/* if link is occupied, -ENODEV too is an error */
 	if (rc && (rc != -ENODEV || sata_scr_valid(ap))) {
 		ata_port_printk(ap, KERN_ERR, "SRST failed (errno=%d)\n", rc);
@@ -3534,7 +3558,7 @@ int sata_std_hardreset(struct ata_port *ap, unsigned int *class,
 	}
 
 	/* wait a while before checking status, see SRST for more info */
-	msleep(150);
+	msleep(3);
 
 	rc = ata_wait_ready(ap, deadline);
 	/* link occupied, -ENODEV too is an error */
@@ -6837,6 +6861,7 @@ EXPORT_SYMBOL_GPL(ata_bmdma_freeze);
 EXPORT_SYMBOL_GPL(ata_bmdma_thaw);
 EXPORT_SYMBOL_GPL(ata_bmdma_drive_eh);
 EXPORT_SYMBOL_GPL(ata_bmdma_error_handler);
+EXPORT_SYMBOL_GPL(sata_bmdma_error_handler);
 EXPORT_SYMBOL_GPL(ata_bmdma_post_internal_cmd);
 EXPORT_SYMBOL_GPL(ata_port_probe);
 EXPORT_SYMBOL_GPL(ata_dev_disable);
@@ -6847,6 +6872,7 @@ EXPORT_SYMBOL_GPL(sata_phy_reset);
 EXPORT_SYMBOL_GPL(__sata_phy_reset);
 EXPORT_SYMBOL_GPL(ata_bus_reset);
 EXPORT_SYMBOL_GPL(ata_std_prereset);
+EXPORT_SYMBOL_GPL(sata_std_prereset);
 EXPORT_SYMBOL_GPL(ata_std_softreset);
 EXPORT_SYMBOL_GPL(sata_port_hardreset);
 EXPORT_SYMBOL_GPL(sata_std_hardreset);
diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
index e35d134..3233fe9 100644
--- a/drivers/ata/libata-sff.c
+++ b/drivers/ata/libata-sff.c
@@ -503,6 +503,27 @@ void ata_bmdma_error_handler(struct ata_port *ap)
 }
 
 /**
+ *	sata_bmdma_error_handler - Stock SATA error handler for BMDMA controller
+ *	@ap: port to handle error for
+ *
+ *	Stock error handler for BMDMA controller.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep)
+ */
+void sata_bmdma_error_handler(struct ata_port *ap)
+{
+	ata_reset_fn_t hardreset;
+
+	hardreset = NULL;
+	if (sata_scr_valid(ap))
+		hardreset = sata_std_hardreset;
+
+	ata_bmdma_drive_eh(ap, sata_std_prereset, ata_std_softreset, hardreset,
+			   ata_std_postreset);
+}
+
+/**
  *	ata_bmdma_post_internal_cmd - Stock post_internal_cmd for
  *				      BMDMA controller
  *	@qc: internal command to clean up
diff --git a/drivers/ata/sata_sil.c b/drivers/ata/sata_sil.c
index a3b339b..feac2c3 100644
--- a/drivers/ata/sata_sil.c
+++ b/drivers/ata/sata_sil.c
@@ -202,7 +202,7 @@ static const struct ata_port_operations sil_ops = {
 	.data_xfer		= ata_data_xfer,
 	.freeze			= sil_freeze,
 	.thaw			= sil_thaw,
-	.error_handler		= ata_bmdma_error_handler,
+	.error_handler		= sata_bmdma_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 	.irq_clear		= ata_bmdma_irq_clear,
 	.irq_on			= ata_irq_on,
diff --git a/drivers/ata/sata_sis.c b/drivers/ata/sata_sis.c
index 221099d..852e1e0 100644
--- a/drivers/ata/sata_sis.c
+++ b/drivers/ata/sata_sis.c
@@ -119,7 +119,7 @@ static const struct ata_port_operations sis_ops = {
 	.data_xfer		= ata_data_xfer,
 	.freeze			= ata_bmdma_freeze,
 	.thaw			= ata_bmdma_thaw,
-	.error_handler		= ata_bmdma_error_handler,
+	.error_handler		= sata_bmdma_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 	.irq_clear		= ata_bmdma_irq_clear,
 	.irq_on			= ata_irq_on,
diff --git a/drivers/ata/sata_svw.c b/drivers/ata/sata_svw.c
index bcb2cd8..362ad0b 100644
--- a/drivers/ata/sata_svw.c
+++ b/drivers/ata/sata_svw.c
@@ -343,7 +343,7 @@ static const struct ata_port_operations k2_sata_ops = {
 	.data_xfer		= ata_data_xfer,
 	.freeze			= ata_bmdma_freeze,
 	.thaw			= ata_bmdma_thaw,
-	.error_handler		= ata_bmdma_error_handler,
+	.error_handler		= sata_bmdma_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 	.irq_clear		= ata_bmdma_irq_clear,
 	.irq_on			= ata_irq_on,
diff --git a/drivers/ata/sata_uli.c b/drivers/ata/sata_uli.c
index 6815de7..29c7be7 100644
--- a/drivers/ata/sata_uli.c
+++ b/drivers/ata/sata_uli.c
@@ -112,7 +112,7 @@ static const struct ata_port_operations uli_ops = {
 
 	.freeze			= ata_bmdma_freeze,
 	.thaw			= ata_bmdma_thaw,
-	.error_handler		= ata_bmdma_error_handler,
+	.error_handler		= sata_bmdma_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 
 	.irq_clear		= ata_bmdma_irq_clear,
diff --git a/drivers/ata/sata_via.c b/drivers/ata/sata_via.c
index e8b90e7..1a26aea 100644
--- a/drivers/ata/sata_via.c
+++ b/drivers/ata/sata_via.c
@@ -174,7 +174,7 @@ static const struct ata_port_operations vt6421_pata_ops = {
 
 	.freeze			= ata_bmdma_freeze,
 	.thaw			= ata_bmdma_thaw,
-	.error_handler		= ata_bmdma_error_handler,
+	.error_handler		= sata_bmdma_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 	.cable_detect		= vt6421_pata_cable_detect,
 
@@ -205,7 +205,7 @@ static const struct ata_port_operations vt6421_sata_ops = {
 
 	.freeze			= ata_bmdma_freeze,
 	.thaw			= ata_bmdma_thaw,
-	.error_handler		= ata_bmdma_error_handler,
+	.error_handler		= sata_bmdma_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 	.cable_detect		= ata_cable_sata,
 
diff --git a/drivers/ata/sata_vsc.c b/drivers/ata/sata_vsc.c
index 8133017..177552b 100644
--- a/drivers/ata/sata_vsc.c
+++ b/drivers/ata/sata_vsc.c
@@ -331,7 +331,7 @@ static const struct ata_port_operations vsc_sata_ops = {
 	.data_xfer		= ata_data_xfer,
 	.freeze			= vsc_freeze,
 	.thaw			= vsc_thaw,
-	.error_handler		= ata_bmdma_error_handler,
+	.error_handler		= sata_bmdma_error_handler,
 	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 	.irq_clear		= ata_bmdma_irq_clear,
 	.irq_on			= ata_irq_on,
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 85f7b1b..5f75d45 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -674,6 +674,7 @@ extern int sata_phy_debounce(struct ata_port *ap, const unsigned long *param,
 extern int sata_phy_resume(struct ata_port *ap, const unsigned long *param,
 			   unsigned long deadline);
 extern int ata_std_prereset(struct ata_port *ap, unsigned long deadline);
+extern int sata_std_prereset(struct ata_port *ap, unsigned long deadline);
 extern int ata_std_softreset(struct ata_port *ap, unsigned int *classes,
 			     unsigned long deadline);
 extern int sata_port_hardreset(struct ata_port *ap, const unsigned long *timing,
@@ -786,6 +787,7 @@ extern void ata_bmdma_drive_eh(struct ata_port *ap, ata_prereset_fn_t prereset,
 			       ata_reset_fn_t hardreset,
 			       ata_postreset_fn_t postreset);
 extern void ata_bmdma_error_handler(struct ata_port *ap);
+extern void sata_bmdma_error_handler(struct ata_port *ap);
 extern void ata_bmdma_post_internal_cmd(struct ata_queued_cmd *qc);
 extern int ata_hsm_move(struct ata_port *ap, struct ata_queued_cmd *qc,
 			u8 status, int in_wq);

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-02 16:11   ` [PATCH] " Jeff Garzik
@ 2007-06-03 17:46     ` Gregor Jasny
  2007-06-06  8:46       ` Tejun Heo
  0 siblings, 1 reply; 53+ messages in thread
From: Gregor Jasny @ 2007-06-03 17:46 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linux Kernel Mailing List, Linus Torvalds, linux-ide, Alan Cox

2007/6/2, Jeff Garzik <jeff@garzik.org>:
> Does this patch change the behavior at all?

No. It still times out. I've raised the first timeout to 60 seconds
but still no luck.

[   19.403632] ata_piix 0000:00:07.1: version 2.11
[   19.404013] scsi0 : ata_piix
[   19.404482] scsi1 : ata_piix
[   19.404836] ata1: PATA max UDMA/33 cmd 0x000101f0 ctl 0x000103f6
bmdma 0x00010860 irq 14
[   19.405443] ata2: PATA max UDMA/33 cmd 0x00010170 ctl 0x00010376
bmdma 0x00010868 irq 15
[    0.663333] Marking TSC unstable due to: possible TSC halt in C2.
[    0.666666] Time: acpi_pm clocksource has been installed.
[    0.666666] Switched to NOHz mode on CPU #0
[    0.716666] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.716666] ata1.00: ATA-7: SAMSUNG MP0402H, UC100-14, max UDMA/100
[    0.733333] ata1.00: 78242976 sectors, multi 8: LBA48
[    0.753333] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.766666] ata1.00: configured for UDMA/33
[    1.379999] Clocksource tsc unstable (delta = -498119555 ns)
[   10.793332] ata2: SRST failed (errno=-16)
[   20.813331] ata2: SRST failed (errno=-16)
[   55.833327] ata2: SRST failed (errno=-16)
[   60.853327] ata2: SRST failed (errno=-16)
[   60.866660] ata2: reset failed, giving up
[   60.879993] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG
MP0402H  UC10 PQ: 0 ANSI: 5

Cheers,
Gregor

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-03 17:46     ` Gregor Jasny
@ 2007-06-06  8:46       ` Tejun Heo
  2007-06-07  6:22         ` Gregor Jasny
  0 siblings, 1 reply; 53+ messages in thread
From: Tejun Heo @ 2007-06-06  8:46 UTC (permalink / raw)
  To: Gregor Jasny
  Cc: Jeff Garzik, Linux Kernel Mailing List, Linus Torvalds, linux-ide,
	Alan Cox

[-- Attachment #1: Type: text/plain, Size: 318 bytes --]

Gregor Jasny wrote:
> 2007/6/2, Jeff Garzik <jeff@garzik.org>:
>> Does this patch change the behavior at all?
> 
> No. It still times out. I've raised the first timeout to 60 seconds
> but still no luck.

Let's see where we're failing.  Please apply the attached patch and
report what kernel says.

Thanks.

-- 
tejun

[-- Attachment #2: srst-debug.patch --]
[-- Type: text/x-patch, Size: 1363 bytes --]

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4733f00..ae6f177 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3053,8 +3053,11 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask,
 	if (dev0) {
 		rc = ata_wait_ready(ap, deadline);
 		if (rc) {
-			if (rc != -ENODEV)
+			if (rc != -ENODEV) {
+				ata_port_printk(ap, KERN_WARNING,
+						"ata_bus_post_reset: EXIT0\n");
 				return rc;
+			}
 			ret = rc;
 		}
 	}
@@ -3070,15 +3073,21 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask,
 		lbal = ioread8(ioaddr->lbal_addr);
 		if ((nsect == 1) && (lbal == 1))
 			break;
-		if (time_after(jiffies, deadline))
+		if (time_after(jiffies, deadline)) {
+			ata_port_printk(ap, KERN_WARNING,
+					"ata_bus_post_reset: EXIT1\n");
 			return -EBUSY;
+		}
 		msleep(50);	/* give drive a breather */
 	}
 	if (dev1) {
 		rc = ata_wait_ready(ap, deadline);
 		if (rc) {
-			if (rc != -ENODEV)
+			if (rc != -ENODEV) {
+				ata_port_printk(ap, KERN_WARNING,
+						"ata_bus_post_reset: EXIT2\n");
 				return rc;
+			}
 			ret = rc;
 		}
 	}
@@ -3090,6 +3099,7 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask,
 	if (dev0)
 		ap->ops->dev_select(ap, 0);
 
+	ata_port_printk(ap, KERN_WARNING, "ata_bus_post_reset: EXIT3\n");
 	return ret;
 }
 

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-06  8:46       ` Tejun Heo
@ 2007-06-07  6:22         ` Gregor Jasny
  2007-06-07  7:27           ` Tejun Heo
  0 siblings, 1 reply; 53+ messages in thread
From: Gregor Jasny @ 2007-06-07  6:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, Linux Kernel Mailing List, Linus Torvalds, linux-ide,
	Alan Cox

2007/6/6, Tejun Heo <htejun@gmail.com>:
> Let's see where we're failing.

[  186.849280] ata_piix 0000:00:07.1: version 2.11
[  186.849665] scsi0 : ata_piix
[  186.850241] scsi1 : ata_piix
[  186.850596] ata1: PATA max UDMA/33 cmd 0x000101f0 ctl 0x000103f6
bmdma 0x00010860 irq 14
[  186.851203] ata2: PATA max UDMA/33 cmd 0x00010170 ctl 0x00010376
bmdma 0x00010868 irq 15
[    0.656666] Marking TSC unstable due to: possible TSC halt in C2.
[    0.659999] Time: acpi_pm clocksource has been installed.
[    0.659999] Switched to NOHz mode on CPU #0
[    0.659999] ata1: ata_bus_post_reset: EXIT3
[    0.709999] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.723333] ata1.00: ATA-7: SAMSUNG MP0402H, UC100-14, max UDMA/100
[    0.739999] ata1.00: 78242976 sectors, multi 8: LBA48
[    0.763333] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.776666] ata1.00: configured for UDMA/33
[    1.376666] Clocksource tsc unstable (delta = -498091579 ns)
[   10.799998] ata2: ata_bus_post_reset: EXIT1
[   10.813332] ata2: SRST failed (errno=-16)
[   20.836664] ata2: ata_bus_post_reset: EXIT1
[   20.849997] ata2: SRST failed (errno=-16)
[   55.869994] ata2: ata_bus_post_reset: EXIT1
[   55.883327] ata2: SRST failed (errno=-16)
[   60.903327] ata2: ata_bus_post_reset: EXIT1
[   60.916660] ata2: SRST failed (errno=-16)
[   60.929993] ata2: reset failed, giving up
[   60.946660] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG
MP0402H  UC10 PQ: 0 ANSI: 5

Thanks,
Gregor

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07  6:22         ` Gregor Jasny
@ 2007-06-07  7:27           ` Tejun Heo
  2007-06-07 20:37             ` Gregor Jasny
                               ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Tejun Heo @ 2007-06-07  7:27 UTC (permalink / raw)
  To: Gregor Jasny
  Cc: Jeff Garzik, Linux Kernel Mailing List, Linus Torvalds, linux-ide,
	Alan Cox

[-- Attachment #1: Type: text/plain, Size: 1777 bytes --]

Gregor Jasny wrote:
> 2007/6/6, Tejun Heo <htejun@gmail.com>:
>> Let's see where we're failing.
> 
> [  186.849280] ata_piix 0000:00:07.1: version 2.11
> [  186.849665] scsi0 : ata_piix
> [  186.850241] scsi1 : ata_piix
> [  186.850596] ata1: PATA max UDMA/33 cmd 0x000101f0 ctl 0x000103f6
> bmdma 0x00010860 irq 14
> [  186.851203] ata2: PATA max UDMA/33 cmd 0x00010170 ctl 0x00010376
> bmdma 0x00010868 irq 15
> [    0.656666] Marking TSC unstable due to: possible TSC halt in C2.
> [    0.659999] Time: acpi_pm clocksource has been installed.
> [    0.659999] Switched to NOHz mode on CPU #0
> [    0.659999] ata1: ata_bus_post_reset: EXIT3
> [    0.709999] ata1.00: ata_hpa_resize 1: sectors = 78242976,
> hpa_sectors = 78242976
> [    0.723333] ata1.00: ATA-7: SAMSUNG MP0402H, UC100-14, max UDMA/100
> [    0.739999] ata1.00: 78242976 sectors, multi 8: LBA48
> [    0.763333] ata1.00: ata_hpa_resize 1: sectors = 78242976,
> hpa_sectors = 78242976
> [    0.776666] ata1.00: configured for UDMA/33
> [    1.376666] Clocksource tsc unstable (delta = -498091579 ns)
> [   10.799998] ata2: ata_bus_post_reset: EXIT1
> [   10.813332] ata2: SRST failed (errno=-16)
> [   20.836664] ata2: ata_bus_post_reset: EXIT1
> [   20.849997] ata2: SRST failed (errno=-16)
> [   55.869994] ata2: ata_bus_post_reset: EXIT1
> [   55.883327] ata2: SRST failed (errno=-16)
> [   60.903327] ata2: ata_bus_post_reset: EXIT1
> [   60.916660] ata2: SRST failed (errno=-16)
> [   60.929993] ata2: reset failed, giving up
> [   60.946660] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG
> MP0402H  UC10 PQ: 0 ANSI: 5

Ah.. okay.  Now I see what's going on.  Jeff, this is another device
which doesn't set nsect and lbal to 1 after reset.  Gregor, please try
the attached patch.

Thanks.

-- 
tejun

[-- Attachment #2: libata-dont-test-slave-register-readiness-after-srst.patch --]
[-- Type: text/x-patch, Size: 1161 bytes --]

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4733f00..bac5e1f 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3042,7 +3042,6 @@ int ata_wait_ready(struct ata_port *ap, unsigned long deadline)
 static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask,
 			      unsigned long deadline)
 {
-	struct ata_ioports *ioaddr = &ap->ioaddr;
 	unsigned int dev0 = devmask & (1 << 0);
 	unsigned int dev1 = devmask & (1 << 1);
 	int rc, ret = 0;
@@ -3059,22 +3058,9 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask,
 		}
 	}
 
-	/* if device 1 was found in ata_devchk, wait for
-	 * register access, then wait for BSY to clear
-	 */
-	while (dev1) {
-		u8 nsect, lbal;
-
-		ap->ops->dev_select(ap, 1);
-		nsect = ioread8(ioaddr->nsect_addr);
-		lbal = ioread8(ioaddr->lbal_addr);
-		if ((nsect == 1) && (lbal == 1))
-			break;
-		if (time_after(jiffies, deadline))
-			return -EBUSY;
-		msleep(50);	/* give drive a breather */
-	}
+	/* wait for device 1 */
 	if (dev1) {
+		ap->ops->dev_select(ap, 1);
 		rc = ata_wait_ready(ap, deadline);
 		if (rc) {
 			if (rc != -ENODEV)

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07  7:27           ` Tejun Heo
@ 2007-06-07 20:37             ` Gregor Jasny
  2007-06-07 20:56             ` Linus Torvalds
  2007-06-08 15:55             ` [PATCH] Re: Linux v2.6.22-rc3 Jeff Garzik
  2 siblings, 0 replies; 53+ messages in thread
From: Gregor Jasny @ 2007-06-07 20:37 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, Linux Kernel Mailing List, Linus Torvalds, linux-ide,
	Alan Cox

2007/6/7, Tejun Heo <htejun@gmail.com>:
> Ah.. okay.  Now I see what's going on.  Jeff, this is another device
> which doesn't set nsect and lbal to 1 after reset.  Gregor, please try
> the attached patch.

It works like a charm.

Thanks for debugging.

Gregor


[  307.605884] ata_piix 0000:00:07.1: version 2.11
[  307.606268] scsi0 : ata_piix
[  307.606728] scsi1 : ata_piix
[  307.607080] ata1: PATA max UDMA/33 cmd 0x000101f0 ctl 0x000103f6
bmdma 0x00010860 irq 14
[  307.607790] ata2: PATA max UDMA/33 cmd 0x00010170 ctl 0x00010376
bmdma 0x00010868 irq 15
[    0.656666] Marking TSC unstable due to: possible TSC halt in C2.
[    0.659999] Time: acpi_pm clocksource has been installed.
[    0.659999] Switched to NOHz mode on CPU #0
[    0.703333] ata1: ata_bus_post_reset: EXIT3
[    0.709999] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.723333] ata1.00: ATA-7: SAMSUNG MP0402H, UC100-14, max UDMA/100
[    0.739999] ata1.00: 78242976 sectors, multi 8: LBA48
[    0.763333] ata1.00: ata_hpa_resize 1: sectors = 78242976,
hpa_sectors = 78242976
[    0.776666] ata1.00: configured for UDMA/33
[    0.799999] ata2: ata_bus_post_reset: EXIT3
[    0.973333] ata2.00: ATAPI, max UDMA/33
[    1.146666] ata2.00: configured for UDMA/33
[    1.159999] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG
MP0402H  UC10 PQ: 0 ANSI: 5
[...]
[    1.326666] scsi 1:0:0:0: CD-ROM            SAMSUNG  CD-ROM SN-124
  N102 PQ: 0

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07  7:27           ` Tejun Heo
  2007-06-07 20:37             ` Gregor Jasny
@ 2007-06-07 20:56             ` Linus Torvalds
  2007-06-07 22:39               ` Alan Cox
  2007-06-07 22:47               ` Jeff Garzik
  2007-06-08 15:55             ` [PATCH] Re: Linux v2.6.22-rc3 Jeff Garzik
  2 siblings, 2 replies; 53+ messages in thread
From: Linus Torvalds @ 2007-06-07 20:56 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Gregor Jasny, Jeff Garzik, Linux Kernel Mailing List, linux-ide,
	Alan Cox, Andrew Morton



On Thu, 7 Jun 2007, Tejun Heo wrote:
>
> Ah.. okay.  Now I see what's going on.  Jeff, this is another device
> which doesn't set nsect and lbal to 1 after reset.  Gregor, please try
> the attached patch.

Tejun, since Jeff is apparently traveling this week, and Gregor tested the 
patch successfully (and it looks sane anyway - why in Gods name _would_ we 
care what the initial setting of nsect/lbal is?), can you send this in 
with the changelog and sign-off?

Maybe it can go through Andrew like the other patch (the sata_promise 
irq/polling handling), but I can take it directly too. Whatever seems 
best.

		Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07 20:56             ` Linus Torvalds
@ 2007-06-07 22:39               ` Alan Cox
  2007-06-07 22:47               ` Jeff Garzik
  1 sibling, 0 replies; 53+ messages in thread
From: Alan Cox @ 2007-06-07 22:39 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tejun Heo, Gregor Jasny, Jeff Garzik, Linux Kernel Mailing List,
	linux-ide, Andrew Morton

On Thu, 7 Jun 2007 13:56:11 -0700 (PDT)
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> 
> 
> On Thu, 7 Jun 2007, Tejun Heo wrote:
> >
> > Ah.. okay.  Now I see what's going on.  Jeff, this is another device
> > which doesn't set nsect and lbal to 1 after reset.  Gregor, please try
> > the attached patch.
> 
> Tejun, since Jeff is apparently traveling this week, and Gregor tested the 
> patch successfully (and it looks sane anyway - why in Gods name _would_ we 
> care what the initial setting of nsect/lbal is?), can you send this in 

Its a sanity check, and if the vendors could get their chips right a good
one, but specs/toilet paper  usual story

> with the changelog and sign-off?
> 
> Maybe it can go through Andrew like the other patch (the sata_promise 
> irq/polling handling), but I can take it directly too. Whatever seems 
> best.

I'll certainly ACK it, and it also means I can drop a pending patch to
handle the IT8212 raid mode which also gets this wrong.

Alan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07 20:56             ` Linus Torvalds
  2007-06-07 22:39               ` Alan Cox
@ 2007-06-07 22:47               ` Jeff Garzik
  2007-06-08  8:02                 ` Tejun Heo
  2007-06-09 18:12                 ` Linus Torvalds
  1 sibling, 2 replies; 53+ messages in thread
From: Jeff Garzik @ 2007-06-07 22:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Tejun Heo, Gregor Jasny, Linux Kernel Mailing List, linux-ide,
	Alan Cox, Andrew Morton

On Thu, Jun 07, 2007 at 01:56:11PM -0700, Linus Torvalds wrote:
> On Thu, 7 Jun 2007, Tejun Heo wrote:
> > Ah.. okay.  Now I see what's going on.  Jeff, this is another device
> > which doesn't set nsect and lbal to 1 after reset.  Gregor, please try
> > the attached patch.

> Tejun, since Jeff is apparently traveling this week, and Gregor tested the 
> patch successfully (and it looks sane anyway - why in Gods name _would_ we 
> care what the initial setting of nsect/lbal is?), can you send this in 
> with the changelog and sign-off?

Ack'ing the sata_promise change was easy, but with this one it would
be nice to wait a bit before changing the core probe code that [now]
every ATA setup goes through, based on a single bug report.

The values assist in detecting ghost devices (same device appearing
on both master and slave) and TF register malfunctions, and I would
appreciate not breaking _that_ so late in 2.6.22-rc for a single
report.  Thankfully we have -some- ghost device prevention code
elsewhere, but this is part of it.

Fedora 7 reports are starting to come in, and those will help point
us in the right direction too.

I'll be home for a bit in 36 hours (->net driver fixes go then)
and for good on Tuesday, so I'm hoping you can wait until then.

	Jeff




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07 22:47               ` Jeff Garzik
@ 2007-06-08  8:02                 ` Tejun Heo
  2007-06-08 11:27                   ` Alan Cox
  2007-06-08 14:31                   ` Jeff Garzik
  2007-06-09 18:12                 ` Linus Torvalds
  1 sibling, 2 replies; 53+ messages in thread
From: Tejun Heo @ 2007-06-08  8:02 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linus Torvalds, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox, Andrew Morton

Hello,

Jeff Garzik wrote:
> On Thu, Jun 07, 2007 at 01:56:11PM -0700, Linus Torvalds wrote:
>> On Thu, 7 Jun 2007, Tejun Heo wrote:
>>> Ah.. okay.  Now I see what's going on.  Jeff, this is another device
>>> which doesn't set nsect and lbal to 1 after reset.  Gregor, please try
>>> the attached patch.
> 
>> Tejun, since Jeff is apparently traveling this week, and Gregor tested the 
>> patch successfully (and it looks sane anyway - why in Gods name _would_ we 
>> care what the initial setting of nsect/lbal is?), can you send this in 
>> with the changelog and sign-off?
> 
> Ack'ing the sata_promise change was easy, but with this one it would
> be nice to wait a bit before changing the core probe code that [now]
> every ATA setup goes through, based on a single bug report.

Yeap, I'm not so sure about the change either.  The posted patch is more
of proof-of-concept.

Currently we have three related reports.

* this one
* IT8212 raid mode Alan is working on
* (likely) pata_pcmcia http://thread.gmane.org/gmane.linux.kernel/530099

> The values assist in detecting ghost devices (same device appearing
> on both master and slave) and TF register malfunctions, and I would
> appreciate not breaking _that_ so late in 2.6.22-rc for a single
> report.  Thankfully we have -some- ghost device prevention code
> elsewhere, but this is part of it.

Upto 2.6.21, if the same condition triggers, it delays 30secs and just
continue, so I don't think it was a worthy protection against ghost
devices or TF malfunction.  The only protection it offers is preventing
libata from accessing slave's status register too early.  SRST sequence
looks like the following.

1. SRST raised, delay, and cleared

2. libata waits for master device readiness after 150ms pause.  After
finishing reset, if slave is present, master waits for PDIAG- assert.

3. slave asserts PDIAG-, master asserts readiness.  libata goes on to
check for the slave device nsect/lbal.

4. slave sets nsect/lbal, libata goes on to check readiness

5. slave asserts readiness, SRST complete.

So, the nsect/lbal check is meaningful iff 1. slave lies about device
readiness after releasing PDIAG- but before setting nsect/lbal, or 2.
master/slave register selection is flimsy.

I don't think the first one is probable considering BSY is supposed to
set when SRST is received.  This is pretty fundamental in the protocol
and necessary for the device to work properly as master, so I think this
is one of few things we can rely upon.

To me, the second one sounds pretty far fetched too but, well, it's ATA.
 Who knows?

How about limiting nsect/lbal wait duration?  Say, 100ms or 500ms?  That
can somewhat ease our paranoia and should show acceptable behavior for
braindead devices too.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08  8:02                 ` Tejun Heo
@ 2007-06-08 11:27                   ` Alan Cox
  2007-06-08 11:32                     ` Tejun Heo
  2007-06-08 14:31                   ` Jeff Garzik
  1 sibling, 1 reply; 53+ messages in thread
From: Alan Cox @ 2007-06-08 11:27 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

> Upto 2.6.21, if the same condition triggers, it delays 30secs and just
> continue, so I don't think it was a worthy protection against ghost
> devices or TF malfunction.  The only protection it offers is preventing
> libata from accessing slave's status register too early.  SRST sequence
> looks like the following.

I've seen no bug where it looked like it saved something, and the only
ghost device bugs I've seen it failed to detect anyway (hence the PCMCIA
drivers own ghost detection logic)


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 11:27                   ` Alan Cox
@ 2007-06-08 11:32                     ` Tejun Heo
  2007-06-08 11:40                       ` Alan Cox
  0 siblings, 1 reply; 53+ messages in thread
From: Tejun Heo @ 2007-06-08 11:32 UTC (permalink / raw)
  To: Alan Cox
  Cc: Jeff Garzik, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

Alan Cox wrote:
>> Upto 2.6.21, if the same condition triggers, it delays 30secs and just
>> continue, so I don't think it was a worthy protection against ghost
>> devices or TF malfunction.  The only protection it offers is preventing
>> libata from accessing slave's status register too early.  SRST sequence
>> looks like the following.
> 
> I've seen no bug where it looked like it saved something, and the only
> ghost device bugs I've seen it failed to detect anyway (hence the PCMCIA
> drivers own ghost detection logic)

If we still have several rc's left, how about just removing it and
watching the fireworks?  Jeff?

-- 
tejun

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 11:32                     ` Tejun Heo
@ 2007-06-08 11:40                       ` Alan Cox
  2007-06-08 14:28                         ` Jeff Garzik
  0 siblings, 1 reply; 53+ messages in thread
From: Alan Cox @ 2007-06-08 11:40 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Garzik, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

> If we still have several rc's left, how about just removing it and
> watching the fireworks?  Jeff?

Seems best to me - we know the current code breaks a load of systems and
the change should not break anything but fix them all. If we ship without
it being changed then a load of people are stuck with broken ATA.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 11:40                       ` Alan Cox
@ 2007-06-08 14:28                         ` Jeff Garzik
  2007-06-08 15:36                           ` Alan Cox
  0 siblings, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-08 14:28 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

On Fri, Jun 08, 2007 at 12:40:58PM +0100, Alan Cox wrote:
> Seems best to me - we know the current code breaks a load of systems and
> the change should not break anything but fix them all. If we ship without
> it being changed then a load of people are stuck with broken ATA.

So you have verified that Tejun's experimental change "fixes them all"?

Proof?  Bugzilla URLs pointing to users that have ack'd this fix?

	Jeff




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08  8:02                 ` Tejun Heo
  2007-06-08 11:27                   ` Alan Cox
@ 2007-06-08 14:31                   ` Jeff Garzik
  2007-06-08 15:38                     ` Alan Cox
  1 sibling, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-08 14:31 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox, Andrew Morton

On Fri, Jun 08, 2007 at 05:02:24PM +0900, Tejun Heo wrote:
> I don't think the first one is probable considering BSY is supposed to
> set when SRST is received.  This is pretty fundamental in the protocol
> and necessary for the device to work properly as master, so I think this
> is one of few things we can rely upon.

See a URL I posted earlier in this thread.  With dumb ATAPI devices we
actually have to wait a bit for BSY to be asserted.  Not only at reset,
but also for every command.


> How about limiting nsect/lbal wait duration?  Say, 100ms or 500ms?  That
> can somewhat ease our paranoia and should show acceptable behavior for
> braindead devices too.

That's quite reasonable.

	Jeff



P.S.  We can probably reduce the msleep(150) sprinkled about the code to
msleep(100), for the dumb-ATAPI-device wait.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 15:36                           ` Alan Cox
@ 2007-06-08 15:32                             ` Jeff Garzik
  2007-06-08 15:46                               ` Alan Cox
  0 siblings, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-08 15:32 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

On Fri, Jun 08, 2007 at 04:36:15PM +0100, Alan Cox wrote:
> On Fri, 8 Jun 2007 10:28:22 -0400
> Jeff Garzik <jeff@garzik.org> wrote:
> 
> > On Fri, Jun 08, 2007 at 12:40:58PM +0100, Alan Cox wrote:
> > > Seems best to me - we know the current code breaks a load of systems and
> > > the change should not break anything but fix them all. If we ship without
> > > it being changed then a load of people are stuck with broken ATA.
> > 
> > So you have verified that Tejun's experimental change "fixes them all"?
> > 
> > Proof?  Bugzilla URLs pointing to users that have ack'd this fix?
> 
> It unbreaks the IT821x, see the proposed patches I sent before for
> discussion. Those are tested and do work. I've got a test case here and
> its the case that the IT821x card doesn't get the magic trickery you rely
> on right.

Ah, I guess I misunderstood.  I thought you were referring to Fedora
7 bug reports, since there are not a load of people with IT821x.

	Jeff




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 15:38                     ` Alan Cox
@ 2007-06-08 15:35                       ` Jeff Garzik
  2007-06-08 15:44                         ` Alan Cox
  0 siblings, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-08 15:35 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

On Fri, Jun 08, 2007 at 04:38:04PM +0100, Alan Cox wrote:
> > See a URL I posted earlier in this thread.  With dumb ATAPI devices we
> > actually have to wait a bit for BSY to be asserted.  Not only at reset,
> > but also for every command
> 
> 400nS and the current code correctly accounts for it.

No, _far_ longer than 400nS.  2ms minimum by the spec, but,

ATADRVR. blackmagic.html and googling around shows plenty of ATAPI
devices that don't get their shit together for 50-100ms.

The BIOS source I'm reviewing right now waits 50ms, after SRST.

	Jeff




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 14:28                         ` Jeff Garzik
@ 2007-06-08 15:36                           ` Alan Cox
  2007-06-08 15:32                             ` Jeff Garzik
  0 siblings, 1 reply; 53+ messages in thread
From: Alan Cox @ 2007-06-08 15:36 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

On Fri, 8 Jun 2007 10:28:22 -0400
Jeff Garzik <jeff@garzik.org> wrote:

> On Fri, Jun 08, 2007 at 12:40:58PM +0100, Alan Cox wrote:
> > Seems best to me - we know the current code breaks a load of systems and
> > the change should not break anything but fix them all. If we ship without
> > it being changed then a load of people are stuck with broken ATA.
> 
> So you have verified that Tejun's experimental change "fixes them all"?
> 
> Proof?  Bugzilla URLs pointing to users that have ack'd this fix?

It unbreaks the IT821x, see the proposed patches I sent before for
discussion. Those are tested and do work. I've got a test case here and
its the case that the IT821x card doesn't get the magic trickery you rely
on right.

Alan



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 14:31                   ` Jeff Garzik
@ 2007-06-08 15:38                     ` Alan Cox
  2007-06-08 15:35                       ` Jeff Garzik
  0 siblings, 1 reply; 53+ messages in thread
From: Alan Cox @ 2007-06-08 15:38 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

> See a URL I posted earlier in this thread.  With dumb ATAPI devices we
> actually have to wait a bit for BSY to be asserted.  Not only at reset,
> but also for every command

400nS and the current code correctly accounts for it.

> > How about limiting nsect/lbal wait duration?  Say, 100ms or 500ms?  That
> > can somewhat ease our paranoia and should show acceptable behavior for
> > braindead devices too.
> 
> That's quite reasonable.

Agreed - means the IT821x takes a bit longer to respond but ensures that
we avoid any unpleasant suprises elsewhere.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 15:35                       ` Jeff Garzik
@ 2007-06-08 15:44                         ` Alan Cox
  0 siblings, 0 replies; 53+ messages in thread
From: Alan Cox @ 2007-06-08 15:44 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

On Fri, 8 Jun 2007 11:35:13 -0400
Jeff Garzik <jeff@garzik.org> wrote:

> On Fri, Jun 08, 2007 at 04:38:04PM +0100, Alan Cox wrote:
> > > See a URL I posted earlier in this thread.  With dumb ATAPI devices we
> > > actually have to wait a bit for BSY to be asserted.  Not only at reset,
> > > but also for every command
> > 
> > 400nS and the current code correctly accounts for it.
> 
> No, _far_ longer than 400nS.  2ms minimum by the spec, but,
> 
> ATADRVR. blackmagic.html and googling around shows plenty of ATAPI
> devices that don't get their shit together for 50-100ms.

400nS is commands. Reset is indeed about 150mS or so if you catch a drive
right - eg in the middle of revalidating a dodgy disk. One or two have
really weird behaviour in that they take MUCH longer to come out of reset
with no media loaded.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 15:32                             ` Jeff Garzik
@ 2007-06-08 15:46                               ` Alan Cox
  2007-06-08 15:49                                 ` Jeff Garzik
  0 siblings, 1 reply; 53+ messages in thread
From: Alan Cox @ 2007-06-08 15:46 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

> Ah, I guess I misunderstood.  I thought you were referring to Fedora
> 7 bug reports, since there are not a load of people with IT821x.

There are. Several vendors shipped it on the motherboard and there are
either quite a few users, or they all use Fedora and they like filing
bugs 8(

Alan

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 15:46                               ` Alan Cox
@ 2007-06-08 15:49                                 ` Jeff Garzik
  2007-06-08 15:59                                   ` Alan Cox
  0 siblings, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-08 15:49 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

On Fri, Jun 08, 2007 at 04:46:30PM +0100, Alan Cox wrote:
> > Ah, I guess I misunderstood.  I thought you were referring to Fedora
> > 7 bug reports, since there are not a load of people with IT821x.
> 
> There are. Several vendors shipped it on the motherboard and there are
> either quite a few users, or they all use Fedora and they like filing
> bugs 8(

Honestly I expected way more bug reports than just that.

I agree IT821x wants fixing, FWIW, just trying to get a handle on
the big picture.  I'm not surprised that IT821x gets reset wrong,
since it's a very non-standard BIOS.

	Jeff




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07  7:27           ` Tejun Heo
  2007-06-07 20:37             ` Gregor Jasny
  2007-06-07 20:56             ` Linus Torvalds
@ 2007-06-08 15:55             ` Jeff Garzik
  2 siblings, 0 replies; 53+ messages in thread
From: Jeff Garzik @ 2007-06-08 15:55 UTC (permalink / raw)
  To: linux-ide; +Cc: Linux Kernel Mailing List, Linus Torvalds, Alan Cox, Tejun Heo


I just confirmed that two PATA-era major BIOS brands do SRST.  They do
check for the device signature in TF registers...  but only for the ATA
device signature.

	Jeff




^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-08 15:49                                 ` Jeff Garzik
@ 2007-06-08 15:59                                   ` Alan Cox
  0 siblings, 0 replies; 53+ messages in thread
From: Alan Cox @ 2007-06-08 15:59 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Linus Torvalds, Gregor Jasny,
	Linux Kernel Mailing List, linux-ide, Andrew Morton

> I agree IT821x wants fixing, FWIW, just trying to get a handle on
> the big picture.  I'm not surprised that IT821x gets reset wrong,
> since it's a very non-standard BIOS.

Its a firmware emulation bug, the IT821X is emulating an IDE device
rather than neccessarily exposing one directly to the kernel.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-07 22:47               ` Jeff Garzik
  2007-06-08  8:02                 ` Tejun Heo
@ 2007-06-09 18:12                 ` Linus Torvalds
  2007-06-09 19:03                   ` Jeff Garzik
  1 sibling, 1 reply; 53+ messages in thread
From: Linus Torvalds @ 2007-06-09 18:12 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Tejun Heo, Gregor Jasny, Linux Kernel Mailing List, linux-ide,
	Alan Cox, Andrew Morton



On Thu, 7 Jun 2007, Jeff Garzik wrote:
>
> Ack'ing the sata_promise change was easy, but with this one it would
> be nice to wait a bit before changing the core probe code that [now]
> every ATA setup goes through, based on a single bug report.

So what's the resolution? Right now this is apparently the reasong for 
two of our current regressions, and we need to solve it.

"Just wait" is not an option. The options are:
 - fix things
 - revert everything that caused the regressions

and quite frankly, I don't think the second alternative is palatable to 
anybody, including you.

> The values assist in detecting ghost devices (same device appearing
> on both master and slave) and TF register malfunctions, and I would
> appreciate not breaking _that_ so late in 2.6.22-rc for a single
> report.  Thankfully we have -some- ghost device prevention code
> elsewhere, but this is part of it.

Apparently this isn't even true. As far as I can tell, the old code used 
to wait for lbal to be 1, but if it never became one, it would just 
silently time out and not *do* anything about it. Later commands would 
"just work", and the device would go its merry ways.

IOW, the new code is *crap*. The old code was arguably not wonderful 
either, but because of its nature, the old code not working didn't 
actually matter all that much.

So the biggest change (and the one that broke peoples machines) is that 
the code that you claim matters, *never* apparently did matter, and now 
that we've made it matter (by returning an error, and aborting the SRST 
sequence as a failure), people are complaining.

I really think we should apply Tejun's patch. Add a delay there, by all 
means if you are nervous: but no, it shouldn't be 150ms. This is the 
_post_ reset handling, we've already done the 150ms wait for the reset!

In fact, if you look at ata_bus_post_reset(), you'll notice that for 
port0, we just do the "ata_wait_ready()" call. Tejun's patch just made us 
do the same (logical) thing for port1 too!  If you think it breaks due to 
ome timeout issue, then the port0 thing was already broken.

(And maybe we could make the 

	ap->ops->dev_select(ap, 1);
	rc = ata_wait_ready(ap, deadline);

instead be a

	ata_dev_select(ap, 1, 1, 1);

and you'd wait even more? But that's actually a bigger change than Tejun's 
thing, and it uses "ata_wait_idle()" rather than "ata_wait_ready()", and 
doesn't return any status.. I dunno. But right now, we have several known 
broken things, that apparently weren't broken before!)

		Linus

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] Re: Linux v2.6.22-rc3
  2007-06-09 18:12                 ` Linus Torvalds
@ 2007-06-09 19:03                   ` Jeff Garzik
  2007-06-10  5:26                     ` [PATCH] libata: limit post SRST nsect/lbal wait to ~100ms Tejun Heo
  0 siblings, 1 reply; 53+ messages in thread
From: Jeff Garzik @ 2007-06-09 19:03 UTC (permalink / raw)
  To: Linus Torvalds, Tejun Heo
  Cc: Gregor Jasny, Linux Kernel Mailing List, linux-ide, Alan Cox,
	Andrew Morton

Linus Torvalds wrote:
> So what's the resolution? Right now this is apparently the reasong for 


Based on the thread it sounded like Tejun was going to post a slightly 
modified version of his patch?

	Jeff



^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH] libata: limit post SRST nsect/lbal wait to ~100ms
  2007-06-09 19:03                   ` Jeff Garzik
@ 2007-06-10  5:26                     ` Tejun Heo
  2007-06-10 16:23                       ` Jeff Garzik
  2007-06-11  4:59                       ` Jeff Garzik
  0 siblings, 2 replies; 53+ messages in thread
From: Tejun Heo @ 2007-06-10  5:26 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Linus Torvalds, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox, Andrew Morton

After SRST, libata used to wait for nsect/lbal to be set to 1/1 for
the slave device.  However, some ATAPI devices don't set nsect/lbal
after SRST and the wait itself isn't too useful as we're gonna wait
for !BSY right after that anyway.

Before reset-seq update, nsect/lbal wait failure used to be ignored
and caused 30sec delay during detection.  After reset-seq, all
timeouts are considered error conditions making libata fail to detect
such ATAPI devices.

This patch limits nsect/lbal wait to around 100ms.  This should give
acceptable behavior to such ATAPI devices while not disturbing the
heavily used code path too much.

Signed-off-by: Tejun Heo <htejun@gmail.com>
---
Okay, here it is.  Sorry for the delay.

 drivers/ata/libata-core.c |   32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4733f00..02a44ff 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3059,22 +3059,28 @@ static int ata_bus_post_reset(struct ata_port *ap, unsigned int devmask,
 		}
 	}
 
-	/* if device 1 was found in ata_devchk, wait for
-	 * register access, then wait for BSY to clear
+	/* if device 1 was found in ata_devchk, wait for register
+	 * access briefly, then wait for BSY to clear.
 	 */
-	while (dev1) {
-		u8 nsect, lbal;
+	if (dev1) {
+		int i;
 
 		ap->ops->dev_select(ap, 1);
-		nsect = ioread8(ioaddr->nsect_addr);
-		lbal = ioread8(ioaddr->lbal_addr);
-		if ((nsect == 1) && (lbal == 1))
-			break;
-		if (time_after(jiffies, deadline))
-			return -EBUSY;
-		msleep(50);	/* give drive a breather */
-	}
-	if (dev1) {
+
+		/* Wait for register access.  Some ATAPI devices fail
+		 * to set nsect/lbal after reset, so don't waste too
+		 * much time on it.  We're gonna wait for !BSY anyway.
+		 */
+		for (i = 0; i < 2; i++) {
+			u8 nsect, lbal;
+
+			nsect = ioread8(ioaddr->nsect_addr);
+			lbal = ioread8(ioaddr->lbal_addr);
+			if ((nsect == 1) && (lbal == 1))
+				break;
+			msleep(50);	/* give drive a breather */
+		}
+
 		rc = ata_wait_ready(ap, deadline);
 		if (rc) {
 			if (rc != -ENODEV)

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [PATCH] libata: limit post SRST nsect/lbal wait to ~100ms
  2007-06-10  5:26                     ` [PATCH] libata: limit post SRST nsect/lbal wait to ~100ms Tejun Heo
@ 2007-06-10 16:23                       ` Jeff Garzik
  2007-06-11  4:59                       ` Jeff Garzik
  1 sibling, 0 replies; 53+ messages in thread
From: Jeff Garzik @ 2007-06-10 16:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox, Andrew Morton

Tejun Heo wrote:
> After SRST, libata used to wait for nsect/lbal to be set to 1/1 for
> the slave device.  However, some ATAPI devices don't set nsect/lbal
> after SRST and the wait itself isn't too useful as we're gonna wait
> for !BSY right after that anyway.
> 
> Before reset-seq update, nsect/lbal wait failure used to be ignored
> and caused 30sec delay during detection.  After reset-seq, all
> timeouts are considered error conditions making libata fail to detect
> such ATAPI devices.
> 
> This patch limits nsect/lbal wait to around 100ms.  This should give
> acceptable behavior to such ATAPI devices while not disturbing the
> heavily used code path too much.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> ---
> Okay, here it is.  Sorry for the delay.
> 
>  drivers/ata/libata-core.c |   32 +++++++++++++++++++-------------
>  1 file changed, 19 insertions(+), 13 deletions(-)

Thanks.  Maybe Gregor can confirm this fixes his problem?

I'll ACK and push upstream then...



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH] libata: limit post SRST nsect/lbal wait to ~100ms
  2007-06-10  5:26                     ` [PATCH] libata: limit post SRST nsect/lbal wait to ~100ms Tejun Heo
  2007-06-10 16:23                       ` Jeff Garzik
@ 2007-06-11  4:59                       ` Jeff Garzik
  1 sibling, 0 replies; 53+ messages in thread
From: Jeff Garzik @ 2007-06-11  4:59 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, Gregor Jasny, Linux Kernel Mailing List,
	linux-ide, Alan Cox, Andrew Morton

Tejun Heo wrote:
> After SRST, libata used to wait for nsect/lbal to be set to 1/1 for
> the slave device.  However, some ATAPI devices don't set nsect/lbal
> after SRST and the wait itself isn't too useful as we're gonna wait
> for !BSY right after that anyway.
> 
> Before reset-seq update, nsect/lbal wait failure used to be ignored
> and caused 30sec delay during detection.  After reset-seq, all
> timeouts are considered error conditions making libata fail to detect
> such ATAPI devices.
> 
> This patch limits nsect/lbal wait to around 100ms.  This should give
> acceptable behavior to such ATAPI devices while not disturbing the
> heavily used code path too much.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> ---
> Okay, here it is.  Sorry for the delay.

applied



^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2007-06-11  4:59 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <alpine.LFD.0.98.0705252008210.26602@woody.linux-foundation.org>
2007-05-27 15:01 ` Linux v2.6.22-rc3 Gregor Jasny
2007-05-27 15:06   ` Jeff Garzik
2007-05-27 16:07     ` Gregor Jasny
2007-05-27 16:24       ` Linus Torvalds
2007-05-27 20:15         ` Gregor Jasny
2007-05-28  9:47           ` Tejun Heo
2007-05-28 14:07             ` Gregor Jasny
2007-05-29  9:28               ` Tejun Heo
2007-05-29 15:19                 ` Gregor Jasny
2007-05-29 16:44                 ` Linus Torvalds
2007-06-01  0:58                   ` Tejun Heo
2007-06-01  1:37                     ` Linus Torvalds
2007-06-01  2:19                       ` Tejun Heo
2007-06-01 16:50                         ` Jeff Garzik
2007-06-01 17:04                           ` Linus Torvalds
2007-06-01 17:35                             ` Jeff Garzik
2007-06-01 17:59                               ` Linus Torvalds
2007-06-01 18:20                                 ` Dave Jones
2007-06-01 18:30                                   ` Linus Torvalds
2007-06-01 18:46                                     ` Dave Jones
2007-06-01 18:41                                 ` Jeff Garzik
2007-06-01 18:48                                   ` Jeff Garzik
2007-06-02  7:50                                   ` Tejun Heo
2007-05-28 21:50     ` Bill Davidsen
2007-06-02 16:11   ` [PATCH] " Jeff Garzik
2007-06-03 17:46     ` Gregor Jasny
2007-06-06  8:46       ` Tejun Heo
2007-06-07  6:22         ` Gregor Jasny
2007-06-07  7:27           ` Tejun Heo
2007-06-07 20:37             ` Gregor Jasny
2007-06-07 20:56             ` Linus Torvalds
2007-06-07 22:39               ` Alan Cox
2007-06-07 22:47               ` Jeff Garzik
2007-06-08  8:02                 ` Tejun Heo
2007-06-08 11:27                   ` Alan Cox
2007-06-08 11:32                     ` Tejun Heo
2007-06-08 11:40                       ` Alan Cox
2007-06-08 14:28                         ` Jeff Garzik
2007-06-08 15:36                           ` Alan Cox
2007-06-08 15:32                             ` Jeff Garzik
2007-06-08 15:46                               ` Alan Cox
2007-06-08 15:49                                 ` Jeff Garzik
2007-06-08 15:59                                   ` Alan Cox
2007-06-08 14:31                   ` Jeff Garzik
2007-06-08 15:38                     ` Alan Cox
2007-06-08 15:35                       ` Jeff Garzik
2007-06-08 15:44                         ` Alan Cox
2007-06-09 18:12                 ` Linus Torvalds
2007-06-09 19:03                   ` Jeff Garzik
2007-06-10  5:26                     ` [PATCH] libata: limit post SRST nsect/lbal wait to ~100ms Tejun Heo
2007-06-10 16:23                       ` Jeff Garzik
2007-06-11  4:59                       ` Jeff Garzik
2007-06-08 15:55             ` [PATCH] Re: Linux v2.6.22-rc3 Jeff Garzik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).