* patch "libata-sff: Fix oops for pnp devices with no ctl" causes regression
From: Nick Piggin @ 2008-06-05 5:09 UTC
To: alan, jgarzik, linux-ide
Hi,
I bisected a SATA regression to commit a57c1bade5a0ee5cd8b74502db9cbebb7f5780b2.
The details are as follows:
System is a PowerPC (G5)
CONFIG_ATA=y
# CONFIG_ATA_NONSTANDARD is not set
# CONFIG_SATA_PMP is not set
# CONFIG_SATA_AHCI is not set
# CONFIG_SATA_SIL24 is not set
CONFIG_ATA_SFF=y
CONFIG_SATA_SVW=y
lspci:
0000:00:0b.0 PCI bridge: Apple Computer Inc. Device 005b
0000:0a:00.0 VGA compatible controller: nVidia Corporation NV43 [GeForce 6600] (rev a2)
0001:00:00.0 Host bridge: Apple Computer Inc. U4 HT Bridge
0001:00:01.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
0001:00:02.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-X bridge (rev a3)
0001:00:03.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:04.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:05.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:06.0 PCI bridge: Broadcom BCM5780 [HT2000] PCI-Express Bridge (rev a3)
0001:00:07.0 PCI bridge: Apple Computer Inc. Shasta PCI Bridge
0001:00:08.0 PCI bridge: Apple Computer Inc. Shasta PCI Bridge
0001:00:09.0 PCI bridge: Apple Computer Inc. Shasta PCI Bridge
0001:01:07.0 Class ff00: Apple Computer Inc. Shasta Mac I/O
0001:01:0b.0 USB Controller: NEC Corporation USB (rev 43)
0001:01:0b.1 USB Controller: NEC Corporation USB (rev 43)
0001:01:0b.2 USB Controller: NEC Corporation USB 2.0 (rev 04)
0001:03:0c.0 IDE interface: Broadcom K2 SATA
0001:03:0d.0 Class ff00: Apple Computer Inc. Shasta IDE
0001:03:0e.0 FireWire (IEEE 1394): Apple Computer Inc. Shasta Firewire
0001:05:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5780 Gigabit Ethernet (rev 03)
0001:05:04.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5780 Gigabit Ethernet (rev 03)
dmesg (from the kernel previous to the commit in question):
sata_svw 0001:03:0c.0: version 2.3
scsi0 : sata_svw
scsi1 : sata_svw
scsi2 : sata_svw
scsi3 : sata_svw
ata1: SATA max UDMA/133 mmio m8192@0xfa402000 port 0xfa402000 irq 18
ata2: SATA max UDMA/133 mmio m8192@0xfa402000 port 0xfa402100 irq 18
ata3: SATA max UDMA/133 mmio m8192@0xfa402000 port 0xfa402200 irq 18
ata4: SATA max UDMA/133 mmio m8192@0xfa402000 port 0xfa402300 irq 18
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7: Maxtor 7Y250M0, YAR51HW0, max UDMA/133
ata1.00: 490234752 sectors, multi 0: LBA48
ata1.00: configured for UDMA/133
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-7: Maxtor 7Y250M0, YAR51HW0, max UDMA/133
ata2.00: 490234752 sectors, multi 0: LBA48
ata2.00: configured for UDMA/133
ata3: SATA link down (SStatus 0 SControl 0)
ata4: SATA link down (SStatus 0 SControl 0)
scsi 0:0:0:0: Direct-Access ATA Maxtor 7Y250M0 YAR5 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: [mac] sda1 sda2
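For readers unfamiliar with the SStatus values in the log above, here is a minimal sketch that splits the register into the three fields defined by the SATA specification: DET in bits 3:0, SPD in bits 7:4 and IPM in bits 11:8. The kernel prints the value in hexadecimal, so "SStatus 113" is 0x113. The struct and function names are hypothetical and are not libata's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not part of libata. */
struct sstatus_fields {
	unsigned det;	/* bits 3:0, device detection */
	unsigned spd;	/* bits 7:4, negotiated interface speed */
	unsigned ipm;	/* bits 11:8, interface power management state */
};

struct sstatus_fields decode_sstatus(uint32_t sstatus)
{
	struct sstatus_fields f;

	f.det = sstatus & 0xf;
	f.spd = (sstatus >> 4) & 0xf;
	f.ipm = (sstatus >> 8) & 0xf;
	return f;
}

/* DET == 3: device present and PHY communication established. */
int link_is_up(uint32_t sstatus)
{
	return decode_sstatus(sstatus).det == 3;
}

/* Usage: the two values that appear in the dmesg output above. */
void sstatus_selfcheck(void)
{
	assert(link_is_up(0x113));	/* ata1, ata2: link up */
	assert(!link_is_up(0x0));	/* ata3, ata4: link down */
}
```

With 0x113, SPD is 1 (first-generation speed, which matches the "1.5 Gbps" in the log) and IPM is 1 (interface active).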
And when booting with the commit applied, I instead get a whole lot of
messages like this (this is the first one, copied by hand):
ata2.00: exception Emask 0x0 SAct 0x0 Serr 0x10000000 action 0x6 frozen
ata2: SError: { }
ata2.00: cmd c8/00:02:42:08:20/00:00:00:00:00/e0 tag 0 dma 1024 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2.00: status: { DRDY }
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: configured for UDMA/133
ata2: EH complete
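As a reading aid for the "cmd" line above: libata prints the taskfile as command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device. The sketch below parses that layout and derives the 28-bit LBA and the transfer size. It is only an illustration; the struct and function names are hypothetical, and the LBA28 and 512-byte-sector interpretation is an assumption that fits command 0xc8 (READ DMA).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the twelve bytes printed on the cmd line. */
struct tf_dump {
	unsigned command, feature, nsect, lbal, lbam, lbah;
	unsigned hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
	unsigned device;
};

/* Returns 0 on success, -1 if the string does not match the layout. */
int parse_tf(const char *s, struct tf_dump *tf)
{
	int n = sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
		       &tf->command, &tf->feature, &tf->nsect,
		       &tf->lbal, &tf->lbam, &tf->lbah,
		       &tf->hob_feature, &tf->hob_nsect, &tf->hob_lbal,
		       &tf->hob_lbam, &tf->hob_lbah, &tf->device);

	return n == 12 ? 0 : -1;
}

/* LBA28: the low four bits of the device byte carry LBA bits 27:24. */
uint32_t tf_lba28(const struct tf_dump *tf)
{
	return ((uint32_t)(tf->device & 0x0f) << 24) |
	       ((uint32_t)tf->lbah << 16) |
	       ((uint32_t)tf->lbam << 8) |
	       (uint32_t)tf->lbal;
}

/* Bytes transferred, assuming 512-byte sectors; nsect 0 means 256. */
uint32_t tf_xfer_bytes(const struct tf_dump *tf)
{
	unsigned sectors = tf->nsect ? tf->nsect : 256;

	return sectors * 512u;
}

/* Usage: the command from the error report above. */
void tf_selfcheck(void)
{
	struct tf_dump tf;

	assert(parse_tf("c8/00:02:42:08:20/00:00:00:00:00/e0", &tf) == 0);
	assert(tf.command == 0xc8);
	assert(tf_xfer_bytes(&tf) == 1024);	/* matches "dma 1024 in" */
}
```

The "res" line uses the same layout with the status byte first, so 40 there is the DRDY bit that the following "status: { DRDY }" line spells out.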
Thanks,
Nick
* Re: patch "libata-sff: Fix oops for pnp devices with no ctl" causes regression
From: Alan Cox @ 2008-06-05 9:24 UTC
To: Nick Piggin; +Cc: jgarzik, linux-ide
> And when booting with the commit applied, I instead get a whole lot of
> messages like this (this is the first one, copied by hand):
>
> ata2.00: exception Emask 0x0 SAct 0x0 Serr 0x10000000 action 0x6 frozen
> ata2: SError: { }
> ata2.00: cmd c8/00:02:42:08:20/00:00:00:00:00/e0 tag 0 dma 1024 in
> res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2.00: status: { DRDY }
> ata2: hard resetting link
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata2.00: configured for UDMA/133
> ata2: EH complete
Well I've been over the patch twice now and I cannot see a single point
at which the sequence of code that *should* be executed is any different.
Stick wmb(); rmb(); (or similar barriers to compiler optimisation and I/O
fencing) at the start and end of your ata_sff_altstatus() and see what
happens. If it suddenly decides to behave, or if forcing it noinline makes
it behave, that would be useful info.
* Re: patch "libata-sff: Fix oops for pnp devices with no ctl" causes regression
From: Jeff Garzik @ 2008-06-05 10:05 UTC
To: Alan Cox; +Cc: Nick Piggin, jgarzik, linux-ide
Alan Cox wrote:
>> And when booting with the commit applied, I instead get a whole lot of
>> messages like this (this is the first one, copied by hand):
>>
>> ata2.00: exception Emask 0x0 SAct 0x0 Serr 0x10000000 action 0x6 frozen
>> ata2: SError: { }
>> ata2.00: cmd c8/00:02:42:08:20/00:00:00:00:00/e0 tag 0 dma 1024 in
>> res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> ata2.00: status: { DRDY }
>> ata2: hard resetting link
>> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>> ata2.00: configured for UDMA/133
>> ata2: EH complete
>
> Well I've been over the patch twice now and I cannot see a single point
> at which the sequence of code that *should* be executed is any different.
>
> Stick wmb();rmb(); (or similar barriers to compiler optimisation and I/O
> fencing) at the start and end of your ata_sff_altstatus() and see what
> happens, if it suddenly decides to behave or forcing it no inline makes
> it behave then that would be useful info.
If he's getting a timeout, I wonder if that points to
ata_sff_irq_status() ...
* Re: patch "libata-sff: Fix oops for pnp devices with no ctl" causes regression
From: Nick Piggin @ 2008-06-05 10:21 UTC
To: Alan Cox; +Cc: jgarzik, linux-ide
On Thu, Jun 05, 2008 at 10:24:24AM +0100, Alan Cox wrote:
> > And when booting with the commit applied, I instead get a whole lot of
> > messages like this (this is the first one, copied by hand):
> >
> > ata2.00: exception Emask 0x0 SAct 0x0 Serr 0x10000000 action 0x6 frozen
> > ata2: SError: { }
> > ata2.00: cmd c8/00:02:42:08:20/00:00:00:00:00/e0 tag 0 dma 1024 in
> > res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> > ata2.00: status: { DRDY }
> > ata2: hard resetting link
> > ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > ata2.00: configured for UDMA/133
> > ata2: EH complete
>
> Well I've been over the patch twice now and I cannot see a single point
> at which the sequence of code that *should* be executed is any different.
If it is of any help to you, doing this:
diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
index 90d20c6..f6cb8d7 100644
--- a/drivers/ata/libata-sff.c
+++ b/drivers/ata/libata-sff.c
@@ -1583,6 +1583,12 @@ inline unsigned int ata_sff_host_intr(struct ata_port *ap
 	if (status & ATA_BUSY)
 		goto idle_irq;

+	/* check main status, clearing INTRQ */
+	status = ap->ops->sff_check_status(ap);
+	if (unlikely(status & ATA_BUSY))
+		goto idle_irq;
+
+
 	/* ack bmdma irq events */
 	ap->ops->sff_irq_clear(ap);
Gets it working again...
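The workaround's own comment ("check main status, clearing INTRQ") points at the relevant ATA behaviour: reading the Status register clears a pending interrupt, while reading Alternate Status does not. The toy model below sketches that difference only. It assumes a simulated device; every toy_* name is hypothetical, and it is not the libata code path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ATA_BUSY 0x80	/* BSY bit of the status byte */
#define ATA_DRDY 0x40	/* DRDY bit of the status byte */

/* Simulated device: one status byte plus a pending-interrupt latch. */
struct toy_device {
	uint8_t status;
	bool intrq;
};

/* Alternate Status: returns the status byte with no side effects. */
uint8_t toy_read_altstatus(const struct toy_device *d)
{
	return d->status;
}

/* Status: returns the same byte and clears the pending interrupt. */
uint8_t toy_read_status(struct toy_device *d)
{
	d->intrq = false;
	return d->status;
}

/*
 * Simplified interrupt handler. Returns true when the interrupt has been
 * acknowledged, false when the device is busy or INTRQ is still pending.
 */
bool toy_host_intr(struct toy_device *d, bool read_main_status)
{
	uint8_t status = toy_read_altstatus(d);

	if (status & ATA_BUSY)
		return false;

	if (read_main_status) {
		status = toy_read_status(d);
		if (status & ATA_BUSY)
			return false;
	}
	return !d->intrq;
}

/* Usage: only the main status read clears the latch. */
void toy_intr_selfcheck(void)
{
	struct toy_device d = { ATA_DRDY, true };

	assert(!toy_host_intr(&d, false));	/* INTRQ left pending */
	assert(toy_host_intr(&d, true));	/* INTRQ cleared */
}
```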
> Stick wmb();rmb(); (or similar barriers to compiler optimisation and I/O
> fencing) at the start and end of your ata_sff_altstatus() and see what
> happens, if it suddenly decides to behave or forcing it no inline makes
> it behave then that would be useful info.
Will give that a try next
* Re: patch "libata-sff: Fix oops for pnp devices with no ctl" causes regression
From: Nick Piggin @ 2008-06-05 10:31 UTC
To: Alan Cox; +Cc: jgarzik, linux-ide
On Thu, Jun 05, 2008 at 12:21:29PM +0200, Nick Piggin wrote:
> On Thu, Jun 05, 2008 at 10:24:24AM +0100, Alan Cox wrote:
> > > And when booting with the commit applied, I instead get a whole lot of
> > > messages like this (this is the first one, copied by hand):
> > >
> > > ata2.00: exception Emask 0x0 SAct 0x0 Serr 0x10000000 action 0x6 frozen
> > > ata2: SError: { }
> > > ata2.00: cmd c8/00:02:42:08:20/00:00:00:00:00/e0 tag 0 dma 1024 in
> > > res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> > > ata2.00: status: { DRDY }
> > > ata2: hard resetting link
> > > ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > > ata2.00: configured for UDMA/133
> > > ata2: EH complete
> >
> > Well I've been over the patch twice now and I cannot see a single point
> > at which the sequence of code that *should* be executed is any different.
>
> If it is of any help to you, doing this:
> diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
> index 90d20c6..f6cb8d7 100644
> --- a/drivers/ata/libata-sff.c
> +++ b/drivers/ata/libata-sff.c
> @@ -1583,6 +1583,12 @@ inline unsigned int ata_sff_host_intr(struct ata_port *ap
>  	if (status & ATA_BUSY)
>  		goto idle_irq;
>
> +	/* check main status, clearing INTRQ */
> +	status = ap->ops->sff_check_status(ap);
> +	if (unlikely(status & ATA_BUSY))
> +		goto idle_irq;
> +
> +
>  	/* ack bmdma irq events */
>  	ap->ops->sff_irq_clear(ap);
>
> Gets it working again...
>
>
> > Stick wmb();rmb(); (or similar barriers to compiler optimisation and I/O
> > fencing) at the start and end of your ata_sff_altstatus() and see what
> > happens, if it suddenly decides to behave or forcing it no inline makes
> > it behave then that would be useful info.
>
> Will give that a try next
And doing this made no difference
diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
index 90d20c6..db6be15 100644
--- a/drivers/ata/libata-sff.c
+++ b/drivers/ata/libata-sff.c
@@ -249,10 +249,18 @@ u8 ata_sff_check_status(struct ata_port *ap)
  */
 static u8 ata_sff_altstatus(struct ata_port *ap)
 {
+	u8 ret;
+
+	mb();
+
 	if (ap->ops->sff_check_altstatus)
-		return ap->ops->sff_check_altstatus(ap);
+		ret = ap->ops->sff_check_altstatus(ap);
+
+	ret = ioread8(ap->ioaddr.altstatus_addr);

-	return ioread8(ap->ioaddr.altstatus_addr);
+	mb();
+
+	return ret;
 }

 /**
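One detail of the test patch above: as the diff reads here, the ioread8() assignment is no longer guarded, so it also runs when the sff_check_altstatus hook exists and overwrites the hook's result. Whether that affected this experiment is not stated in the thread. Below is a userspace sketch of the hook-or-fallback shape with the missing else in place. It assumes C11 atomic_thread_fence() as a rough stand-in for the kernel's mb(), which is not equivalent for real I/O ordering, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct toy_port;

struct toy_ops {
	/* Optional hook; NULL means "read the register directly". */
	uint8_t (*check_altstatus)(struct toy_port *p);
};

struct toy_port {
	const struct toy_ops *ops;
	uint8_t altstatus_reg;	/* stand-in for the memory-mapped register */
};

uint8_t toy_altstatus(struct toy_port *p)
{
	uint8_t ret;

	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for mb() */

	if (p->ops->check_altstatus)
		ret = p->ops->check_altstatus(p);
	else
		ret = p->altstatus_reg;	/* stand-in for ioread8() */

	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for mb() */

	return ret;
}

/* Example hook returning a value that differs from the register. */
uint8_t toy_hook(struct toy_port *p)
{
	(void)p;
	return 0x50;
}

/* Usage: the hook wins when present, the register is read otherwise. */
void toy_altstatus_selfcheck(void)
{
	struct toy_ops with_hook = { toy_hook };
	struct toy_ops no_hook = { NULL };
	struct toy_port a = { &with_hook, 0x80 };
	struct toy_port b = { &no_hook, 0x80 };

	assert(toy_altstatus(&a) == 0x50);
	assert(toy_altstatus(&b) == 0x80);
}
```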
* Re: patch "libata-sff: Fix oops for pnp devices with no ctl" causes regression
From: Alan Cox @ 2008-06-05 11:19 UTC
To: Nick Piggin; +Cc: jgarzik, linux-ide
> > @@ -1583,6 +1583,12 @@ inline unsigned int ata_sff_host_intr(struct ata_port *ap
> >  	if (status & ATA_BUSY)
> >  		goto idle_irq;
> >
> > +	/* check main status, clearing INTRQ */
> > +	status = ap->ops->sff_check_status(ap);
> > +	if (unlikely(status & ATA_BUSY))
> > +		goto idle_irq;
> > +
> > +
Aha, all is revealed. And yes, it would only show up on a few boxes.
libata-sff: Don't assume that check_status is an SFF read
From: Alan Cox <alan@redhat.com>
For a few controllers the sff check status is actually a method, not an I/O
access, and we must call the method when doing the IRQ check.
---
drivers/ata/libata-sff.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
Signed-off-by: Alan Cox <alan@redhat.com>
diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
index 90d20c6..215d186 100644
--- a/drivers/ata/libata-sff.c
+++ b/drivers/ata/libata-sff.c
@@ -278,7 +278,7 @@ static u8 ata_sff_irq_status(struct ata_port *ap)
 			return status;
 	}
 	/* Clear INTRQ latch */
-	status = ata_sff_check_status(ap);
+	status = ap->ops->sff_check_status(ap);
 	return status;
 }
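To illustrate the one-line fix in isolation: calling the generic helper directly bypasses whatever method a driver has installed in its ops table, whereas dispatching through the ops pointer honours it. The sketch below is a userspace toy under that assumption. The toy_* types and functions are hypothetical and only mirror the shape of ap->ops->sff_check_status; the counters exist purely to show which path ran.

```c
#include <assert.h>
#include <stdint.h>

struct toy_port;

struct toy_port_ops {
	uint8_t (*sff_check_status)(struct toy_port *ap);
};

struct toy_port {
	const struct toy_port_ops *ops;
	uint8_t status;		/* value both read paths return */
	int generic_reads;	/* times the generic helper ran */
	int method_reads;	/* times the driver's method ran */
};

/* Generic helper: what a plain register read would do. */
uint8_t toy_generic_check_status(struct toy_port *ap)
{
	ap->generic_reads++;
	return ap->status;
}

/* Driver-specific method installed in the ops table. */
uint8_t toy_driver_check_status(struct toy_port *ap)
{
	ap->method_reads++;
	return ap->status;
}

/* Shape of the code before the fix: hard-wired to the generic helper. */
uint8_t toy_irq_status_direct(struct toy_port *ap)
{
	return toy_generic_check_status(ap);
}

/* Shape of the code after the fix: dispatch through the ops table. */
uint8_t toy_irq_status_via_ops(struct toy_port *ap)
{
	return ap->ops->sff_check_status(ap);
}

/* Usage: only the ops dispatch reaches the driver's method. */
void toy_dispatch_selfcheck(void)
{
	struct toy_port_ops ops = { toy_driver_check_status };
	struct toy_port ap = { &ops, 0x50, 0, 0 };

	toy_irq_status_direct(&ap);
	assert(ap.method_reads == 0);
	toy_irq_status_via_ops(&ap);
	assert(ap.method_reads == 1);
}
```

Both paths return the same byte in this toy, so the difference is only in the side effects, which is why the regression would show up only on controllers whose method does more than a plain read.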