linux-ide.vger.kernel.org archive mirror
From: Gary Hade <garyhade@us.ibm.com>
To: Tejun Heo <htejun@gmail.com>
Cc: Gary Hade <garyhade@us.ibm.com>,
	Kovid Goyal <kovid@theory.caltech.edu>,
	linux-ide@vger.kernel.org, lcm@us.ibm.com,
	Jeff Garzik <jgarzik@pobox.com>,
	konradr@us.ibm.com
Subject: Re: [2.6.18,19] SATA boot problems (ICH6/ICH6W)
Date: Tue, 30 Jan 2007 15:37:36 -0800
Message-ID: <20070130233735.GA7483@us.ibm.com>
In-Reply-To: <45BEF492.9000000@gmail.com>

On Tue, Jan 30, 2007 at 04:32:34PM +0900, Tejun Heo wrote:
> Hello, Gary.
> 
> Gary Hade wrote:
> >>> If they verify your fix (ie,
> >>> GoVault sometimes take more than 150ms to transmit the first D2H Reg FIs
> >>> after SRST), I'll push similar patch upstream.
> >> Thanks.  If you think that changes to increase the delays are
> >> the way to go (at least until we can find a better solution)
> >> I can provide patches.
> > 
> > Tejun, 
> > I haven't heard anything from you on this so I'm including a delay
> > increase patch against 2.6.20-rc6 for the 'ata-piix' case below.  
> > I hope that you, Jeff, and others find this acceptable.
> 
> Sorry about being unresponsive.  The thing is that the change adds
> unnecessary 2 secs of delay to a lot of other normal device-not-present
> cases, so I was hesitant to ack the patch.  I'll give it more thoughts
> (and respond timely this time :-)

Thanks!  My followup was untimely so we're even. :-)

Some of my random thoughts:
The code appears to make the invalid assumption that 0xFF status
always implies device-not-present.  The status register access
restrictions in ATA/ATAPI-7 V1 5.14.2 include the statement "The
contents of this register, except for BSY, shall be ignored when
BSY is set to one.", which the code does not honor.  There is apparently
past experience that 0xFF status implies device-not-present for some
controllers (the odd clowns :) but I have no idea how common these are.
We obviously can't get rid of the check.  But since we cannot clear
the read-only status register, and there appears to be no
specification-dictated upper limit on how long a software reset may
take to complete, it just seems like we need to wait long enough to
support the slowest known device, which may be the GoVault.
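
To make the idea concrete, here is a rough user-space sketch (not libata
code) of a post-reset probe that only looks at BSY while BSY is set and
treats 0xFF as device-not-present only after a grace period.  The names
probe_after_reset(), absent_ms, fake_dev and fake_read() are made up for
illustration, and the sleep is simulated by a counter.

```c
#include <stdint.h>

#define ATA_BUSY 0x80	/* BSY bit of the ATA status register */

/* Hypothetical result codes, for illustration only. */
enum probe_result { PROBE_READY, PROBE_TIMEOUT, PROBE_NO_DEVICE };

typedef uint8_t (*read_status_fn)(void *ctx);

/*
 * Poll the status register after a reset.  While BSY is set the other
 * bits are ignored, so 0xFF is not taken as "no device" until absent_ms
 * have passed.  Time is simulated: a real driver would sleep step_ms
 * where the counter is advanced.
 */
static enum probe_result probe_after_reset(read_status_fn read_status,
					   void *ctx, unsigned absent_ms,
					   unsigned timeout_ms,
					   unsigned step_ms)
{
	unsigned waited = 0;

	if (step_ms == 0)
		step_ms = 1;	/* avoid spinning forever on a zero step */

	for (;;) {
		uint8_t status = read_status(ctx);

		if (!(status & ATA_BUSY))
			return PROBE_READY;
		if (status == 0xFF && waited >= absent_ms)
			return PROBE_NO_DEVICE;
		if (waited >= timeout_ms)
			return PROBE_TIMEOUT;
		waited += step_ms;
	}
}

/* Fake device: reports busy_status for busy_polls reads, then 0x50. */
struct fake_dev {
	unsigned busy_polls;
	uint8_t busy_status;
};

static uint8_t fake_read(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->busy_polls) {
		d->busy_polls--;
		return d->busy_status;
	}
	return 0x50;	/* DRDY | DSC, BSY clear */
}
```

The trade-off is the one Tejun raised: a larger absent_ms covers slow
devices but delays every genuine device-not-present case by the same
amount.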

> 
> > With respect to the 'ahci' case w/2.6.20-rc6 the GoVault device is 
> > useable following boot although the below messages are being logged 
> > during initialization.  Please let me know if you have any thoughts 
> > on this.  
> >   scsi1 : ahci
> >   ata2: softreset failed (port busy but CLO unavailable)
> >   ata2: softreset failed, retrying in 5 secs
> >   ata2: port is slow to respond, please be patient (Status 0x80)
> >   ata2: port failed to respond (30 secs, Status 0x80)
> >   ata2: COMRESET failed (device not ready)
> >   ata2: hardreset failed, retrying in 5 secs
> >   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >   ata2.00: ATAPI, max UDMA/66
> >   ata2.00: configured for UDMA/66
> 
> The above should have been fixed in 2.6.20-rc6.  Please test it.  It was
> caused by the ahci driver incorrectly clearing ahci CAP register and
> fixed recently.

I'm clearly seeing this with 2.6.20-rc6 but unlike the ata-piix
issue it does not appear to be dependent on the port to which the
device is attached.  I've been playing around with this today and
found that it could be solved by inserting a delay between the 
ahci_stop_engine() call and BSY/DRQ check.

This change:
--- linux-2.6.20-rc6/drivers/ata/ahci.c.orig	2007-01-30 11:01:20.000000000 -0800
+++ linux-2.6.20-rc6/drivers/ata/ahci.c	2007-01-30 12:59:38.000000000 -0800
@@ -804,6 +804,19 @@ static int ahci_softreset(struct ata_por
 		goto fail_restart;
 	}
 
+	{
+		int delay;
+		u8 stat;
+		for (delay = 0; delay < 2000; delay+=100) {
+			if (!(ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)))
+				break;
+			msleep(100);
+			stat = ahci_check_status(ap);
+			ata_port_printk(ap, KERN_INFO, "delay=%d BSY=%d DRQ=%d\n",
+				delay, (stat & ATA_BUSY)?1:0, (stat & ATA_DRQ)?1:0);
+		}
+	}
+
 	/* check BUSY/DRQ, perform Command List Override if necessary */
 	if (ahci_check_status(ap) & (ATA_BUSY | ATA_DRQ)) {
 		rc = ahci_clo(ap);

Yielded this output both with and without the RDC inserted:
scsi1 : ahci
ata2: delay=0 BSY=1 DRQ=0
ata2: delay=100 BSY=1 DRQ=0
ata2: delay=200 BSY=1 DRQ=0
ata2: delay=300 BSY=1 DRQ=0
ata2: delay=400 BSY=1 DRQ=0
ata2: delay=500 BSY=1 DRQ=0
ata2: delay=600 BSY=1 DRQ=0
ata2: delay=700 BSY=1 DRQ=0
ata2: delay=800 BSY=1 DRQ=0
ata2: delay=900 BSY=1 DRQ=0
ata2: delay=1000 BSY=1 DRQ=0
ata2: delay=1100 BSY=1 DRQ=0
ata2: delay=1200 BSY=0 DRQ=0
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATAPI, max UDMA/66
ata2.00: configured for UDMA/66

So it appears that we may also have a similar device slowness issue 
with this driver.
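
Stripped of the debug printk, the wait I have in mind is just a bounded
poll.  Below is a stand-alone sketch that can be compiled outside the
kernel.  It is not a proposed patch: wait_bsy_drq_clear(), slow_dev and
the callbacks are hypothetical names, the status read and the sleep are
simulated, and the 1250 ms figure in the usage note is an arbitrary
example value.

```c
#include <stdint.h>

#define ATA_BUSY 0x80	/* BSY bit of the ATA status register */
#define ATA_DRQ  0x08	/* DRQ bit of the ATA status register */

typedef uint8_t (*status_reader)(void *ctx);
typedef void (*sleeper)(unsigned ms, void *ctx);

/*
 * Poll until both BSY and DRQ are clear, sleeping step_ms between reads.
 * Returns the time slept in ms, or -1 if the device is still busy after
 * max_ms.
 */
static int wait_bsy_drq_clear(status_reader rd, sleeper sl, void *ctx,
			      unsigned max_ms, unsigned step_ms)
{
	unsigned waited = 0;

	if (step_ms == 0)
		step_ms = 1;	/* avoid spinning forever on a zero step */

	while (rd(ctx) & (ATA_BUSY | ATA_DRQ)) {
		if (waited >= max_ms)
			return -1;
		sl(step_ms, ctx);
		waited += step_ms;
	}
	return (int)waited;
}

/* Simulated device that drops BSY once ready_at_ms have elapsed. */
struct slow_dev {
	unsigned now_ms;
	unsigned ready_at_ms;
};

static uint8_t slow_dev_status(void *ctx)
{
	const struct slow_dev *d = ctx;

	return d->now_ms >= d->ready_at_ms ? 0x50 : ATA_BUSY;
}

static void slow_dev_sleep(unsigned ms, void *ctx)
{
	struct slow_dev *d = ctx;

	d->now_ms += ms;	/* advance simulated time instead of sleeping */
}
```

With a simulated device that becomes ready at 1250 ms, a 2000 ms limit
and a 100 ms step, the call returns 1300.  A device that needs longer
than the limit gets -1, at which point the existing CLO path would
still apply.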

Gary

-- 
Gary Hade
System x Enablement
IBM Linux Technology Center
503-578-4503  IBM T/L: 775-4503
garyhade@us.ibm.com
http://www.ibm.com/linux/ltc

