linux-ide.vger.kernel.org archive mirror
* [PATCH] ata: increase retry count but shorten duration for Calxeda controller
@ 2013-05-01 21:34 Mark Langsdorf
  2013-05-06 15:45 ` Timur Tabi
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Langsdorf @ 2013-05-01 21:34 UTC
  To: linux-kernel, linux-ide, jgarzik; +Cc: Mark Langsdorf

The Calxeda SATA phy intermittently fails to bring up a link with Gen3.
Retrying the phy hard reset can work around the issue, but the drive
may fail again. In less than 150 out of 15000 test runs, it took more
than 10 tries for the link to be established (but never more than 35).
Increase the retry count to guarantee the link is established.

Also, the default 2 second time-out on a failed drive is too long in
this situation. Shorten it to 500 ms. This was also tested 15000 times
on 24 drives and none of them experienced a time out.

Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
---
 drivers/ata/sata_highbank.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/sata_highbank.c b/drivers/ata/sata_highbank.c
index 0d7c4c2..536936f 100644
--- a/drivers/ata/sata_highbank.c
+++ b/drivers/ata/sata_highbank.c
@@ -199,7 +199,7 @@ static int highbank_initialize_phys(struct device *dev, void __iomem *addr)
 static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 				unsigned long deadline)
 {
-	const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
+	unsigned long timing[] = { 5, 100, 500};
 	struct ata_port *ap = link->ap;
 	struct ahci_port_priv *pp = ap->private_data;
 	u8 *d2h_fis = pp->rx_fis + RX_FIS_D2H_REG;
@@ -207,7 +207,7 @@ static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 	bool online;
 	u32 sstatus;
 	int rc;
-	int retry = 10;
+	int retry = 100;
 
 	ahci_stop_engine(ap);
 
-- 
1.7.10.4


* Re: [PATCH] ata: increase retry count but shorten duration for Calxeda controller
  2013-05-01 21:34 [PATCH] " Mark Langsdorf
@ 2013-05-06 15:45 ` Timur Tabi
  0 siblings, 0 replies; 11+ messages in thread
From: Timur Tabi @ 2013-05-06 15:45 UTC
  To: Mark Langsdorf; +Cc: lkml, linux-ide, Jeff Garzik

On Wed, May 1, 2013 at 4:34 PM, Mark Langsdorf
<mark.langsdorf@calxeda.com> wrote:
>
> -       const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
> +       unsigned long timing[] = { 5, 100, 500};


Why are you dropping the 'const'?

Assuming it works, this should be more efficient:

static const unsigned long timing[] = {5, 100, 500};
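
The suggestion above can be illustrated with a minimal userspace sketch. The
helper name highbank_timing_sketch() is hypothetical and is not part of
sata_highbank.c; only the array contents come from the patch. A plain local
array is rebuilt on the stack each time the function runs, whereas a
static const array has a single read-only copy that outlives the call.

```c
#include <stddef.h>

/*
 * Hypothetical helper for illustration only; not part of the driver.
 * Returns a pointer to the timing table proposed in the patch.
 */
static const unsigned long *highbank_timing_sketch(void)
{
	/*
	 * static: the array has static storage duration, so it is
	 * initialized once instead of being filled in on every call,
	 * and a pointer to it stays valid after the function returns.
	 * const: the values cannot be modified through this name, which
	 * allows the compiler to place the array in read-only data.
	 */
	static const unsigned long timing[] = { 5, 100, 500 };

	return timing;
}
```

Every call returns the same address, which is what makes handing the table to
another function safe.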


* [PATCH] ata: increase retry count but shorten duration for Calxeda controller
@ 2013-05-29 15:51 Mark Langsdorf
  2013-05-29 20:12 ` Timur Tabi
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Mark Langsdorf @ 2013-05-29 15:51 UTC
  To: linux-kernel, tj, linux-ide; +Cc: Mark Langsdorf

The Calxeda SATA phy intermittently fails to bring up a link with Gen3.
Retrying the phy hard reset can work around the issue, but the drive
may fail again. In less than 150 out of 15000 test runs, it took more
than 10 tries for the link to be established (but never more than 35).
Increase the retry count to guarantee the link is established.

Also, the default 2 second time-out on a failed drive is too long in
this situation. Shorten it to 500 ms. This was also tested 15000 times
on 24 drives and none of them experienced a time out.

Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
---
 drivers/ata/sata_highbank.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/sata_highbank.c b/drivers/ata/sata_highbank.c
index 0d7c4c2..536936f 100644
--- a/drivers/ata/sata_highbank.c
+++ b/drivers/ata/sata_highbank.c
@@ -199,7 +199,7 @@ static int highbank_initialize_phys(struct device *dev, void __iomem *addr)
 static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 				unsigned long deadline)
 {
-	const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
+	unsigned long timing[] = { 5, 100, 500};
 	struct ata_port *ap = link->ap;
 	struct ahci_port_priv *pp = ap->private_data;
 	u8 *d2h_fis = pp->rx_fis + RX_FIS_D2H_REG;
@@ -207,7 +207,7 @@ static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 	bool online;
 	u32 sstatus;
 	int rc;
-	int retry = 10;
+	int retry = 100;
 
 	ahci_stop_engine(ap);
 
-- 
1.7.10.4


* Re: [PATCH] ata: increase retry count but shorten duration for Calxeda controller
  2013-05-29 15:51 [PATCH] ata: increase retry count but shorten duration for Calxeda controller Mark Langsdorf
@ 2013-05-29 20:12 ` Timur Tabi
  2013-05-29 20:35   ` Mark Langsdorf
  2013-05-31 15:27 ` [PATCH v3] " Mark Langsdorf
  2013-06-03 13:22 ` [PATCH v4] " Mark Langsdorf
  2 siblings, 1 reply; 11+ messages in thread
From: Timur Tabi @ 2013-05-29 20:12 UTC
  To: Mark Langsdorf; +Cc: lkml, Tejun Heo, linux-ide

On Wed, May 29, 2013 at 10:51 AM, Mark Langsdorf
<mark.langsdorf@calxeda.com> wrote:
>
>  {
> -       const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
> +       unsigned long timing[] = { 5, 100, 500};


You didn't address my comments the last time you posted this.  I'll
post them again:


Why are you dropping the 'const'?

Assuming it works, this should be more efficient:

static const unsigned long timing[] = {5, 100, 500};


* Re: [PATCH] ata: increase retry count but shorten duration for Calxeda controller
  2013-05-29 20:12 ` Timur Tabi
@ 2013-05-29 20:35   ` Mark Langsdorf
  2013-05-30  6:58     ` Tejun Heo
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Langsdorf @ 2013-05-29 20:35 UTC
  To: Timur Tabi; +Cc: lkml, Tejun Heo, linux-ide@vger.kernel.org

On 05/29/2013 03:12 PM, Timur Tabi wrote:
> On Wed, May 29, 2013 at 10:51 AM, Mark Langsdorf
> <mark.langsdorf@calxeda.com> wrote:
>>
>>  {
>> -       const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
>> +       unsigned long timing[] = { 5, 100, 500};
> 
> 
> You didn't address my comments the last time you posted this.  I'll
> post them again:
> 
> 
> Why are you dropping the 'const'?
> 
> Assuming it works, this should be more efficient:
> 
> static const unsigned long timing[] = {5, 100, 500};

I thought there was a compile issue, but I just rechecked and there
wasn't. I'll fix for the next submission.

Thanks for the review.

--Mark Langsdorf



* Re: [PATCH] ata: increase retry count but shorten duration for Calxeda controller
  2013-05-29 20:35   ` Mark Langsdorf
@ 2013-05-30  6:58     ` Tejun Heo
  0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2013-05-30  6:58 UTC
  To: Mark Langsdorf; +Cc: Timur Tabi, lkml, linux-ide@vger.kernel.org

On Wed, May 29, 2013 at 03:35:28PM -0500, Mark Langsdorf wrote:
> On 05/29/2013 03:12 PM, Timur Tabi wrote:
> > On Wed, May 29, 2013 at 10:51 AM, Mark Langsdorf
> > <mark.langsdorf@calxeda.com> wrote:
> >>
> >>  {
> >> -       const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
> >> +       unsigned long timing[] = { 5, 100, 500};
> > 
> > 
> > You didn't address my comments the last time you posted this.  I'll
> > post them again:
> > 
> > 
> > Why are you dropping the 'const'?
> > 
> > Assuming it works, this should be more efficient:
> > 
> > static const unsigned long timing[] = {5, 100, 500};
> 
> I thought there was a compile issue, but I just rechecked and there
> wasn't. I'll fix for the next submission.

Also, please add a comment explaining why those parameters are
necessary and how they're determined - ie. the bulk of the commit
message; otherwise, it looks pretty random.

Thanks.

-- 
tejun


* [PATCH v3] ata: increase retry count but shorten duration for Calxeda controller
  2013-05-29 15:51 [PATCH] ata: increase retry count but shorten duration for Calxeda controller Mark Langsdorf
  2013-05-29 20:12 ` Timur Tabi
@ 2013-05-31 15:27 ` Mark Langsdorf
  2013-06-02  8:00   ` Tejun Heo
  2013-06-03 13:22 ` [PATCH v4] " Mark Langsdorf
  2 siblings, 1 reply; 11+ messages in thread
From: Mark Langsdorf @ 2013-05-31 15:27 UTC
  To: tj, timur, linux-kernel, linux-ide, clemens, sergei.shtylyov
  Cc: Mark Langsdorf

The Calxeda SATA phy intermittently fails to bring up a link with Gen3.
Retrying the phy hard reset can work around the issue, but the drive
may fail again. In less than 150 out of 15000 test runs, it took more
than 10 tries for the link to be established (but never more than 35).
Triple the maximum observed retry count to provide plenty of margin for
rare events and to guarantee that the link is established.

Also, the default 2 second time-out on a failed drive is too long in
this situation. The uboot implementation of the same driver function
uses a much shorter time-out period and never experiences a time out
issue. Shorten the Linux time-out value for this driver to 500 ms and
keep the other timing constants the same as the stock AHCI driver. This
change was also tested 15000 times on 24 drives and none of them
experienced a time out.

Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
---
Changes from v2
	Add static to the timing variable definition
Changes from v1
	Add const to the timing variable definition
	Add more detail on why the various numbers were chosen

 drivers/ata/sata_highbank.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/sata_highbank.c b/drivers/ata/sata_highbank.c
index b20aa96..46ccc1c 100644
--- a/drivers/ata/sata_highbank.c
+++ b/drivers/ata/sata_highbank.c
@@ -199,7 +199,7 @@ static int highbank_initialize_phys(struct device *dev, void __iomem *addr)
 static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 				unsigned long deadline)
 {
-	const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
+	static const unsigned long timing[] = { 5, 100, 500};
 	struct ata_port *ap = link->ap;
 	struct ahci_port_priv *pp = ap->private_data;
 	u8 *d2h_fis = pp->rx_fis + RX_FIS_D2H_REG;
@@ -207,7 +207,7 @@ static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 	bool online;
 	u32 sstatus;
 	int rc;
-	int retry = 10;
+	int retry = 100;
 
 	ahci_stop_engine(ap);
 
-- 
1.8.1.2



* Re: [PATCH v3] ata: increase retry count but shorten duration for Calxeda controller
  2013-05-31 15:27 ` [PATCH v3] " Mark Langsdorf
@ 2013-06-02  8:00   ` Tejun Heo
  2013-06-03 12:27     ` Mark Langsdorf
  0 siblings, 1 reply; 11+ messages in thread
From: Tejun Heo @ 2013-06-02  8:00 UTC
  To: Mark Langsdorf; +Cc: timur, linux-kernel, linux-ide, clemens, sergei.shtylyov

On Fri, May 31, 2013 at 10:27:26AM -0500, Mark Langsdorf wrote:
> The Calxeda SATA phy intermittently fails to bring up a link with Gen3.
> Retrying the phy hard reset can work around the issue, but the drive
> may fail again. In less than 150 out of 15000 test runs, it took more
> than 10 tries for the link to be established (but never more than 35).
> Triple the maximum observed retry count to provide plenty of margin for
> rare events and to guarantee that the link is established.
> 
> Also, the default 2 second time-out on a failed drive is too long in
> this situation. The uboot implementation of the same driver function
> uses a much shorter time-out period and never experiences a time out
> issue. Shorten the Linux time-out value for this driver to 500 ms and
> keep the other timing constants the same as the stock AHCI driver. This
> change was also tested 15000 times on 24 drives and none of them
> experienced a time out.

For the third time, explain the above in the comment; otherwise, it's
not going in.

-- 
tejun


* Re: [PATCH v3] ata: increase retry count but shorten duration for Calxeda controller
  2013-06-02  8:00   ` Tejun Heo
@ 2013-06-03 12:27     ` Mark Langsdorf
  0 siblings, 0 replies; 11+ messages in thread
From: Mark Langsdorf @ 2013-06-03 12:27 UTC
  To: Tejun Heo
  Cc: timur@tabi.org, linux-kernel@vger.kernel.org,
	linux-ide@vger.kernel.org, clemens@ladisch.de,
	sergei.shtylyov@cogentembedded.com

On 06/02/2013 03:00 AM, Tejun Heo wrote:
> On Fri, May 31, 2013 at 10:27:26AM -0500, Mark Langsdorf wrote:
> 
> For the third time, explain the above in the comment; otherwise, it's
> not going in.

Sorry, I completely misread your requirement. I'll move it to the
comment as requested.

--Mark Langsdorf
Calxeda, Inc.



* [PATCH v4] ata: increase retry count but shorten duration for Calxeda controller
  2013-05-29 15:51 [PATCH] ata: increase retry count but shorten duration for Calxeda controller Mark Langsdorf
  2013-05-29 20:12 ` Timur Tabi
  2013-05-31 15:27 ` [PATCH v3] " Mark Langsdorf
@ 2013-06-03 13:22 ` Mark Langsdorf
  2013-06-03 20:39   ` Tejun Heo
  2 siblings, 1 reply; 11+ messages in thread
From: Mark Langsdorf @ 2013-06-03 13:22 UTC
  To: tj, timur, linux-kernel, linux-ide, clemens, sergei.shtylyov
  Cc: Mark Langsdorf

Increase the retry count for the hard reset function to 100 but
shorten the time out period to 500 ms. See the comment for
ahci_highbank_hardreset for the reasons why those values were
chosen.

Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>
---
Changes from v3
	Move the detail to a comment on the ahci_highbank_hardreset function
Changes from v2
	Add static to the timing variable definition
Changes from v1
	Add const to the timing variable definition
	Add more detail on why the various numbers were chosen

 drivers/ata/sata_highbank.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/sata_highbank.c b/drivers/ata/sata_highbank.c
index b20aa96..c846fd3 100644
--- a/drivers/ata/sata_highbank.c
+++ b/drivers/ata/sata_highbank.c
@@ -196,10 +196,26 @@ static int highbank_initialize_phys(struct device *dev, void __iomem *addr)
 	return 0;
 }
 
+/*
+ * The Calxeda SATA phy intermittently fails to bring up a link with Gen3.
+ * Retrying the phy hard reset can work around the issue, but the drive
+ * may fail again. In less than 150 out of 15000 test runs, it took more
+ * than 10 tries for the link to be established (but never more than 35).
+ * Triple the maximum observed retry count to provide plenty of margin for
+ * rare events and to guarantee that the link is established.
+ *
+ * Also, the default 2 second time-out on a failed drive is too long in
+ * this situation. The uboot implementation of the same driver function
+ * uses a much shorter time-out period and never experiences a time out
+ * issue. Reducing the time-out to 500ms improves the responsiveness.
+ * The other timing constants were kept the same as the stock AHCI driver.
+ * This change was also tested 15000 times on 24 drives and none of them
+ * experienced a time out.
+ */
 static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 				unsigned long deadline)
 {
-	const unsigned long *timing = sata_ehc_deb_timing(&link->eh_context);
+	static const unsigned long timing[] = { 5, 100, 500};
 	struct ata_port *ap = link->ap;
 	struct ahci_port_priv *pp = ap->private_data;
 	u8 *d2h_fis = pp->rx_fis + RX_FIS_D2H_REG;
@@ -207,7 +223,7 @@ static int ahci_highbank_hardreset(struct ata_link *link, unsigned int *class,
 	bool online;
 	u32 sstatus;
 	int rc;
-	int retry = 10;
+	int retry = 100;
 
 	ahci_stop_engine(ap);
 
-- 
1.8.1.2
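
The retry budget described in the comment above can be sketched in userspace
as follows. This is a simplified model under stated assumptions, not the
driver code: struct fake_phy, fake_phy_reset() and hardreset_sketch() are
hypothetical names, the phy is modelled as coming up on a fixed attempt, and
the loop omits the register accesses and the debounce wait that the real
function performs. The three timing values are interpreted as
{ interval, duration, timeout } in milliseconds, the layout used by the
libata debounce timing tables.

```c
#include <stdbool.h>

/* { interval, duration, timeout } in ms; values taken from the patch. */
static const unsigned long timing[] = { 5, 100, 500 };

/* Hypothetical phy model: the link comes up on attempt 'succeeds_on'. */
struct fake_phy {
	int attempts;		/* hard resets issued so far */
	int succeeds_on;	/* attempt number on which the link trains */
};

/* Hypothetical stand-in for one phy hard reset; true if the link is up. */
static bool fake_phy_reset(struct fake_phy *phy)
{
	phy->attempts++;
	return phy->attempts >= phy->succeeds_on;
}

/*
 * Simplified retry loop: issue up to max_retry hard resets and stop at
 * the first one that brings the link up. Returns the number of attempts
 * used, or -1 if the budget ran out.
 */
static int hardreset_sketch(struct fake_phy *phy, int max_retry)
{
	int retry = max_retry;

	do {
		if (fake_phy_reset(phy))
			return phy->attempts;
	} while (--retry);

	return -1;
}
```

With the worst case reported in the test runs (35 tries), a budget of 10
gives up before the link trains, while a budget of 100 succeeds with
roughly a threefold margin.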



* Re: [PATCH v4] ata: increase retry count but shorten duration for Calxeda controller
  2013-06-03 13:22 ` [PATCH v4] " Mark Langsdorf
@ 2013-06-03 20:39   ` Tejun Heo
  0 siblings, 0 replies; 11+ messages in thread
From: Tejun Heo @ 2013-06-03 20:39 UTC
  To: Mark Langsdorf; +Cc: timur, linux-kernel, linux-ide, clemens, sergei.shtylyov

On Mon, Jun 03, 2013 at 08:22:54AM -0500, Mark Langsdorf wrote:
> Increase the retry count for the hard reset function to 100 but
> shorten the time out period to 500 ms. See the comment for
> ahci_highbank_hardreset for the reasons why those values were
> chosen.
> 
> Signed-off-by: Mark Langsdorf <mark.langsdorf@calxeda.com>

Applied to libata/for-3.10-fixes w/ stable cc'd.

Thanks!

-- 
tejun
