* [PATCH 1/8] libata: fix EH device failure handling
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-01-29 11:31 ` [PATCH 2/8] libata: move ata_dev_disable() to libata-eh.c Tejun Heo
` (7 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
The dev->pio_mode > XFER_PIO_0 test is there to avoid unnecessary
speed down warning messages but it accidentally disabled SATA link spd
down during configuration phase after reset where PIO mode is always
zero.
This patch fixes the problem by moving the test where it belongs.
This makes libata probing sequence behave better when the connection
is flaky at higher link speeds which isn't too uncommon for eSATA
devices.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-eh.c | 5 +++--
1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 8147a83..c15572d 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -2979,12 +2979,13 @@ static int ata_eh_handle_dev_fail(struct ata_device *dev, int err)
/* give it just one more chance */
ehc->tries[dev->devno] = min(ehc->tries[dev->devno], 1);
case -EIO:
- if (ehc->tries[dev->devno] == 1 && dev->pio_mode > XFER_PIO_0) {
+ if (ehc->tries[dev->devno] == 1) {
/* This is the last chance, better to slow
* down than lose it.
*/
sata_down_spd_limit(ata_dev_phys_link(dev));
- ata_down_xfermask_limit(dev, ATA_DNXFER_PIO);
+ if (dev->pio_mode > XFER_PIO_0)
+ ata_down_xfermask_limit(dev, ATA_DNXFER_PIO);
}
}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 2/8] libata: move ata_dev_disable() to libata-eh.c
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
2009-01-29 11:31 ` [PATCH 1/8] libata: fix EH device failure handling Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-01-29 11:31 ` [PATCH 3/8] libata: check onlineness before using SPD in sata_down_spd_limit() Tejun Heo
` (6 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
ata_dev_disable() is about to be more tightly integrated into EH
logic. Move it to libata-eh.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-core.c | 12 ------------
drivers/ata/libata-eh.c | 21 +++++++++++++++++++++
drivers/ata/libata.h | 2 +-
3 files changed, 22 insertions(+), 13 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 71218d7..20d0e27 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -1015,18 +1015,6 @@ static const char *sata_spd_string(unsigned int spd)
return spd_str[spd - 1];
}
-void ata_dev_disable(struct ata_device *dev)
-{
- if (ata_dev_enabled(dev)) {
- if (ata_msg_drv(dev->link->ap))
- ata_dev_printk(dev, KERN_WARNING, "disabled\n");
- ata_acpi_on_disable(dev);
- ata_down_xfermask_limit(dev, ATA_DNXFER_FORCE_PIO0 |
- ATA_DNXFER_QUIET);
- dev->class++;
- }
-}
-
static int ata_dev_set_dipm(struct ata_device *dev, enum link_pm policy)
{
struct ata_link *link = dev->link;
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index c15572d..aafe82b 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1176,6 +1176,27 @@ void ata_eh_qc_retry(struct ata_queued_cmd *qc)
}
/**
+ * ata_dev_disable - disable ATA device
+ * @dev: ATA device to disable
+ *
+ * Disable @dev.
+ *
+ * Locking:
+ * EH context.
+ */
+void ata_dev_disable(struct ata_device *dev)
+{
+ if (!ata_dev_enabled(dev))
+ return;
+
+ if (ata_msg_drv(dev->link->ap))
+ ata_dev_printk(dev, KERN_WARNING, "disabled\n");
+ ata_acpi_on_disable(dev);
+ ata_down_xfermask_limit(dev, ATA_DNXFER_FORCE_PIO0 | ATA_DNXFER_QUIET);
+ dev->class++;
+}
+
+/**
* ata_eh_detach_dev - detach ATA device
* @dev: ATA device to detach
*
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index fe2839e..0a6f5be 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -79,7 +79,6 @@ extern int ata_build_rw_tf(struct ata_taskfile *tf, struct ata_device *dev,
u64 block, u32 n_block, unsigned int tf_flags,
unsigned int tag);
extern u64 ata_tf_read_block(struct ata_taskfile *tf, struct ata_device *dev);
-extern void ata_dev_disable(struct ata_device *dev);
extern void ata_pio_queue_task(struct ata_port *ap, void *data,
unsigned long delay);
extern void ata_port_flush_task(struct ata_port *ap);
@@ -160,6 +159,7 @@ extern void ata_scsi_error(struct Scsi_Host *host);
extern void ata_port_wait_eh(struct ata_port *ap);
extern void ata_eh_fastdrain_timerfn(unsigned long arg);
extern void ata_qc_schedule_eh(struct ata_queued_cmd *qc);
+extern void ata_dev_disable(struct ata_device *dev);
extern void ata_eh_detach_dev(struct ata_device *dev);
extern void ata_eh_about_to_do(struct ata_link *link, struct ata_device *dev,
unsigned int action);
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 3/8] libata: check onlineness before using SPD in sata_down_spd_limit()
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
2009-01-29 11:31 ` [PATCH 1/8] libata: fix EH device failure handling Tejun Heo
2009-01-29 11:31 ` [PATCH 2/8] libata: move ata_dev_disable() to libata-eh.c Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-01-29 11:31 ` [PATCH 4/8] libata: clear dev->ering in smarter way Tejun Heo
` (5 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
sata_down_spd_limit() should check whether the link is online before
using the SPD value to determine how to limit the link speed. Factor
out onlineness test and test it from sata_down_spd_limit().
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-core.c | 11 ++++++++---
1 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 20d0e27..daf0dec 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -164,6 +164,11 @@ MODULE_LICENSE("GPL");
MODULE_VERSION(DRV_VERSION);
+static bool ata_sstatus_online(u32 sstatus)
+{
+ return (sstatus & 0xf) == 0x3;
+}
+
/**
* ata_link_next - link iteration helper
* @link: the previous link, NULL to start
@@ -2891,7 +2896,7 @@ int sata_down_spd_limit(struct ata_link *link)
* If not, use cached value in link->sata_spd.
*/
rc = sata_scr_read(link, SCR_STATUS, &sstatus);
- if (rc == 0)
+ if (rc == 0 && ata_sstatus_online(sstatus))
spd = (sstatus >> 4) & 0xf;
else
spd = link->sata_spd;
@@ -5161,7 +5166,7 @@ bool ata_phys_link_online(struct ata_link *link)
u32 sstatus;
if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
- (sstatus & 0xf) == 0x3)
+ ata_sstatus_online(sstatus))
return true;
return false;
}
@@ -5185,7 +5190,7 @@ bool ata_phys_link_offline(struct ata_link *link)
u32 sstatus;
if (sata_scr_read(link, SCR_STATUS, &sstatus) == 0 &&
- (sstatus & 0xf) != 0x3)
+ !ata_sstatus_online(sstatus))
return true;
return false;
}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 4/8] libata: clear dev->ering in smarter way
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
` (2 preceding siblings ...)
2009-01-29 11:31 ` [PATCH 3/8] libata: check onlineness before using SPD in sata_down_spd_limit() Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-01-29 11:31 ` [PATCH 5/8] libata: add @spd_limit to sata_down_spd_limit() Tejun Heo
` (4 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
dev->ering used to be cleared together with the rest of ata_device in
ata_dev_init() which is called whenever a probing event occurs.
dev->ering is about to be used to track probing failures so it needs
to remain persistent over multiple porbing events. This patch
achieves this by doing the following.
* Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only
clear between BEGIN and END. ering is moved after END. The split
of persistent area is to allow hotter items remain at the head.
* ering is explicitly cleared on ata_dev_disable() and when device
attach succeeds. So, ering is persistent throug a device's life
time (unless explicitly cleared of course) and also through periods
inbetween disablement of an attached device and successful detection
of the next one.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-core.c | 4 ++--
drivers/ata/libata-eh.c | 7 +++++++
include/linux/libata.h | 18 ++++++++++--------
3 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index daf0dec..6a5b8e2 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5403,8 +5403,8 @@ void ata_dev_init(struct ata_device *dev)
dev->horkage = 0;
spin_unlock_irqrestore(ap->lock, flags);
- memset((void *)dev + ATA_DEVICE_CLEAR_OFFSET, 0,
- sizeof(*dev) - ATA_DEVICE_CLEAR_OFFSET);
+ memset((void *)dev + ATA_DEVICE_CLEAR_BEGIN, 0,
+ ATA_DEVICE_CLEAR_END - ATA_DEVICE_CLEAR_BEGIN);
dev->pio_mask = UINT_MAX;
dev->mwdma_mask = UINT_MAX;
dev->udma_mask = UINT_MAX;
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index aafe82b..90092c1 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1194,6 +1194,11 @@ void ata_dev_disable(struct ata_device *dev)
ata_acpi_on_disable(dev);
ata_down_xfermask_limit(dev, ATA_DNXFER_FORCE_PIO0 | ATA_DNXFER_QUIET);
dev->class++;
+
+ /* From now till the next successful probe, ering is used to
+ * track probe failures. Clear accumulated device error info.
+ */
+ ata_ering_clear(&dev->ering);
}
/**
@@ -2765,6 +2770,8 @@ static int ata_eh_revalidate_and_attach(struct ata_link *link,
readid_flags, dev->id);
switch (rc) {
case 0:
+ /* clear error info accumulated during probe */
+ ata_ering_clear(&dev->ering);
new_mask |= 1 << dev->devno;
break;
case -ENOENT:
diff --git a/include/linux/libata.h b/include/linux/libata.h
index b6b8a7f..7d7bc11 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -576,7 +576,7 @@ struct ata_device {
acpi_handle acpi_handle;
union acpi_object *gtf_cache;
#endif
- /* n_sector is used as CLEAR_OFFSET, read comment above CLEAR_OFFSET */
+ /* n_sector is CLEAR_BEGIN, read comment above CLEAR_BEGIN */
u64 n_sectors; /* size of device, if ATA */
unsigned int class; /* ATA_DEV_xxx */
unsigned long unpark_deadline;
@@ -601,20 +601,22 @@ struct ata_device {
u16 heads; /* Number of heads */
u16 sectors; /* Number of sectors per track */
- /* error history */
- int spdn_cnt;
- struct ata_ering ering;
-
union {
u16 id[ATA_ID_WORDS]; /* IDENTIFY xxx DEVICE data */
u32 gscr[SATA_PMP_GSCR_DWORDS]; /* PMP GSCR block */
};
+
+ /* error history */
+ int spdn_cnt;
+ /* ering is CLEAR_END, read comment above CLEAR_END */
+ struct ata_ering ering;
};
-/* Offset into struct ata_device. Fields above it are maintained
- * acress device init. Fields below are zeroed.
+/* Fields between ATA_DEVICE_CLEAR_BEGIN and ATA_DEVICE_CLEAR_END are
+ * cleared to zero on ata_dev_init().
*/
-#define ATA_DEVICE_CLEAR_OFFSET offsetof(struct ata_device, n_sectors)
+#define ATA_DEVICE_CLEAR_BEGIN offsetof(struct ata_device, n_sectors)
+#define ATA_DEVICE_CLEAR_END offsetof(struct ata_device, ering)
struct ata_eh_info {
struct ata_device *dev; /* offending device */
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 5/8] libata: add @spd_limit to sata_down_spd_limit()
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
` (3 preceding siblings ...)
2009-01-29 11:31 ` [PATCH 4/8] libata: clear dev->ering in smarter way Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-01-29 11:31 ` [PATCH 6/8] libata: improve probe failure handling Tejun Heo
` (3 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
Add @spd_limit to sata_down_spd_limit() so that the caller can specify
the SPD limit it wants. This parameter doesn't get in the way even
when it's too low. The closest possible limit is applied.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-core.c | 25 ++++++++++++++++++++-----
drivers/ata/libata-eh.c | 10 +++++-----
drivers/ata/libata-pmp.c | 2 +-
drivers/ata/libata.h | 2 +-
4 files changed, 27 insertions(+), 12 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 6a5b8e2..0cc0d8f 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2777,7 +2777,7 @@ int ata_bus_probe(struct ata_port *ap)
/* This is the last chance, better to slow
* down than lose it.
*/
- sata_down_spd_limit(&ap->link);
+ sata_down_spd_limit(&ap->link, 0);
ata_down_xfermask_limit(dev, ATA_DNXFER_PIO);
}
}
@@ -2873,21 +2873,27 @@ void ata_port_disable(struct ata_port *ap)
/**
* sata_down_spd_limit - adjust SATA spd limit downward
* @link: Link to adjust SATA spd limit for
+ * @spd_limit: Additional limit
*
* Adjust SATA spd limit of @link downward. Note that this
* function only adjusts the limit. The change must be applied
* using sata_set_spd().
*
+ * If @spd_limit is non-zero, the speed is limited to equal to or
+ * lower than @spd_limit if such speed is supported. If
+ * @spd_limit is slower than any supported speed, only the lowest
+ * supported speed is allowed.
+ *
* LOCKING:
* Inherited from caller.
*
* RETURNS:
* 0 on success, negative errno on failure
*/
-int sata_down_spd_limit(struct ata_link *link)
+int sata_down_spd_limit(struct ata_link *link, u32 spd_limit)
{
u32 sstatus, spd, mask;
- int rc, highbit;
+ int rc, bit;
if (!sata_scr_valid(link))
return -EOPNOTSUPP;
@@ -2906,8 +2912,8 @@ int sata_down_spd_limit(struct ata_link *link)
return -EINVAL;
/* unconditionally mask off the highest bit */
- highbit = fls(mask) - 1;
- mask &= ~(1 << highbit);
+ bit = fls(mask) - 1;
+ mask &= ~(1 << bit);
/* Mask off all speeds higher than or equal to the current
* one. Force 1.5Gbps if current SPD is not available.
@@ -2921,6 +2927,15 @@ int sata_down_spd_limit(struct ata_link *link)
if (!mask)
return -EINVAL;
+ if (spd_limit) {
+ if (mask & ((1 << spd_limit) - 1))
+ mask &= (1 << spd_limit) - 1;
+ else {
+ bit = ffs(mask) - 1;
+ mask = 1 << bit;
+ }
+ }
+
link->sata_spd_limit = mask;
ata_link_printk(link, KERN_WARNING, "limiting SATA link speed to %s\n",
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 90092c1..685509b 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -1875,7 +1875,7 @@ static unsigned int ata_eh_speed_down(struct ata_device *dev,
/* speed down? */
if (verdict & ATA_EH_SPDN_SPEED_DOWN) {
/* speed down SATA link speed if possible */
- if (sata_down_spd_limit(link) == 0) {
+ if (sata_down_spd_limit(link, 0) == 0) {
action |= ATA_EH_RESET;
goto done;
}
@@ -2627,11 +2627,11 @@ int ata_eh_reset(struct ata_link *link, int classify,
}
if (try == max_tries - 1) {
- sata_down_spd_limit(link);
+ sata_down_spd_limit(link, 0);
if (slave)
- sata_down_spd_limit(slave);
+ sata_down_spd_limit(slave, 0);
} else if (rc == -EPIPE)
- sata_down_spd_limit(failed_link);
+ sata_down_spd_limit(failed_link, 0);
if (hardreset)
reset = hardreset;
@@ -3011,7 +3011,7 @@ static int ata_eh_handle_dev_fail(struct ata_device *dev, int err)
/* This is the last chance, better to slow
* down than lose it.
*/
- sata_down_spd_limit(ata_dev_phys_link(dev));
+ sata_down_spd_limit(ata_dev_phys_link(dev), 0);
if (dev->pio_mode > XFER_PIO_0)
ata_down_xfermask_limit(dev, ATA_DNXFER_PIO);
}
diff --git a/drivers/ata/libata-pmp.c b/drivers/ata/libata-pmp.c
index 98ca07a..619f2c3 100644
--- a/drivers/ata/libata-pmp.c
+++ b/drivers/ata/libata-pmp.c
@@ -729,7 +729,7 @@ static int sata_pmp_eh_recover_pmp(struct ata_port *ap,
if (tries) {
/* consecutive revalidation failures? speed down */
if (reval_failed)
- sata_down_spd_limit(link);
+ sata_down_spd_limit(link, 0);
else
reval_failed = 1;
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index 0a6f5be..cea8014 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -99,7 +99,7 @@ extern int ata_dev_reread_id(struct ata_device *dev, unsigned int readid_flags);
extern int ata_dev_revalidate(struct ata_device *dev, unsigned int new_class,
unsigned int readid_flags);
extern int ata_dev_configure(struct ata_device *dev);
-extern int sata_down_spd_limit(struct ata_link *link);
+extern int sata_down_spd_limit(struct ata_link *link, u32 spd_limit);
extern int ata_down_xfermask_limit(struct ata_device *dev, unsigned int sel);
extern void ata_sg_clean(struct ata_queued_cmd *qc);
extern void ata_qc_free(struct ata_queued_cmd *qc);
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 6/8] libata: improve probe failure handling
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
` (4 preceding siblings ...)
2009-01-29 11:31 ` [PATCH 5/8] libata: add @spd_limit to sata_down_spd_limit() Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-01-29 11:31 ` [PATCH 7/8] libata: add no penalty retry request for EH device handling routines Tejun Heo
` (2 subsequent siblings)
8 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
When link is flaky at high speed, it isn't uncommon for a device to
repeatedly fail probing sequence early after successfully negotiating
high link speed. This often leads to consecutive hotplug events
without successful probing.
This patch improves libata EH such that it remembers probing trials
and if there have been more than two unsuccessful trials in the past
60 seconds, slows down link speed to 1.5Gbps.
As link speed negotiation is the duty of the PHY layer proper, the
goal of this fallback mechanism is to provide the last resort when
everything else fails, which unfortunately happens not too
infrequently, so no fancy 6->3->1.5 speeding down or highest
successful transmission speed seen kind of logics (yet).
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-eh.c | 38 ++++++++++++++++++++++++++++++++++++++
1 files changed, 38 insertions(+), 0 deletions(-)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 685509b..caa1e9f 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -82,6 +82,10 @@ enum {
ATA_EH_FASTDRAIN_INTERVAL = 3000,
ATA_EH_UA_TRIES = 5,
+
+ /* probe speed down parameters, see ata_eh_schedule_probe() */
+ ATA_EH_PROBE_TRIAL_INTERVAL = 60000, /* 1 min */
+ ATA_EH_PROBE_TRIALS = 2,
};
/* The following table determines how we sequence resets. Each entry
@@ -2975,9 +2979,24 @@ static int ata_eh_skip_recovery(struct ata_link *link)
return 1;
}
+static int ata_count_probe_trials_cb(struct ata_ering_entry *ent, void *void_arg)
+{
+ u64 interval = msecs_to_jiffies(ATA_EH_PROBE_TRIAL_INTERVAL);
+ u64 now = get_jiffies_64();
+ int *trials = void_arg;
+
+ if (ent->timestamp < now - min(now, interval))
+ return -1;
+
+ (*trials)++;
+ return 0;
+}
+
static int ata_eh_schedule_probe(struct ata_device *dev)
{
struct ata_eh_context *ehc = &dev->link->eh_context;
+ struct ata_link *link = ata_dev_phys_link(dev);
+ int trials = 0;
if (!(ehc->i.probe_mask & (1 << dev->devno)) ||
(ehc->did_probe_mask & (1 << dev->devno)))
@@ -2990,6 +3009,25 @@ static int ata_eh_schedule_probe(struct ata_device *dev)
ehc->saved_xfer_mode[dev->devno] = 0;
ehc->saved_ncq_enabled &= ~(1 << dev->devno);
+ /* Record and count probe trials on the ering. The specific
+ * error mask used is irrelevant. Because a successful device
+ * detection clears the ering, this count accumulates only if
+ * there are consecutive failed probes.
+ *
+ * If the count is equal to or higher than ATA_EH_PROBE_TRIALS
+ * in the last ATA_EH_PROBE_TRIAL_INTERVAL, link speed is
+ * forced to 1.5Gbps.
+ *
+ * This is to work around cases where failed link speed
+ * negotiation results in device misdetection leading to
+ * infinite DEVXCHG or PHRDY CHG events.
+ */
+ ata_ering_record(&dev->ering, 0, AC_ERR_OTHER);
+ ata_ering_map(&dev->ering, ata_count_probe_trials_cb, &trials);
+
+ if (trials > ATA_EH_PROBE_TRIALS)
+ sata_down_spd_limit(link, 1);
+
return 1;
}
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 7/8] libata: add no penalty retry request for EH device handling routines
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
` (5 preceding siblings ...)
2009-01-29 11:31 ` [PATCH 6/8] libata: improve probe failure handling Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-01-29 11:31 ` [PATCH 8/8] libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book Tejun Heo
2009-01-29 14:07 ` [PATCHSET #upstream-fixes] libata: improve flaky link handling Greg Freemyer
8 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
Let -EAGAIN from EH device handling routines trigger EH retry without
consuming its tries count. This will be used to implement link SPD
horkage which requires hardreset to adjust SPD without affecting other
EH decisions. As it bypasses the forward progress guarantee provided
by the tries count, the requester is responsible for ensuring forward
progress.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-eh.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index caa1e9f..ce2ef04 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -3035,7 +3035,11 @@ static int ata_eh_handle_dev_fail(struct ata_device *dev, int err)
{
struct ata_eh_context *ehc = &dev->link->eh_context;
- ehc->tries[dev->devno]--;
+ /* -EAGAIN from EH routine indicates retry without prejudice.
+ * The requester is responsible for ensuring forward progress.
+ */
+ if (err != -EAGAIN)
+ ehc->tries[dev->devno]--;
switch (err) {
case -ENODEV:
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* [PATCH 8/8] libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
` (6 preceding siblings ...)
2009-01-29 11:31 ` [PATCH 7/8] libata: add no penalty retry request for EH device handling routines Tejun Heo
@ 2009-01-29 11:31 ` Tejun Heo
2009-02-03 4:04 ` Jeff Garzik
2009-01-29 14:07 ` [PATCHSET #upstream-fixes] libata: improve flaky link handling Greg Freemyer
8 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 11:31 UTC (permalink / raw)
To: jeff, linux-ide; +Cc: Tejun Heo
3Gbps is often much more prone to transmission failures. It's usually
okay to let EH handle speed down after transmission failures but some
WD My Book drives completely shutdown after certain transmission
failures and after it only power cycling can revive them. Combined
with the fact that external drives often end up with cable assembly
which is longer than usual and more likely to have intervening gender,
this makes these drives very likely to shutdown under certain
configurations virtually rendering them unusable.
This patch implements HOARKGE_1_5_GBPS and applies it to WD My Book
such that 1.5Gbps is forced once the device is identified.
Please take a look at the following bz for related reports.
http://bugzilla.kernel.org/show_bug.cgi?id=9913
Signed-off-by: Tejun Heo <tj@kernel.org>
---
drivers/ata/libata-core.c | 41 +++++++++++++++++++++++++++++++++++++++++
include/linux/libata.h | 1 +
2 files changed, 42 insertions(+), 0 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0cc0d8f..4394861 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2232,6 +2232,40 @@ retry:
return rc;
}
+static int ata_do_link_spd_horkage(struct ata_device *dev)
+{
+ struct ata_link *plink = ata_dev_phys_link(dev);
+ u32 target, target_limit;
+
+ if (!sata_scr_valid(plink))
+ return 0;
+
+ if (dev->horkage & ATA_HORKAGE_1_5_GBPS)
+ target = 1;
+ else
+ return 0;
+
+ target_limit = (1 << target) - 1;
+
+ /* if already on stricter limit, no need to push further */
+ if (plink->sata_spd_limit <= target_limit)
+ return 0;
+
+ plink->sata_spd_limit = target_limit;
+
+ /* Request another EH round by returning -EAGAIN if link is
+ * going faster than the target speed. Forward progress is
+ * guaranteed by setting sata_spd_limit to target_limit above.
+ */
+ if (plink->sata_spd > target) {
+ ata_dev_printk(dev, KERN_INFO,
+ "applying link speed limit horkage to %s\n",
+ sata_spd_string(target));
+ return -EAGAIN;
+ }
+ return 0;
+}
+
static inline u8 ata_dev_knobble(struct ata_device *dev)
{
struct ata_port *ap = dev->link->ap;
@@ -2322,6 +2356,10 @@ int ata_dev_configure(struct ata_device *dev)
return 0;
}
+ rc = ata_do_link_spd_horkage(dev);
+ if (rc)
+ return rc;
+
/* let ACPI work its magic */
rc = ata_acpi_on_devcfg(dev);
if (rc)
@@ -4221,6 +4259,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
/* Devices that do not need bridging limits applied */
{ "MTRON MSP-SATA*", NULL, ATA_HORKAGE_BRIDGE_OK, },
+ /* Devices which aren't very happy with higher link speeds */
+ { "WD My Book", NULL, ATA_HORKAGE_1_5_GBPS, },
+
/* End Marker */
{ }
};
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 7d7bc11..31ce91d 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -378,6 +378,7 @@ enum {
ATA_HORKAGE_ATAPI_MOD16_DMA = (1 << 11), /* use ATAPI DMA for commands
not multiple of 16 bytes */
ATA_HORKAGE_FIRMWARE_WARN = (1 << 12), /* firwmare update warning */
+ ATA_HORKAGE_1_5_GBPS = (1 << 13), /* force 1.5 Gbps */
/* DMA mask for user DMA control: User visible values; DO NOT
renumber */
--
1.6.0.2
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: [PATCH 8/8] libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book
2009-01-29 11:31 ` [PATCH 8/8] libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book Tejun Heo
@ 2009-02-03 4:04 ` Jeff Garzik
0 siblings, 0 replies; 13+ messages in thread
From: Jeff Garzik @ 2009-02-03 4:04 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-ide
Tejun Heo wrote:
> 3Gbps is often much more prone to transmission failures. It's usually
> okay to let EH handle speed down after transmission failures but some
> WD My Book drives completely shutdown after certain transmission
> failures and after it only power cycling can revive them. Combined
> with the fact that external drives often end up with cable assembly
> which is longer than usual and more likely to have intervening gender,
> this makes these drives very likely to shutdown under certain
> configurations virtually rendering them unusable.
>
> This patch implements HOARKGE_1_5_GBPS and applies it to WD My Book
> such that 1.5Gbps is forced once the device is identified.
>
> Please take a look at the following bz for related reports.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=9913
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
applied 1-8
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHSET #upstream-fixes] libata: improve flaky link handling
2009-01-29 11:31 [PATCHSET #upstream-fixes] libata: improve flaky link handling Tejun Heo
` (7 preceding siblings ...)
2009-01-29 11:31 ` [PATCH 8/8] libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book Tejun Heo
@ 2009-01-29 14:07 ` Greg Freemyer
2009-01-29 15:50 ` Mark Lord
8 siblings, 1 reply; 13+ messages in thread
From: Greg Freemyer @ 2009-01-29 14:07 UTC (permalink / raw)
To: Tejun Heo; +Cc: jeff, linux-ide
On Thu, Jan 29, 2009 at 6:31 AM, Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> This patchset is kicked off by kernel bz#9913 which reports hot
> plugging and general detection problems with WD My Book drives. WD
> was kind enough to provide me with these drives and I played quite a
> bit with a few controllers and weird cabling configurations including
> long cables and several genders.
>
> It was surprisingly easy to trigger link layer problems. eSATA ports
> often go through extra cable and gender. eSATA cables tend to be
> longer than internal ones and as the cables are much stiffer touching
> or moving the cable can easily cause enough wiggle at the connectors
> to cause temporary transmission problems and being outside, it's much
> more likely to be moved.
>
> So, with abundant link quality problems, the following problems have
> been identified.
Tejun,
This looks like an important patchset for those of us that use eSata.
Any chance these can get into the opensuse 11.1 kernels? Or lacking
that, the os factory kernels?
Thanks
Greg
--
Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper -
http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: [PATCHSET #upstream-fixes] libata: improve flaky link handling
2009-01-29 14:07 ` [PATCHSET #upstream-fixes] libata: improve flaky link handling Greg Freemyer
@ 2009-01-29 15:50 ` Mark Lord
2009-01-29 23:57 ` Tejun Heo
0 siblings, 1 reply; 13+ messages in thread
From: Mark Lord @ 2009-01-29 15:50 UTC (permalink / raw)
To: Greg Freemyer; +Cc: Tejun Heo, jeff, linux-ide
Greg Freemyer wrote:
> On Thu, Jan 29, 2009 at 6:31 AM, Tejun Heo <tj@kernel.org> wrote:
>> Hello,
>>
>> This patchset is kicked off by kernel bz#9913 which reports hot
>> plugging and general detection problems with WD My Book drives. WD
>> was kind enough to provide me with these drives and I played quite a
>> bit with a few controllers and weird cabling configurations including
>> long cables and several genders.
>>
>> It was surprisingly easy to trigger link layer problems. eSATA ports
>> often go through extra cable and gender. eSATA cables tend to be
>> longer than internal ones and as the cables are much stiffer touching
>> or moving the cable can easily cause enough wiggle at the connectors
>> to cause temporary transmission problems and being outside, it's much
>> more likely to be moved.
>>
>> So, with abundant link quality problems, the following problems have
>> been identified.
> Tejun,
>
> This looks like an important patchset for those of us that use eSata.
> Any chance these can get into the opensuse 11.1 kernels? Or lacking
> that, the os factory kernels?
..
Based on experience here, I'd say the problem isn't terribly widespread
beyond the specific vendor Tejun identified. I have several eSATA setups
at hand here, and never see flaky link behaviour on them.
Cheers
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCHSET #upstream-fixes] libata: improve flaky link handling
2009-01-29 15:50 ` Mark Lord
@ 2009-01-29 23:57 ` Tejun Heo
0 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2009-01-29 23:57 UTC (permalink / raw)
To: Mark Lord; +Cc: Greg Freemyer, jeff, linux-ide
Hello, Mark, Greg.
Mark Lord wrote:
> Greg Freemyer wrote:
>> On Thu, Jan 29, 2009 at 6:31 AM, Tejun Heo <tj@kernel.org> wrote:
>>> So, with abundant link quality problems, the following problems have
>>> been identified.
>>
>> This looks like an important patchset for those of us that use eSata.
>> Any chance these can get into the opensuse 11.1 kernels? Or lacking
>> that, the os factory kernels?
I do agree it's something I'd like to see deployed fast but I'm not
really sure. Any change is dangerous and nothing sucks more than when
a machine fails to boot after a update and I did that at least once
already. :-( I'll surely get the first patch into the tree but for the
rest I think I'll wait a bit.
> Based on experience here, I'd say the problem isn't terribly widespread
> beyond the specific vendor Tejun identified. I have several eSATA setups
> at hand here, and never see flaky link behaviour on them.
Based on my experience with PMPs and these drives, problems like these
are not really dependent on which device or controller is at the end
of the cabling but depends on the length and quality of cabling. The
controller - device combination does have some influence, especially
on 3Gbps but I didn't find the WD drives to be exceptionally fragile
other than the shutdown-after-transmission-error problem.
I'm using about 30cm of internal cable + sata-eSATA gender + ~2M eSATA
cable and it's not too difficult to cause problem with any device with
such cabling. The cabling is on the extreme side (may even be out of
spec) but given the bug reports I get for eSATA devices including
PMPs, I don't think this kind of cabling is unfortunately not too
uncommon. As I wrote before, what makes it worse is the stiffness of
the eSATA cable, I can easily cause enough wiggle at the connectors by
touching the cable and it easily makes the device crap out especially
at 3Gbps.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 13+ messages in thread