[PATCHSET 03/11] new EH implementation, take 3

linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCHSET 03/11] new EH implementation, take 3
@ 2006-05-11 13:21 Tejun Heo
  2006-05-11 13:21 ` [PATCH 05/11] libata-eh: implement new EH Tejun Heo
                   ` (11 more replies)
  0 siblings, 12 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide,
	htejun

Hello,

This is part of patchset series described in [T].

This is the third take of new-EH-implementation patchset.  Changes
from the last take[L] are

* (struct ata_eh_info) ap->eh_info and (struct ata_eh_context)
  ap->eh_context are added.  ehi contains all the information other
  parts of libata needs to communicate to EH, and ehc contains all the
  context EH needs to operate.  ehi is protected by host_set lock
  while ehc is owned by EH.  ehi/ehc act as standardized communication
  channel between LLDD and EH, and inside EH itself.  This simplifies
  EH implementation.

  ehc contains a struct eh_info and, on entry to EH, ap->eh_info is
  copied into ap->eh_context and cleared.  This allows LLDD to record
  exceptions detected during EH to ap->eh_info such that no exception
  info is lost (combined with ATA_FLAG_EH_PENDING) and EH to access
  ehc all it wants without bothering about synchronization.

  Another benefit of storing EH context into ap->eh_context is that it
  can easily be extended to handle multiple links.

* Due to above change, EH code are changed a lot, but basic behavior
  and function organization isn't changed too much.  One of the most
  visible change is EH interface.  As the information can be passed
  easily via ap->eh_info, the interface got simplified.  Currently,
  the only exported interface is ata_do_eh() and it seems enough for
  all the converted drivers - ata_piix, sata_sil, ahci and sata_sil24.
  EH is still done in clearly separated stages, so exporting more
  isn't difficult if needed.

* ahci and sata_sil24 now analyzes and records errors while handling
  the error IRQ.  This is more natural way (e.g. PM errors must be
  dealt with this way) and ehi makes it much easier to do.

* Single failed qc per device assumption is removed.  Multiple or no
  failed command cases are properly handled.  Also, report format is
  improved.  In general, EH is much more mature now.

* Frozen port handling in ata_interrupt() dropped.  Jeff was against
  it and I'm not very sure.

* AHCI PIO setup interrupts for multisector PIO commands don't trigger
  spurious interrupt wnarning anymore.

* Patches are in bigger lumps.  This applies to all other patchsets
  too.

This patchset is against

  upstream (acc696d93dcf993dec123d69d599979e1456ffec)
  + [1] prep-for-new-EH patchset
  + [2] new-EH-framework patchset, take 3

--
tejun

[T] http://article.gmane.org/gmane.linux.ide/9957
[L] http://article.gmane.org/gmane.linux.ide/9540
[1] http://article.gmane.org/gmane.linux.ide/9959
[2] http://article.gmane.org/gmane.linux.ide/9984

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 05/11] libata-eh: implement new EH
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-13 22:19   ` Jeff Garzik
  2006-05-11 13:21 ` [PATCH 04/11] libata-eh: implement ata_eh_info and ata_eh_context Tejun Heo
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

Implement new EH.  The exported interface is ata_do_eh() which is to
be called from ->error_handler and performs the following steps to
recover the failed port.

ata_eh_autopsy() : analyze SError/TF, determine the cause of failure
		   and required recovery actions and record it in
		   ap->eh_context
ata_eh_report()	 : report the failure to user
ata_eh_recover() : perform recovery actions described in ap->eh_context
ata_eh_finish()	 : finish failed qcs

LLDDs can customize error handling by modifying eh_context before
calling ata_do_eh() or, if necessary, doing so inbetween each major
steps by calling each step explicitly.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/libata-core.c |    1 
 drivers/scsi/libata-eh.c   |  775 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/libata.h     |    5 
 3 files changed, 781 insertions(+), 0 deletions(-)

09973781fb6f52e6592be1aeb155547c3941370d
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index afb7796..35faf66 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -5440,3 +5440,4 @@ EXPORT_SYMBOL_GPL(ata_eh_freeze_port);
 EXPORT_SYMBOL_GPL(ata_eh_thaw_port);
 EXPORT_SYMBOL_GPL(ata_eh_qc_complete);
 EXPORT_SYMBOL_GPL(ata_eh_qc_retry);
+EXPORT_SYMBOL_GPL(ata_do_eh);
diff --git a/drivers/scsi/libata-eh.c b/drivers/scsi/libata-eh.c
index 5a76e22..41b36c5 100644
--- a/drivers/scsi/libata-eh.c
+++ b/drivers/scsi/libata-eh.c
@@ -632,3 +632,778 @@ void ata_eh_qc_retry(struct ata_queued_c
 		scmd->retries--;
 	__ata_eh_qc_complete(qc);
 }
+
+/**
+ *	ata_eh_about_to_do - about to perform eh_action
+ *	@ap: target ATA port
+ *	@action: action about to be performed
+ *
+ *	Called just before performing EH actions to clear related bits
+ *	in @ap->eh_info such that eh actions are not unnecessarily
+ *	repeated.
+ *
+ *	LOCKING:
+ *	None.
+ */
+static void ata_eh_about_to_do(struct ata_port *ap, unsigned int action)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&ap->host_set->lock, flags);
+	ap->eh_info.action &= ~action;
+	ap->flags |= ATA_FLAG_RECOVERED;
+	spin_unlock_irqrestore(&ap->host_set->lock, flags);
+}
+
+/**
+ *	ata_err_string - convert err_mask to descriptive string
+ *	@err_mask: error mask to convert to string
+ *
+ *	Convert @err_mask to descriptive string.  Errors are
+ *	prioritized according to severity and only the most severe
+ *	error is reported.
+ *
+ *	LOCKING:
+ *	None.
+ *
+ *	RETURNS:
+ *	Descriptive string for @err_mask
+ */
+static const char * ata_err_string(unsigned int err_mask)
+{
+	if (err_mask & AC_ERR_HOST_BUS)
+		return "host bus error";
+	if (err_mask & AC_ERR_ATA_BUS)
+		return "ATA bus error";
+	if (err_mask & AC_ERR_TIMEOUT)
+		return "timeout";
+	if (err_mask & AC_ERR_HSM)
+		return "HSM violation";
+	if (err_mask & AC_ERR_SYSTEM)
+		return "internal error";
+	if (err_mask & AC_ERR_MEDIA)
+		return "media error";
+	if (err_mask & AC_ERR_INVALID)
+		return "invalid argument";
+	if (err_mask & AC_ERR_DEV)
+		return "device error";
+	return "unknown error";
+}
+
+/**
+ *	atapi_eh_request_sense - perform ATAPI REQUEST_SENSE
+ *	@dev: device to perform REQUEST_SENSE to
+ *	@sense_buf: result sense data buffer (SCSI_SENSE_BUFFERSIZE bytes long)
+ *
+ *	Perform ATAPI REQUEST_SENSE after the device reported CHECK
+ *	SENSE.  This function is EH helper.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	0 on success, AC_ERR_* mask on failure
+ */
+static unsigned int atapi_eh_request_sense(struct ata_device *dev,
+					   unsigned char *sense_buf)
+{
+	struct ata_port *ap = dev->ap;
+	struct ata_taskfile tf;
+	u8 cdb[ATAPI_CDB_LEN];
+
+	DPRINTK("ATAPI request sense\n");
+
+	ata_tf_init(dev, &tf);
+
+	/* FIXME: is this needed? */
+	memset(sense_buf, 0, SCSI_SENSE_BUFFERSIZE);
+
+	/* XXX: why tf_read here? */
+	ap->ops->tf_read(ap, &tf);
+
+	/* fill these in, for the case where they are -not- overwritten */
+	sense_buf[0] = 0x70;
+	sense_buf[2] = tf.feature >> 4;
+
+	memset(cdb, 0, ATAPI_CDB_LEN);
+	cdb[0] = REQUEST_SENSE;
+	cdb[4] = SCSI_SENSE_BUFFERSIZE;
+
+	tf.flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
+	tf.command = ATA_CMD_PACKET;
+
+	/* is it pointless to prefer PIO for "safety reasons"? */
+	if (ap->flags & ATA_FLAG_PIO_DMA) {
+		tf.protocol = ATA_PROT_ATAPI_DMA;
+		tf.feature |= ATAPI_PKT_DMA;
+	} else {
+		tf.protocol = ATA_PROT_ATAPI;
+		tf.lbam = (8 * 1024) & 0xff;
+		tf.lbah = (8 * 1024) >> 8;
+	}
+
+	return ata_exec_internal(dev, &tf, cdb, DMA_FROM_DEVICE,
+				 sense_buf, SCSI_SENSE_BUFFERSIZE);
+}
+
+/**
+ *	ata_eh_analyze_serror - analyze SError for a failed port
+ *	@ap: ATA port to analyze SError for
+ *
+ *	Analyze SError if available and further determine cause of
+ *	failure.
+ *
+ *	LOCKING:
+ *	None.
+ */
+static void ata_eh_analyze_serror(struct ata_port *ap)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+	u32 serror = ehc->i.serror;
+	unsigned int err_mask = 0, action = 0;
+
+	if (serror & SERR_PERSISTENT) {
+		err_mask |= AC_ERR_ATA_BUS;
+		action |= ATA_EH_HARDRESET;
+	}
+	if (serror &
+	    (SERR_DATA_RECOVERED | SERR_COMM_RECOVERED | SERR_DATA)) {
+		err_mask |= AC_ERR_ATA_BUS;
+		action |= ATA_EH_SOFTRESET;
+	}
+	if (serror & SERR_PROTOCOL) {
+		err_mask |= AC_ERR_HSM;
+		action |= ATA_EH_SOFTRESET;
+	}
+	if (serror & SERR_INTERNAL) {
+		err_mask |= AC_ERR_SYSTEM;
+		action |= ATA_EH_SOFTRESET;
+	}
+	if (serror & (SERR_PHYRDY_CHG | SERR_DEV_XCHG)) {
+		err_mask |= AC_ERR_ATA_BUS;
+		action |= ATA_EH_HARDRESET;
+	}
+
+	ehc->i.err_mask |= err_mask;
+	ehc->i.action |= action;
+}
+
+/**
+ *	ata_eh_analyze_tf - analyze taskfile of a failed qc
+ *	@qc: qc to analyze
+ *	@tf: Taskfile registers to analyze
+ *
+ *	Analyze taskfile of @qc and further determine cause of
+ *	failure.  This function also requests ATAPI sense data if
+ *	avaliable.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	Determined recovery action
+ */
+static unsigned int ata_eh_analyze_tf(struct ata_queued_cmd *qc,
+				      const struct ata_taskfile *tf)
+{
+	unsigned int tmp, action = 0;
+	u8 stat = tf->command, err = tf->feature;
+
+	if ((stat & (ATA_BUSY | ATA_DRQ | ATA_DRDY)) != ATA_DRDY) {
+		qc->err_mask |= AC_ERR_HSM;
+		return ATA_EH_SOFTRESET;
+	}
+
+	if (!(qc->err_mask & AC_ERR_DEV))
+		return 0;
+
+	switch (qc->dev->class) {
+	case ATA_DEV_ATA:
+		if (err & ATA_ICRC)
+			qc->err_mask |= AC_ERR_ATA_BUS;
+		if (err & ATA_UNC)
+			qc->err_mask |= AC_ERR_MEDIA;
+		if (err & ATA_IDNF)
+			qc->err_mask |= AC_ERR_INVALID;
+		break;
+
+	case ATA_DEV_ATAPI:
+		tmp = atapi_eh_request_sense(qc->dev,
+					     qc->scsicmd->sense_buffer);
+		if (!tmp) {
+			/* ATA_QCFLAG_SENSE_VALID is used to tell
+			 * atapi_qc_complete() that sense data is
+			 * already valid.
+			 *
+			 * TODO: interpret sense data and set
+			 * appropriate err_mask.
+			 */
+			qc->flags |= ATA_QCFLAG_SENSE_VALID;
+		} else
+			qc->err_mask |= tmp;
+	}
+
+	if (qc->err_mask & (AC_ERR_HSM | AC_ERR_TIMEOUT | AC_ERR_ATA_BUS))
+		action |= ATA_EH_SOFTRESET;
+
+	return action;
+}
+
+static int ata_eh_categorize_ering_entry(struct ata_ering_entry *ent)
+{
+	if (ent->err_mask & (AC_ERR_ATA_BUS | AC_ERR_TIMEOUT))
+		return 1;
+
+	if (ent->is_io) {
+		if (ent->err_mask & AC_ERR_HSM)
+			return 1;
+		if ((ent->err_mask &
+		     (AC_ERR_DEV|AC_ERR_MEDIA|AC_ERR_INVALID)) == AC_ERR_DEV)
+			return 2;
+	}
+
+	return 0;
+}
+
+struct speed_down_needed_arg {
+	u64 since;
+	int nr_errors[3];
+};
+
+static int speed_down_needed_cb(struct ata_ering_entry *ent, void *void_arg)
+{
+	struct speed_down_needed_arg *arg = void_arg;
+
+	if (ent->timestamp < arg->since)
+		return -1;
+
+	arg->nr_errors[ata_eh_categorize_ering_entry(ent)]++;
+	return 0;
+}
+
+/**
+ *	ata_eh_speed_down_needed - Determine wheter speed down is necessary
+ *	@dev: Device of interest
+ *
+ *	This function examines error ring of @dev and determines
+ *	whether speed down is necessary.  Speed down is necessary if
+ *	there have been more than 3 of Cat-1 errors or 10 of Cat-2
+ *	errors during last 15 minutes.
+ *
+ *	Cat-1 errors are ATA_BUS, TIMEOUT for any command and HSM
+ *	violation for known supported commands.
+ *
+ *	Cat-2 errors are unclassified DEV error for known supported
+ *	command.
+ *
+ *	LOCKING:
+ *	Inherited from caller.
+ *
+ *	RETURNS:
+ *	1 if speed down is necessary, 0 otherwise
+ */
+static int ata_eh_speed_down_needed(struct ata_device *dev)
+{
+	const u64 interval = 15LLU * 60 * HZ;
+	static const int err_limits[3] = { -1, 3, 10 };
+	struct speed_down_needed_arg arg;
+	struct ata_ering_entry *ent;
+	int err_cat;
+	u64 j64;
+
+	ent = ata_ering_top(&dev->ering);
+	if (!ent)
+		return 0;
+
+	err_cat = ata_eh_categorize_ering_entry(ent);
+	if (err_cat == 0)
+		return 0;
+
+	memset(&arg, 0, sizeof(arg));
+
+	j64 = get_jiffies_64();
+	if (j64 >= interval)
+		arg.since = j64 - interval;
+	else
+		arg.since = 0;
+
+	ata_ering_map(&dev->ering, speed_down_needed_cb, &arg);
+
+	return arg.nr_errors[err_cat] > err_limits[err_cat];
+}
+
+/**
+ *	ata_eh_speed_down - record error and speed down if necessary
+ *	@dev: Failed device
+ *	@is_io: Did the device fail during normal IO?
+ *	@err_mask: err_mask of the error
+ *
+ *	Record error and examine error history to determine whether
+ *	adjusting transmission speed is necessary.  It also sets
+ *	transmission limits appropriately if such adjustment is
+ *	necessary.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	0 on success, -errno otherwise
+ */
+static int ata_eh_speed_down(struct ata_device *dev, int is_io,
+			     unsigned int err_mask)
+{
+	if (!err_mask)
+		return 0;
+
+	/* record error and determine whether speed down is necessary */
+	ata_ering_record(&dev->ering, is_io, err_mask);
+
+	if (!ata_eh_speed_down_needed(dev))
+		return 0;
+
+	/* speed down SATA link speed if possible */
+	if (ata_down_sata_spd_limit(dev->ap) == 0)
+		return ATA_EH_HARDRESET;
+
+	/* lower transfer mode */
+	if (ata_down_xfermask_limit(dev, 0) == 0)
+		return ATA_EH_SOFTRESET;
+
+	ata_dev_printk(dev, KERN_ERR,
+		       "speed down requested but no transfer mode left\n");
+	return 0;
+}
+
+/**
+ *	ata_eh_autopsy - analyze error and determine recovery action
+ *	@ap: ATA port to perform autopsy on
+ *
+ *	Analyze why @ap failed and determine which recovery action is
+ *	needed.  This function also sets more detailed AC_ERR_* values
+ *	and fills sense data for ATAPI CHECK SENSE.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ */
+static void ata_eh_autopsy(struct ata_port *ap)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+	unsigned int action = ehc->i.action;
+	struct ata_device *failed_dev = NULL;
+	unsigned int all_err_mask = 0;
+	int tag, is_io = 0;
+	u32 serror;
+	int rc;
+
+	DPRINTK("ENTER\n");
+
+	/* obtain and analyze SError */
+	rc = ata_scr_read(ap, SCR_ERROR, &serror);
+	if (rc == 0) {
+		ehc->i.serror |= serror;
+		ata_eh_analyze_serror(ap);
+	} else if (rc != -EOPNOTSUPP)
+		action |= ATA_EH_HARDRESET;
+
+	/* any real error trumps AC_ERR_OTHER */
+	if (ehc->i.err_mask & ~AC_ERR_OTHER)
+		ehc->i.err_mask &= ~AC_ERR_OTHER;
+
+	all_err_mask |= ehc->i.err_mask;
+
+	for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
+		struct ata_queued_cmd *qc = __ata_qc_from_tag(ap, tag);
+
+		if (!(qc->flags & ATA_QCFLAG_FAILED))
+			continue;
+
+		/* inherit upper level err_mask */
+		qc->err_mask |= ehc->i.err_mask;
+
+		if (qc->err_mask & AC_ERR_TIMEOUT)
+			action |= ATA_EH_SOFTRESET;
+
+		/* analyze TF */
+		action |= ata_eh_analyze_tf(qc, &qc->result_tf);
+
+		/* DEV errors are probably spurious in case of ATA_BUS error */
+		if (qc->err_mask & AC_ERR_ATA_BUS)
+			qc->err_mask &= ~(AC_ERR_DEV | AC_ERR_MEDIA |
+					  AC_ERR_INVALID);
+
+		/* any real error trumps unknown error */
+		if (qc->err_mask & ~AC_ERR_OTHER)
+			qc->err_mask &= ~AC_ERR_OTHER;
+
+		/* SENSE_VALID trumps dev/unknown error and revalidation */
+		if (qc->flags & ATA_QCFLAG_SENSE_VALID) {
+			qc->err_mask &= ~(AC_ERR_DEV | AC_ERR_OTHER);
+			action &= ~ATA_EH_REVALIDATE;
+		}
+
+		/* accumulate error info */
+		failed_dev = qc->dev;
+		all_err_mask |= qc->err_mask;
+		if (qc->flags & ATA_QCFLAG_IO)
+			is_io = 1;
+	}
+
+	/* speed down iff command was in progress */
+	if (failed_dev)
+		action |= ata_eh_speed_down(failed_dev, is_io, all_err_mask);
+
+	if (all_err_mask)
+		action |= ATA_EH_REVALIDATE;
+
+	ehc->i.dev = failed_dev;
+	ehc->i.action = action;
+
+	DPRINTK("EXIT\n");
+}
+
+/**
+ *	ata_eh_report - report error handling to user
+ *	@ap: ATA port EH is going on
+ *
+ *	Report EH to user.
+ *
+ *	LOCKING:
+ *	None.
+ */
+static void ata_eh_report(struct ata_port *ap)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+	const char *frozen, *desc;
+	int tag, nr_failed = 0;
+
+	desc = NULL;
+	if (ehc->i.desc[0] != '\0')
+		desc = ehc->i.desc;
+
+	for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
+		struct ata_queued_cmd *qc = __ata_qc_from_tag(ap, tag);
+
+		if (!(qc->flags & ATA_QCFLAG_FAILED))
+			continue;
+		if (qc->flags & ATA_QCFLAG_SENSE_VALID && !qc->err_mask)
+			continue;
+
+		nr_failed++;
+	}
+
+	if (!nr_failed && !ehc->i.err_mask)
+		return;
+
+	frozen = "";
+	if (ap->flags & ATA_FLAG_FROZEN)
+		frozen = " frozen";
+
+	if (ehc->i.dev) {
+		ata_dev_printk(ehc->i.dev, KERN_ERR,
+			       "exception Emask 0x%x SErr 0x%x action 0x%x%s\n",
+			       ehc->i.err_mask, ehc->i.serror, ehc->i.action,
+			       frozen);
+		if (desc)
+			ata_dev_printk(ehc->i.dev, KERN_ERR, "(%s)\n", desc);
+	} else {
+		ata_port_printk(ap, KERN_ERR,
+				"exception Emask 0x%x SErr 0x%x action 0x%x%s\n",
+				ehc->i.err_mask, ehc->i.serror, ehc->i.action,
+				frozen);
+		if (desc)
+			ata_port_printk(ap, KERN_ERR, "(%s)\n", desc);
+	}
+
+	for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
+		struct ata_queued_cmd *qc = __ata_qc_from_tag(ap, tag);
+
+		if (!(qc->flags & ATA_QCFLAG_FAILED) || !qc->err_mask)
+			continue;
+
+		ata_dev_printk(qc->dev, KERN_ERR, "tag %d cmd 0x%x "
+			       "Emask 0x%x stat 0x%x err 0x%x (%s)\n",
+			       qc->tag, qc->tf.command, qc->err_mask,
+			       qc->result_tf.command, qc->result_tf.feature,
+			       ata_err_string(qc->err_mask));
+	}
+}
+
+static int ata_eh_reset(struct ata_port *ap, ata_reset_fn_t softreset,
+			ata_reset_fn_t hardreset, ata_postreset_fn_t postreset)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+	unsigned int classes[ATA_MAX_DEVICES];
+	int tries = ATA_EH_RESET_TRIES;
+	ata_reset_fn_t reset;
+	int rc;
+
+	if (softreset && (!hardreset || (!ata_set_sata_spd_needed(ap) &&
+					 !(ehc->i.action & ATA_EH_HARDRESET))))
+		reset = softreset;
+	else
+		reset = hardreset;
+
+ retry:
+	ata_port_printk(ap, KERN_INFO, "%s resetting port\n",
+			reset == softreset ? "soft" : "hard");
+
+	/* reset */
+	ata_eh_about_to_do(ap, ATA_EH_RESET_MASK);
+	ehc->flags |= ATA_EHC_DID_RESET;
+
+	rc = ata_do_reset(ap, reset, classes);
+
+	if (rc && --tries) {
+		ata_port_printk(ap, KERN_WARNING,
+				"%sreset failed, retrying in 5 secs\n",
+				reset == softreset ? "soft" : "hard");
+		ssleep(5);
+
+		if (reset == hardreset)
+			ata_down_sata_spd_limit(ap);
+		if (hardreset)
+			reset = hardreset;
+		goto retry;
+	}
+
+	if (rc == 0) {
+		if (postreset)
+			postreset(ap, classes);
+
+		/* reset successful, schedule revalidation */
+		ehc->i.dev = NULL;
+		ehc->i.action &= ~ATA_EH_RESET_MASK;
+		ehc->i.action |= ATA_EH_REVALIDATE;
+	}
+
+	return rc;
+}
+
+static int ata_eh_revalidate(struct ata_port *ap,
+			     struct ata_device **r_failed_dev)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+	struct ata_device *dev;
+	int i, rc = 0;
+
+	DPRINTK("ENTER\n");
+
+	for (i = 0; i < ATA_MAX_DEVICES; i++) {
+		dev = &ap->device[i];
+
+		if (ehc->i.action & ATA_EH_REVALIDATE && ata_dev_enabled(dev) &&
+		    (!ehc->i.dev || ehc->i.dev == dev)) {
+			if (ata_port_offline(ap)) {
+				rc = -EIO;
+				break;
+			}
+
+			ata_eh_about_to_do(ap, ATA_EH_REVALIDATE);
+			rc = ata_dev_revalidate(dev,
+						ehc->flags & ATA_EHC_DID_RESET);
+			if (rc)
+				break;
+
+			ehc->i.action &= ~ATA_EH_REVALIDATE;
+		}
+	}
+
+	if (rc)
+		*r_failed_dev = dev;
+
+	DPRINTK("EXIT\n");
+	return rc;
+}
+
+static int ata_port_nr_enabled(struct ata_port *ap)
+{
+	int i, cnt = 0;
+
+	for (i = 0; i < ATA_MAX_DEVICES; i++)
+		if (ata_dev_enabled(&ap->device[i]))
+			cnt++;
+	return cnt;
+}
+
+/**
+ *	ata_eh_recover - recover host port after error
+ *	@ap: host port to recover
+ *	@softreset: softreset method (can be NULL)
+ *	@hardreset: hardreset method (can be NULL)
+ *	@postreset: postreset method (can be NULL)
+ *
+ *	This is the alpha and omega, eum and yang, heart and soul of
+ *	libata exception handling.  On entry, actions required to
+ *	recover each devices are recorded in eh_context.  This
+ *	function executes all the operations with appropriate retrials
+ *	and fallbacks to resurrect failed devices.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ *
+ *	RETURNS:
+ *	0 on success, -errno on failure.
+ */
+static int ata_eh_recover(struct ata_port *ap, ata_reset_fn_t softreset,
+			  ata_reset_fn_t hardreset,
+			  ata_postreset_fn_t postreset)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+	struct ata_device *dev;
+	int down_xfermask, i, rc;
+
+	DPRINTK("ENTER\n");
+
+	/* prep for recovery */
+	for (i = 0; i < ATA_MAX_DEVICES; i++) {
+		dev = &ap->device[i];
+
+		ehc->tries[dev->devno] = ATA_EH_DEV_TRIES;
+	}
+
+ retry:
+	down_xfermask = 0;
+	rc = 0;
+
+	/* skip EH if possible. */
+	if (!ata_port_nr_enabled(ap) && !(ap->flags & ATA_FLAG_FROZEN))
+		ehc->i.action = 0;
+
+	/* reset */
+	if (ehc->i.action & ATA_EH_RESET_MASK) {
+		ata_eh_freeze_port(ap);
+
+		rc = ata_eh_reset(ap, softreset, hardreset, postreset);
+		if (rc) {
+			ata_port_printk(ap, KERN_ERR,
+					"reset failed, giving up\n");
+			goto out;
+		}
+
+		ata_eh_thaw_port(ap);
+	}
+
+	/* revalidate existing devices */
+	rc = ata_eh_revalidate(ap, &dev);
+	if (rc)
+		goto dev_fail;
+
+	/* configure transfer mode if the port has been reset */
+	if (ehc->flags & ATA_EHC_DID_RESET) {
+		rc = ata_set_mode(ap, &dev);
+		if (rc) {
+			down_xfermask = 1;
+			goto dev_fail;
+		}
+	}
+
+	goto out;
+
+ dev_fail:
+	switch (rc) {
+	case -ENODEV:
+	case -EINVAL:
+		ehc->tries[dev->devno] = 0;
+		break;
+	case -EIO:
+		ata_down_sata_spd_limit(ap);
+	default:
+		ehc->tries[dev->devno]--;
+		if (down_xfermask &&
+		    ata_down_xfermask_limit(dev, ehc->tries[dev->devno] == 1))
+			ehc->tries[dev->devno] = 0;
+	}
+
+	/* disable device if it has used up all its chances */
+	if (ata_dev_enabled(dev) && !ehc->tries[dev->devno])
+		ata_dev_disable(dev);
+
+	/* soft didn't work?  be haaaaard */
+	if (ehc->flags & ATA_EHC_DID_RESET)
+		ehc->i.action |= ATA_EH_HARDRESET;
+	else
+		ehc->i.action |= ATA_EH_SOFTRESET;
+
+	if (ata_port_nr_enabled(ap)) {
+		ata_port_printk(ap, KERN_WARNING, "failed to recover some "
+				"devices, retrying in 5 secs\n");
+		ssleep(5);
+	} else {
+		/* no device left, repeat fast */
+		msleep(500);
+	}
+
+	goto retry;
+
+ out:
+	if (rc) {
+		for (i = 0; i < ATA_MAX_DEVICES; i++)
+			ata_dev_disable(&ap->device[i]);
+	}
+
+	DPRINTK("EXIT, rc=%d\n", rc);
+	return rc;
+}
+
+/**
+ *	ata_eh_finish - finish up EH
+ *	@ap: host port to finish EH for
+ *
+ *	Recovery is complete.  Clean up EH states and retry or finish
+ *	failed qcs.
+ *
+ *	LOCKING:
+ *	None.
+ */
+static void ata_eh_finish(struct ata_port *ap)
+{
+	int tag;
+
+	/* retry or finish qcs */
+	for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
+		struct ata_queued_cmd *qc = __ata_qc_from_tag(ap, tag);
+
+		if (!(qc->flags & ATA_QCFLAG_FAILED))
+			continue;
+
+		if (qc->err_mask) {
+			/* FIXME: Once EH migration is complete,
+			 * generate sense data in this function,
+			 * considering both err_mask and tf.
+			 */
+			if (qc->err_mask & AC_ERR_INVALID)
+				ata_eh_qc_complete(qc);
+			else
+				ata_eh_qc_retry(qc);
+		} else {
+			if (qc->flags & ATA_QCFLAG_SENSE_VALID) {
+				ata_eh_qc_complete(qc);
+			} else {
+				/* feed zero TF to sense generation */
+				memset(&qc->result_tf, 0, sizeof(qc->result_tf));
+				ata_eh_qc_retry(qc);
+			}
+		}
+	}
+}
+
+/**
+ *	ata_do_eh - do standard error handling
+ *	@ap: host port to handle error for
+ *	@softreset: softreset method (can be NULL)
+ *	@hardreset: hardreset method (can be NULL)
+ *	@postreset: postreset method (can be NULL)
+ *
+ *	Perform standard error handling sequence.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep).
+ */
+void ata_do_eh(struct ata_port *ap, ata_reset_fn_t softreset,
+	       ata_reset_fn_t hardreset, ata_postreset_fn_t postreset)
+{
+	ata_eh_autopsy(ap);
+	ata_eh_report(ap);
+	ata_eh_recover(ap, softreset, hardreset, postreset);
+	ata_eh_finish(ap);
+}
diff --git a/include/linux/libata.h b/include/linux/libata.h
index c5c5f18..867f1d5 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -246,6 +246,8 @@ enum {
 
 	/* how hard are we gonna try to probe/recover devices */
 	ATA_PROBE_MAX_TRIES	= 3,
+	ATA_EH_RESET_TRIES	= 3,
+	ATA_EH_DEV_TRIES	= 3,
 };
 
 enum hsm_task_states {
@@ -730,6 +732,9 @@ extern void ata_eh_thaw_port(struct ata_
 extern void ata_eh_qc_complete(struct ata_queued_cmd *qc);
 extern void ata_eh_qc_retry(struct ata_queued_cmd *qc);
 
+extern void ata_do_eh(struct ata_port *ap, ata_reset_fn_t softreset,
+		      ata_reset_fn_t hardreset, ata_postreset_fn_t postreset);
+
 /*
  * printk helpers
  */
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 05/11] libata-eh: implement new EH
  2006-05-11 13:21 ` [PATCH 05/11] libata-eh: implement new EH Tejun Heo
@ 2006-05-13 22:19   ` Jeff Garzik
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:19 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Implement new EH.  The exported interface is ata_do_eh() which is to
> be called from ->error_handler and performs the following steps to
> recover the failed port.
> 
> ata_eh_autopsy() : analyze SError/TF, determine the cause of failure
> 		   and required recovery actions and record it in
> 		   ap->eh_context
> ata_eh_report()	 : report the failure to user
> ata_eh_recover() : perform recovery actions described in ap->eh_context
> ata_eh_finish()	 : finish failed qcs
> 
> LLDDs can customize error handling by modifying eh_context before
> calling ata_do_eh() or, if necessary, doing so inbetween each major
> steps by calling each step explicitly.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>

ACK patches 3-5



^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 04/11] libata-eh: implement ata_eh_info and ata_eh_context
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
  2006-05-11 13:21 ` [PATCH 05/11] libata-eh: implement new EH Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-11 13:21 ` [PATCH 01/11] libata-eh: add ATA and libata flags for new EH Tejun Heo
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

struct ata_eh_info serves as the communication channel between
execution path and EH.  Execution path describes detected error
condition in ap->eh_info and EH recovers the port using it.  To avoid
missing error conditions detected during EH, EH makes its own copy of
eh_info and clears it on entry allowing error info to accumulate
during EH.

Most EH states including EH's copy of eh_info are stored in
ap->eh_context (struct ata_eh_context) which is owned by EH and thus
doesn't require any synchronization to access and alter.  This
standardized context makes it easy to integrate various parts of EH
and extend EH to handle multiple links (for PM).

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/libata-eh.c |   11 ++++++++++-
 include/linux/libata.h   |   40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 1 deletions(-)

819953b0ab81f1b93f355863b23e85d9a74cd334
diff --git a/drivers/scsi/libata-eh.c b/drivers/scsi/libata-eh.c
index f6f0557..5a76e22 100644
--- a/drivers/scsi/libata-eh.c
+++ b/drivers/scsi/libata-eh.c
@@ -235,9 +235,15 @@ void ata_scsi_error(struct Scsi_Host *ho
  repeat:
 	/* invoke error handler */
 	if (ap->ops->error_handler) {
-		/* clear EH pending */
+		/* fetch & clear EH info */
 		spin_lock_irqsave(hs_lock, flags);
+
+		memset(&ap->eh_context, 0, sizeof(ap->eh_context));
+		ap->eh_context.i = ap->eh_info;
+		memset(&ap->eh_info, 0, sizeof(ap->eh_info));
+
 		ap->flags &= ~ATA_FLAG_EH_PENDING;
+
 		spin_unlock_irqrestore(hs_lock, flags);
 
 		/* invoke EH */
@@ -261,6 +267,9 @@ void ata_scsi_error(struct Scsi_Host *ho
 					"tries, giving up\n", ATA_EH_MAX_REPEAT);
 		}
 
+		/* this run is complete, make sure EH info is clear */
+		memset(&ap->eh_info, 0, sizeof(ap->eh_info));
+
 		/* Clear host_eh_scheduled while holding hs_lock such
 		 * that if exception occurs after this point but
 		 * before EH completion, SCSI midlayer will
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 1a4df14..c5c5f18 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -228,6 +228,9 @@ enum {
 	ATA_PORT_PRIMARY	= (1 << 0),
 	ATA_PORT_SECONDARY	= (1 << 1),
 
+	/* desc_len for ata_eh_info and context */
+	ATA_EH_DESC_LEN		= 80,
+
 	/* reset / recovery action types */
 	ATA_EH_REVALIDATE	= (1 << 0),
 	ATA_EH_SOFTRESET	= (1 << 1),
@@ -235,6 +238,9 @@ enum {
 
 	ATA_EH_RESET_MASK	= ATA_EH_SOFTRESET | ATA_EH_HARDRESET,
 
+	/* ata_eh_context->flags */
+	ATA_EHC_DID_RESET	= (1 << 0), /* already reset this port */
+
 	/* max repeat if error condition is still set after ->error_handler */
 	ATA_EH_MAX_REPEAT	= 5,
 
@@ -423,6 +429,21 @@ struct ata_device {
 	DEFINE_ATA_ERING	(ering, ATA_DEV_ERING_SIZE);
 };
 
+struct ata_eh_info {
+	struct ata_device	*dev;		/* offending device */
+	u32			serror;		/* SError from LLDD */
+	unsigned int		err_mask;	/* port-wide err_mask */
+	unsigned int		action;		/* ATA_EH_* action mask */
+	char			desc[ATA_EH_DESC_LEN];
+	int			desc_len;
+};
+
+struct ata_eh_context {
+	struct ata_eh_info	i;
+	int			tries[ATA_MAX_DEVICES];
+	unsigned int		flags;		/* ATA_EHC_* */
+};
+
 struct ata_port {
 	struct Scsi_Host	*host;	/* our co-allocated scsi host */
 	const struct ata_port_operations *ops;
@@ -447,6 +468,11 @@ struct ata_port {
 	unsigned int		cbl;	/* cable type; ATA_CBL_xxx */
 	unsigned int		sata_spd_limit;	/* SATA PHY speed limit */
 
+	/* record runtime error info, protected by host_set lock */
+	struct ata_eh_info	eh_info;
+	/* EH context owned by EH */
+	struct ata_eh_context	eh_context;
+
 	struct ata_device	device[ATA_MAX_DEVICES];
 
 	struct ata_queued_cmd	qcmd[ATA_MAX_QUEUE];
@@ -714,6 +740,20 @@ extern void ata_eh_qc_retry(struct ata_q
 	printk(lv"ata%u.%02u: "fmt, (dev)->ap->id, (dev)->devno , ##args)
 
 /*
+ * ata_eh_info helpers
+ */
+#define ata_ehi_push_desc(ehi, fmt, args...) do { \
+	(ehi)->desc_len += scnprintf((ehi)->desc + (ehi)->desc_len, \
+				     ATA_EH_DESC_LEN - (ehi)->desc_len, \
+				     fmt , ##args); \
+} while (0)
+
+#define ata_ehi_clear_desc(ehi) do { \
+	(ehi)->desc[0] = '\0'; \
+	(ehi)->desc_len = 0; \
+} while (0)
+
+/*
  * qc helpers
  */
 static inline int
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 01/11] libata-eh: add ATA and libata flags for new EH
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
  2006-05-11 13:21 ` [PATCH 05/11] libata-eh: implement new EH Tejun Heo
  2006-05-11 13:21 ` [PATCH 04/11] libata-eh: implement ata_eh_info and ata_eh_context Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-13 22:15   ` Jeff Garzik
  2006-05-11 13:21 ` [PATCH 02/11] libata-eh: implement ata_ering Tejun Heo
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

Add ATA and libata flags to be used by new EH.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 include/linux/ata.h    |   13 +++++++++++++
 include/linux/libata.h |    8 ++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

f39425fcb9c479e87113475fd214728caebe0c78
diff --git a/include/linux/ata.h b/include/linux/ata.h
index 312a2c0..a7c41f3 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -97,6 +97,9 @@ enum {
 	ATA_DRQ			= (1 << 3),	/* data request i/o */
 	ATA_ERR			= (1 << 0),	/* have an error */
 	ATA_SRST		= (1 << 2),	/* software reset */
+	ATA_ICRC		= (1 << 7),	/* interface CRC error */
+	ATA_UNC			= (1 << 6),	/* uncorrectable media error */
+	ATA_IDNF		= (1 << 4),	/* ID not found */
 	ATA_ABORTED		= (1 << 2),	/* command aborted */
 
 	/* ATA command block registers */
@@ -192,6 +195,16 @@ enum {
 	SCR_ACTIVE		= 3,
 	SCR_NOTIFICATION	= 4,
 
+	/* SError bits */
+	SERR_DATA_RECOVERED	= (1 << 0), /* recovered data error */
+	SERR_COMM_RECOVERED	= (1 << 1), /* recovered comm failure */
+	SERR_DATA		= (1 << 8), /* unrecovered data error */
+	SERR_PERSISTENT		= (1 << 9), /* persistent data/comm error */
+	SERR_PROTOCOL		= (1 << 10), /* protocol violation */
+	SERR_INTERNAL		= (1 << 11), /* host internal error */
+	SERR_PHYRDY_CHG		= (1 << 16), /* PHY RDY changed */
+	SERR_DEV_XCHG		= (1 << 26), /* device exchanged */
+
 	/* struct ata_taskfile flags */
 	ATA_TFLAG_LBA48		= (1 << 0), /* enable 48-bit LBA and "HOB" */
 	ATA_TFLAG_ISADDR	= (1 << 1), /* enable r/w to nsect/lba regs */
diff --git a/include/linux/libata.h b/include/linux/libata.h
index e4353cf..80c8f0e 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -155,6 +155,7 @@ enum {
 
 	ATA_FLAG_EH_PENDING	= (1 << 16), /* EH pending */
 	ATA_FLAG_FROZEN		= (1 << 17), /* port is frozen */
+	ATA_FLAG_RECOVERED	= (1 << 18), /* recovery action performed */
 
 	ATA_FLAG_DISABLED	= (1 << 22), /* port is disabled, ignore it */
 	ATA_FLAG_SUSPENDED	= (1 << 23), /* port is suspended (power) */
@@ -225,6 +226,13 @@ enum {
 	ATA_PORT_PRIMARY	= (1 << 0),
 	ATA_PORT_SECONDARY	= (1 << 1),
 
+	/* reset / recovery action types */
+	ATA_EH_REVALIDATE	= (1 << 0),
+	ATA_EH_SOFTRESET	= (1 << 1),
+	ATA_EH_HARDRESET	= (1 << 2),
+
+	ATA_EH_RESET_MASK	= ATA_EH_SOFTRESET | ATA_EH_HARDRESET,
+
 	/* max repeat if error condition is still set after ->error_handler */
 	ATA_EH_MAX_REPEAT	= 5,
 
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/11] libata-eh: add ATA and libata flags for new EH
  2006-05-11 13:21 ` [PATCH 01/11] libata-eh: add ATA and libata flags for new EH Tejun Heo
@ 2006-05-13 22:15   ` Jeff Garzik
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:15 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Add ATA and libata flags to be used by new EH.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>

ACK



^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (2 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 01/11] libata-eh: add ATA and libata flags for new EH Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-13 22:16   ` Jeff Garzik
  2006-05-11 13:21 ` [PATCH 03/11] libata-eh: add per-dev ata_ering Tejun Heo
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

ata_ering is a ring buffer which records libata errors - whether a
command was for normar IO request, err_mask and timestamp.  This will
be used by EH to determine recovery actions.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/libata-eh.c |   51 ++++++++++++++++++++++++++++++++++++++++++++++
 drivers/scsi/libata.h    |    2 ++
 include/linux/libata.h   |   16 ++++++++++++++
 3 files changed, 69 insertions(+), 0 deletions(-)

4a10a7d433551195a47e1bf77014fdceea66de8b
diff --git a/drivers/scsi/libata-eh.c b/drivers/scsi/libata-eh.c
index 0803231..f6f0557 100644
--- a/drivers/scsi/libata-eh.c
+++ b/drivers/scsi/libata-eh.c
@@ -46,6 +46,57 @@
 
 static void __ata_port_freeze(struct ata_port *ap);
 
+void ata_ering_init(struct ata_ering *ering, int size)
+{
+	memset(ering, 0, sizeof(*ering) + sizeof(ering->ring[0]) * size);
+	ering->size = size;
+}
+
+static void ata_ering_record(struct ata_ering *ering, int is_io,
+			     unsigned int err_mask)
+{
+	struct ata_ering_entry *ent;
+
+	WARN_ON(!err_mask);
+
+	ering->cursor++;
+	ering->cursor %= ering->size;
+
+	ent = &ering->ring[ering->cursor];
+	ent->is_io = is_io;
+	ent->err_mask = err_mask;
+	ent->timestamp = get_jiffies_64();
+}
+
+static struct ata_ering_entry * ata_ering_top(struct ata_ering *ering)
+{
+	struct ata_ering_entry *ent = &ering->ring[ering->cursor];
+	if (!ent->err_mask)
+		return NULL;
+	return ent;
+}
+
+static int ata_ering_map(struct ata_ering *ering,
+			 int (*map_fn)(struct ata_ering_entry *, void *),
+			 void *arg)
+{
+	int idx, rc = 0;
+	struct ata_ering_entry *ent;
+
+	idx = ering->cursor;
+	do {
+		ent = &ering->ring[idx];
+		if (!ent->err_mask)
+			break;
+		rc = map_fn(ent, arg);
+		if (rc)
+			break;
+		idx = (idx - 1 + ering->size) % ering->size;
+	} while (idx != ering->cursor);
+
+	return rc;
+}
+
 /**
  *	ata_scsi_timed_out - SCSI layer time out callback
  *	@cmd: timed out SCSI command
diff --git a/drivers/scsi/libata.h b/drivers/scsi/libata.h
index e2bbddf..134cb4d 100644
--- a/drivers/scsi/libata.h
+++ b/drivers/scsi/libata.h
@@ -62,6 +62,7 @@ extern int ata_check_atapi_dma(struct at
 extern void ata_dev_select(struct ata_port *ap, unsigned int device,
                            unsigned int wait, unsigned int can_sleep);
 extern void swap_buf_le16(u16 *buf, unsigned int buf_words);
+extern void ata_dev_init(struct ata_device *dev);
 extern int ata_task_ioctl(struct scsi_device *scsidev, void __user *arg);
 extern int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg);
 
@@ -101,6 +102,7 @@ extern void ata_scsi_rbuf_fill(struct at
 extern void ata_schedule_scsi_eh(struct Scsi_Host *shost);
 
 /* libata-eh.c */
+extern void ata_ering_init(struct ata_ering *ering, int size);
 extern enum scsi_eh_timer_return ata_scsi_timed_out(struct scsi_cmnd *cmd);
 extern void ata_scsi_error(struct Scsi_Host *host);
 extern void ata_qc_schedule_eh(struct ata_queued_cmd *qc);
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 80c8f0e..ad16d6b 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -375,6 +375,22 @@ struct ata_host_stats {
 	unsigned long		rw_reqbuf;
 };
 
+struct ata_ering_entry {
+	int			is_io;
+	unsigned int		err_mask;
+	u64			timestamp;
+};
+
+struct ata_ering {
+	int			cursor;
+	int			size;
+	struct ata_ering_entry	ring[];
+};
+
+#define DEFINE_ATA_ERING(name, size)	\
+	struct ata_ering	name;	\
+	struct ata_ering_entry	name_entries[size];
+
 struct ata_device {
 	struct ata_port		*ap;
 	u64			n_sectors;	/* size of device, if ATA */
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-11 13:21 ` [PATCH 02/11] libata-eh: implement ata_ering Tejun Heo
@ 2006-05-13 22:16   ` Jeff Garzik
  2006-05-13 23:36     ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:16 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> +struct ata_ering {
> +	int			cursor;
> +	int			size;
> +	struct ata_ering_entry	ring[];
> +};
> +
> +#define DEFINE_ATA_ERING(name, size)	\
> +	struct ata_ering	name;	\
> +	struct ata_ering_entry	name_entries[size];

ACK, but this is creeping dangerously close to C abuse :)

This sort of code will confuse debuggers and source checkers.

	Jeff



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-13 22:16   ` Jeff Garzik
@ 2006-05-13 23:36     ` Tejun Heo
  2006-05-14  1:05       ` Jeff Garzik
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-13 23:36 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Jeff Garzik wrote:
> Tejun Heo wrote:
>> +struct ata_ering {
>> +    int            cursor;
>> +    int            size;
>> +    struct ata_ering_entry    ring[];
>> +};
>> +
>> +#define DEFINE_ATA_ERING(name, size)    \
>> +    struct ata_ering    name;    \
>> +    struct ata_ering_entry    name_entries[size];
> 
> ACK, but this is creeping dangerously close to C abuse :)
> 
> This sort of code will confuse debuggers and source checkers.
> 

Well, as ering is currently used in only one place and it will stay in 
libata in the future, it might be better to remove the macro and shove 
the allocation into where it's used; however, I prefer the macro because 
it's safer and I don't use any debuggers or source checkers.

Linux source has lots of cpp macros like above and IMHO the above 
doesn't even rank among C abuses.  lxr and sparse would be fine.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-13 23:36     ` Tejun Heo
@ 2006-05-14  1:05       ` Jeff Garzik
  2006-05-14  1:20         ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff Garzik @ 2006-05-14  1:05 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Jeff Garzik wrote:
>> Tejun Heo wrote:
>>> +struct ata_ering {
>>> +    int            cursor;
>>> +    int            size;
>>> +    struct ata_ering_entry    ring[];
>>> +};
>>> +
>>> +#define DEFINE_ATA_ERING(name, size)    \
>>> +    struct ata_ering    name;    \
>>> +    struct ata_ering_entry    name_entries[size];
>>
>> ACK, but this is creeping dangerously close to C abuse :)
>>
>> This sort of code will confuse debuggers and source checkers.
>>
> 
> Well, as ering is currently used in only one place and it will stay in 
> libata in the future, it might be better to remove the macro and shove 
> the allocation into where it's used; however, I prefer the macro because 
> it's safer and I don't use any debuggers or source checkers.
> 
> Linux source has lots of cpp macros like above and IMHO the above 
> doesn't even rank among C abuses.  lxr and sparse would be fine.

I doubt you'll find another example of this at all.  The reason why its 
abuse is that the declaration and use are subtlely different.  In the 
above, the C compiler is free to insert padding between 'name' and 
'name_entries'.

Other counterpoints:

* Its important to support other people's use of debuggers and source 
checkers.

* I disagree that the macro is safer, simply because we are having this 
conversation :)  An allocation is easier to understand and less prone to 
subtle breakage.

This is the trap of the 0-element array:  its handy for avoiding an 
additional allocation, but if you look all over the kernel at its real 
world uses (including SCSI and libata), you'll see a hodgepodge of ugly 
casting, varied approaches, and yet similar bug patterns.

	Jeff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-14  1:05       ` Jeff Garzik
@ 2006-05-14  1:20         ` Tejun Heo
  2006-05-14  1:32           ` Jeff Garzik
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-14  1:20 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Jeff Garzik wrote:
> Tejun Heo wrote:
>> Jeff Garzik wrote:
>>> Tejun Heo wrote:
>>>> +struct ata_ering {
>>>> +    int            cursor;
>>>> +    int            size;
>>>> +    struct ata_ering_entry    ring[];
>>>> +};
>>>> +
>>>> +#define DEFINE_ATA_ERING(name, size)    \
>>>> +    struct ata_ering    name;    \
>>>> +    struct ata_ering_entry    name_entries[size];
>>>
>>> ACK, but this is creeping dangerously close to C abuse :)
>>>
>>> This sort of code will confuse debuggers and source checkers.
>>>
>>
>> Well, as ering is currently used in only one place and it will stay in 
>> libata in the future, it might be better to remove the macro and shove 
>> the allocation into where it's used; however, I prefer the macro 
>> because it's safer and I don't use any debuggers or source checkers.
>>
>> Linux source has lots of cpp macros like above and IMHO the above 
>> doesn't even rank among C abuses.  lxr and sparse would be fine.
> 
> I doubt you'll find another example of this at all.  The reason why its 
> abuse is that the declaration and use are subtlely different.  In the 
> above, the C compiler is free to insert padding between 'name' and 
> 'name_entries'.
> 
> Other counterpoints:
> 
> * Its important to support other people's use of debuggers and source 
> checkers.

Agreed.

> * I disagree that the macro is safer, simply because we are having this 
> conversation :)  An allocation is easier to understand and less prone to 
> subtle breakage.
>
> This is the trap of the 0-element array:  its handy for avoiding an 
> additional allocation, but if you look all over the kernel at its real 
> world uses (including SCSI and libata), you'll see a hodgepodge of ugly 
> casting, varied approaches, and yet similar bug patterns.

With your comment about padding above, I'm a bit confused.  No matter 
whether we put the name_entries in the macro or in struct ata_device, 
the compiler is free to insert padding inbetween.  That's how 
zero/flexible-length array is supposed to be used.  The only thing 
guaranteed is that there will be enough space to store the array 
members.  Accessing the storage area directly will result in disaster. 
I consider it one another convention to learn about the great C.

The alternatives here are...

* leave it as it is
* shove ering_entry allocation into ering (fixed size is ATM)
* shove ering_entry allocation into ata_device

I'll take any of above.  It's your call.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-14  1:20         ` Tejun Heo
@ 2006-05-14  1:32           ` Jeff Garzik
  2006-05-14  1:38             ` Tejun Heo
  2006-05-15 13:36             ` Alan Cox
  0 siblings, 2 replies; 48+ messages in thread
From: Jeff Garzik @ 2006-05-14  1:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Jeff Garzik wrote:
>> Tejun Heo wrote:
>>> Jeff Garzik wrote:
>>>> Tejun Heo wrote:
>>>>> +struct ata_ering {
>>>>> +    int            cursor;
>>>>> +    int            size;
>>>>> +    struct ata_ering_entry    ring[];
>>>>> +};
>>>>> +
>>>>> +#define DEFINE_ATA_ERING(name, size)    \
>>>>> +    struct ata_ering    name;    \
>>>>> +    struct ata_ering_entry    name_entries[size];
>>>>
>>>> ACK, but this is creeping dangerously close to C abuse :)
>>>>
>>>> This sort of code will confuse debuggers and source checkers.
>>>>
>>>
>>> Well, as ering is currently used in only one place and it will stay 
>>> in libata in the future, it might be better to remove the macro and 
>>> shove the allocation into where it's used; however, I prefer the 
>>> macro because it's safer and I don't use any debuggers or source 
>>> checkers.
>>>
>>> Linux source has lots of cpp macros like above and IMHO the above 
>>> doesn't even rank among C abuses.  lxr and sparse would be fine.
>>
>> I doubt you'll find another example of this at all.  The reason why 
>> its abuse is that the declaration and use are subtlely different.  In 
>> the above, the C compiler is free to insert padding between 'name' and 
>> 'name_entries'.
>>
>> Other counterpoints:
>>
>> * Its important to support other people's use of debuggers and source 
>> checkers.
> 
> Agreed.
> 
>> * I disagree that the macro is safer, simply because we are having 
>> this conversation :)  An allocation is easier to understand and less 
>> prone to subtle breakage.
>>
>> This is the trap of the 0-element array:  its handy for avoiding an 
>> additional allocation, but if you look all over the kernel at its real 
>> world uses (including SCSI and libata), you'll see a hodgepodge of 
>> ugly casting, varied approaches, and yet similar bug patterns.
> 
> With your comment about padding above, I'm a bit confused.  No matter 
> whether we put the name_entries in the macro or in struct ata_device, 
> the compiler is free to insert padding inbetween.  That's how 
> zero/flexible-length array is supposed to be used.  The only thing 
> guaranteed is that there will be enough space to store the array 
> members.  Accessing the storage area directly will result in disaster. I 
> consider it one another convention to learn about the great C.
> 
> The alternatives here are...
> 
> * leave it as it is
> * shove ering_entry allocation into ering (fixed size is ATM)
> * shove ering_entry allocation into ata_device

I would just declare that all erings have 32 entries, and revisit the 
issue if/when pain appears.  Hardcode 32, and make things easy.

	Jeff




^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-14  1:32           ` Jeff Garzik
@ 2006-05-14  1:38             ` Tejun Heo
  2006-05-15 13:36             ` Alan Cox
  1 sibling, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-14  1:38 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Jeff Garzik wrote:
> Tejun Heo wrote:
>> Jeff Garzik wrote:
>>> Tejun Heo wrote:
>>>> Jeff Garzik wrote:
>>>>> Tejun Heo wrote:
>>>>>> +struct ata_ering {
>>>>>> +    int            cursor;
>>>>>> +    int            size;
>>>>>> +    struct ata_ering_entry    ring[];
>>>>>> +};
>>>>>> +
>>>>>> +#define DEFINE_ATA_ERING(name, size)    \
>>>>>> +    struct ata_ering    name;    \
>>>>>> +    struct ata_ering_entry    name_entries[size];
>>>>>
>>>>> ACK, but this is creeping dangerously close to C abuse :)
>>>>>
>>>>> This sort of code will confuse debuggers and source checkers.
>>>>>
>>>>
>>>> Well, as ering is currently used in only one place and it will stay 
>>>> in libata in the future, it might be better to remove the macro and 
>>>> shove the allocation into where it's used; however, I prefer the 
>>>> macro because it's safer and I don't use any debuggers or source 
>>>> checkers.
>>>>
>>>> Linux source has lots of cpp macros like above and IMHO the above 
>>>> doesn't even rank among C abuses.  lxr and sparse would be fine.
>>>
>>> I doubt you'll find another example of this at all.  The reason why 
>>> its abuse is that the declaration and use are subtlely different.  In 
>>> the above, the C compiler is free to insert padding between 'name' 
>>> and 'name_entries'.
>>>
>>> Other counterpoints:
>>>
>>> * Its important to support other people's use of debuggers and source 
>>> checkers.
>>
>> Agreed.
>>
>>> * I disagree that the macro is safer, simply because we are having 
>>> this conversation :)  An allocation is easier to understand and less 
>>> prone to subtle breakage.
>>>
>>> This is the trap of the 0-element array:  its handy for avoiding an 
>>> additional allocation, but if you look all over the kernel at its 
>>> real world uses (including SCSI and libata), you'll see a hodgepodge 
>>> of ugly casting, varied approaches, and yet similar bug patterns.
>>
>> With your comment about padding above, I'm a bit confused.  No matter 
>> whether we put the name_entries in the macro or in struct ata_device, 
>> the compiler is free to insert padding inbetween.  That's how 
>> zero/flexible-length array is supposed to be used.  The only thing 
>> guaranteed is that there will be enough space to store the array 
>> members.  Accessing the storage area directly will result in disaster. 
>> I consider it one another convention to learn about the great C.
>>
>> The alternatives here are...
>>
>> * leave it as it is
>> * shove ering_entry allocation into ering (fixed size is ATM)
>> * shove ering_entry allocation into ata_device
> 
> I would just declare that all erings have 32 entries, and revisit the 
> issue if/when pain appears.  Hardcode 32, and make things easy.
> 

Okay, will do.  That's what I tried to say with alt#2 with my rubbish 
english.  :)

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-14  1:32           ` Jeff Garzik
  2006-05-14  1:38             ` Tejun Heo
@ 2006-05-15 13:36             ` Alan Cox
  2006-05-15 14:00               ` Tejun Heo
  1 sibling, 1 reply; 48+ messages in thread
From: Alan Cox @ 2006-05-15 13:36 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Tejun Heo, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Sad, 2006-05-13 at 21:32 -0400, Jeff Garzik wrote:
> I would just declare that all erings have 32 entries, and revisit the 
> issue if/when pain appears.  Hardcode 32, and make things easy.

Do we even need the ering. If you classify the error when it occurs it
seems all you need is a couple of 32bit words and to set bits and
shift ?


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-15 13:36             ` Alan Cox
@ 2006-05-15 14:00               ` Tejun Heo
  2006-05-15 14:25                 ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-15 14:00 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Alan Cox wrote:
> On Sad, 2006-05-13 at 21:32 -0400, Jeff Garzik wrote:
>> I would just declare that all erings have 32 entries, and revisit the 
>> issue if/when pain appears.  Hardcode 32, and make things easy.
> 
> Do we even need the ering. If you classify the error when it occurs it
> seems all you need is a couple of 32bit words and to set bits and
> shift ?
> 

Hmmm.. yeah, that sounds good.  Jeff, what do you think?

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-15 14:00               ` Tejun Heo
@ 2006-05-15 14:25                 ` Tejun Heo
  2006-05-15 14:50                   ` Alan Cox
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-15 14:25 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Alan Cox wrote:
>> On Sad, 2006-05-13 at 21:32 -0400, Jeff Garzik wrote:
>>> I would just declare that all erings have 32 entries, and revisit the 
>>> issue if/when pain appears.  Hardcode 32, and make things easy.
>> Do we even need the ering. If you classify the error when it occurs it
>> seems all you need is a couple of 32bit words and to set bits and
>> shift ?
>>
> 
> Hmmm.. yeah, that sounds good.  Jeff, what do you think?
> 

I was too quick to response.  ering keeps track of timestamps of each
error and the criteria for speeding down is something like "3 class 1
errors during last 15mins", so bitmaps doesn't cut it.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-15 14:25                 ` Tejun Heo
@ 2006-05-15 14:50                   ` Alan Cox
  2006-05-15 14:57                     ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Alan Cox @ 2006-05-15 14:50 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Llu, 2006-05-15 at 23:25 +0900, Tejun Heo wrote:
> I was too quick to response.  ering keeps track of timestamps of each
> error and the criteria for speeding down is something like "3 class 1
> errors during last 15mins", so bitmaps doesn't cut it.

Sounds like it does


	log_class_1_error(dev)
	{
		if((dev->class1 & 3) < 3) {
			dev->class1++;
			dev->class1sum++;
		} else {
			/* 3 this minute alone */
			it_happened_now_case();
		}
	}

	class_1_each_minute(dev)
	{
		dev->class1sum -= (dev->class1 >> 30);
		dev->class1 <<= 2;
	}




	

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-15 14:50                   ` Alan Cox
@ 2006-05-15 14:57                     ` Tejun Heo
  2006-05-15 15:19                       ` Alan Cox
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-15 14:57 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Alan Cox wrote:
> On Llu, 2006-05-15 at 23:25 +0900, Tejun Heo wrote:
>> I was too quick to response.  ering keeps track of timestamps of each
>> error and the criteria for speeding down is something like "3 class 1
>> errors during last 15mins", so bitmaps doesn't cut it.
> 
> Sounds like it does
> 
> 
> 	log_class_1_error(dev)
> 	{
> 		if((dev->class1 & 3) < 3) {
> 			dev->class1++;
> 			dev->class1sum++;
> 		} else {
> 			/* 3 this minute alone */
> 			it_happened_now_case();
> 		}
> 	}
> 
> 	class_1_each_minute(dev)
> 	{
> 		dev->class1sum -= (dev->class1 >> 30);
> 		dev->class1 <<= 2;
> 	}
> 

Yeap, of course, it does with periodic shifting.  The 3 in this minute
testing looks neat.

I don't know.  I'm still a bit hesitant because the thing becomes a
little bit more complex (it has more moving parts) and it would be
difficult to define/tune speed down or any other error handling (say,
turning off NCQ) criteria.  But, I might be planning too much.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-15 14:57                     ` Tejun Heo
@ 2006-05-15 15:19                       ` Alan Cox
  2006-05-15 15:19                         ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Alan Cox @ 2006-05-15 15:19 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

> little bit more complex (it has more moving parts) and it would be
> difficult to define/tune speed down or any other error handling (say,
> turning off NCQ) criteria.  But, I might be planning too much.

The log problem is that you can get 32 errors very fast (doesn't take
much with multiple outstanding commands) so you have to choose which
info to throw away - the history or the detail in the commands 



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-15 15:19                       ` Alan Cox
@ 2006-05-15 15:19                         ` Tejun Heo
  2006-05-15 18:22                           ` Jeff Garzik
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-15 15:19 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Alan Cox wrote:
>> little bit more complex (it has more moving parts) and it would be
>> difficult to define/tune speed down or any other error handling (say,
>> turning off NCQ) criteria.  But, I might be planning too much.
> 
> The log problem is that you can get 32 errors very fast (doesn't take
> much with multiple outstanding commands)

The current EH implementation records one error per EH run.  It just
combines all failed commands and record it as single entry.  As each EH
run usually takes at least several seconds, 32 slots should give several
minutes worth of history even if things are pretty bad.

> so you have to choose which
> info to throw away - the history or the detail in the commands 

Yeap.  And I also agree that ering is pretty inefficient way to spend
512 bytes of memory.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 02/11] libata-eh: implement ata_ering
  2006-05-15 15:19                         ` Tejun Heo
@ 2006-05-15 18:22                           ` Jeff Garzik
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff Garzik @ 2006-05-15 18:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Alan Cox, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Alan Cox wrote:
>>> little bit more complex (it has more moving parts) and it would be
>>> difficult to define/tune speed down or any other error handling (say,
>>> turning off NCQ) criteria.  But, I might be planning too much.
>> The log problem is that you can get 32 errors very fast (doesn't take
>> much with multiple outstanding commands)
> 
> The current EH implementation records one error per EH run.  It just
> combines all failed commands and record it as single entry.  As each EH
> run usually takes at least several seconds, 32 slots should give several
> minutes worth of history even if things are pretty bad.
> 
>> so you have to choose which
>> info to throw away - the history or the detail in the commands 
> 
> Yeap.  And I also agree that ering is pretty inefficient way to spend
> 512 bytes of memory.

Overall its not a huge issue.  As long as we aren't constantly
allocating and freeing that 512 bytes...

	Jeff




^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 03/11] libata-eh: add per-dev ata_ering
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (3 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 02/11] libata-eh: implement ata_ering Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-11 13:21 ` [PATCH 06/11] libata-eh: implement BMDMA EH Tejun Heo
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

EH is gonna record errors per-dev.  Define per-dev ata_ering.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/libata-core.c |    1 +
 include/linux/libata.h     |    4 ++++
 2 files changed, 5 insertions(+), 0 deletions(-)

17d6991533312fcce8cb213ddabd164af7e9f88e
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index a05be86..afb7796 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -4871,6 +4871,7 @@ static void ata_host_init(struct ata_por
 		dev->pio_mask = UINT_MAX;
 		dev->mwdma_mask = UINT_MAX;
 		dev->udma_mask = UINT_MAX;
+		ata_ering_init(&dev->ering, ATA_DEV_ERING_SIZE);
 	}
 
 #ifdef ATA_IRQ_TRAP
diff --git a/include/linux/libata.h b/include/linux/libata.h
index ad16d6b..1a4df14 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -135,6 +135,8 @@ enum {
 	ATA_DEV_ATAPI_UNSUP	= 4,	/* ATAPI device (unsupported) */
 	ATA_DEV_NONE		= 5,	/* no device */
 
+	ATA_DEV_ERING_SIZE	= 32,	/* record 32 recent errors */
+
 	/* struct ata_port flags */
 	ATA_FLAG_SLAVE_POSS	= (1 << 0), /* host supports slave dev */
 					    /* (doesn't imply presence) */
@@ -417,6 +419,8 @@ struct ata_device {
 	u16			cylinders;	/* Number of cylinders */
 	u16			heads;		/* Number of heads */
 	u16			sectors;	/* Number of sectors per track */
+
+	DEFINE_ATA_ERING	(ering, ATA_DEV_ERING_SIZE);
 };
 
 struct ata_port {
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 06/11] libata-eh: implement BMDMA EH
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (4 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 03/11] libata-eh: add per-dev ata_ering Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-13 22:21   ` Jeff Garzik
  2006-05-11 13:21 ` [PATCH 07/11] ata_piix: convert to new EH Tejun Heo
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

Implement stock BMDMA error handling methods.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/libata-bmdma.c |  144 +++++++++++++++++++++++++++++++++++++++++++
 drivers/scsi/libata-core.c  |    5 +
 include/linux/libata.h      |    8 ++
 3 files changed, 157 insertions(+), 0 deletions(-)

33c8c944f97629b555fc6a14ca49803c47f9d50b
diff --git a/drivers/scsi/libata-bmdma.c b/drivers/scsi/libata-bmdma.c
index 835dff0..751ec18 100644
--- a/drivers/scsi/libata-bmdma.c
+++ b/drivers/scsi/libata-bmdma.c
@@ -652,6 +652,150 @@ void ata_bmdma_stop(struct ata_queued_cm
 	ata_altstatus(ap);        /* dummy read */
 }
 
+/**
+ *	ata_bmdma_freeze - Freeze BMDMA controller port
+ *	@ap: port to freeze
+ *
+ *	Freeze BMDMA controller port.
+ *
+ *	LOCKING:
+ *	Inherited from caller.
+ */
+void ata_bmdma_freeze(struct ata_port *ap)
+{
+	struct ata_ioports *ioaddr = &ap->ioaddr;
+
+	ap->ctl |= ATA_NIEN;
+	ap->last_ctl = ap->ctl;
+
+	if (ap->flags & ATA_FLAG_MMIO)
+		writeb(ap->ctl, (void __iomem *)ioaddr->ctl_addr);
+	else
+		outb(ap->ctl, ioaddr->ctl_addr);
+}
+
+/**
+ *	ata_bmdma_thaw - Thaw BMDMA controller port
+ *	@ap: port to thaw
+ *
+ *	Thaw BMDMA controller port.
+ *
+ *	LOCKING:
+ *	Inherited from caller.
+ */
+void ata_bmdma_thaw(struct ata_port *ap)
+{
+	/* clear & re-enable interrupts */
+	ata_chk_status(ap);
+	ap->ops->irq_clear(ap);
+	if (ap->ioaddr.ctl_addr)	/* FIXME: hack. create a hook instead */
+		ata_irq_on(ap);
+}
+
+/**
+ *	ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller
+ *	@ap: port to handle error for
+ *	@softreset: softreset method (can be NULL)
+ *	@hardreset: hardreset method (can be NULL)
+ *	@postreset: postreset method (can be NULL)
+ *
+ *	Handle error for ATA BMDMA controller.  It can handle both
+ *	PATA and SATA controllers.  Many controllers should be able to
+ *	use this EH as-is or with some added handling before and
+ *	after.
+ *
+ *	This function is intended to be used for constructing
+ *	->error_handler callback by low level drivers.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep)
+ */
+void ata_bmdma_drive_eh(struct ata_port *ap, ata_reset_fn_t softreset,
+			ata_reset_fn_t hardreset, ata_postreset_fn_t postreset)
+{
+	struct ata_host_set *host_set = ap->host_set;
+	struct ata_eh_context *ehc = &ap->eh_context;
+	struct ata_queued_cmd *qc;
+	unsigned long flags;
+	int thaw = 0;
+
+	qc = __ata_qc_from_tag(ap, ap->active_tag);
+	if (qc && !(qc->flags & ATA_QCFLAG_FAILED))
+		qc = NULL;
+
+	/* reset PIO HSM and stop DMA engine */
+	spin_lock_irqsave(&host_set->lock, flags);
+
+	ap->flags &= ~ATA_FLAG_NOINTR;
+	ap->hsm_task_state = HSM_ST_IDLE;
+
+	if (qc && (qc->tf.protocol == ATA_PROT_DMA ||
+		   qc->tf.protocol == ATA_PROT_ATAPI_DMA)) {
+		u8 host_stat;
+
+		host_stat = ata_bmdma_status(ap);
+
+		ata_ehi_push_desc(&ehc->i, "BMDMA stat 0x%x", host_stat);
+
+		/* BMDMA controllers indicate host bus error by
+		 * setting DMA_ERR bit and timing out.  As it wasn't
+		 * really a timeout event, adjust error mask and
+		 * cancel frozen state.
+		 */
+		if (qc->err_mask == AC_ERR_TIMEOUT && host_stat & ATA_DMA_ERR) {
+			qc->err_mask = AC_ERR_HOST_BUS;
+			thaw = 1;
+		}
+
+		ap->ops->bmdma_stop(qc);
+	}
+
+	ata_altstatus(ap);
+	ata_chk_status(ap);
+	ap->ops->irq_clear(ap);
+
+	spin_unlock_irqrestore(&host_set->lock, flags);
+
+	if (thaw)
+		ata_eh_thaw_port(ap);
+
+	/* PIO and DMA engines have been stopped, perform recovery */
+	ata_do_eh(ap, softreset, hardreset, postreset);
+}
+
+/**
+ *	ata_bmdma_error_handler - Stock error handler for BMDMA controller
+ *	@ap: port to handle error for
+ *
+ *	Stock error handler for BMDMA controller.
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep)
+ */
+void ata_bmdma_error_handler(struct ata_port *ap)
+{
+	ata_reset_fn_t hardreset;
+
+	hardreset = NULL;
+	if (ata_scr_valid(ap))
+		hardreset = sata_std_hardreset;
+
+	ata_bmdma_drive_eh(ap, ata_std_softreset, hardreset, ata_std_postreset);
+}
+
+/**
+ *	ata_bmdma_post_internal_cmd - Stock post_internal_cmd for
+ *				      BMDMA controller
+ *	@qc: internal command to clean up
+ *
+ *	LOCKING:
+ *	Kernel thread context (may sleep)
+ */
+void ata_bmdma_post_internal_cmd(struct ata_queued_cmd *qc)
+{
+	ata_bmdma_stop(qc);
+}
+
 #ifdef CONFIG_PCI
 static struct ata_probe_ent *
 ata_probe_ent_alloc(struct device *dev, const struct ata_port_info *port)
diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c
index 35faf66..f50fea6 100644
--- a/drivers/scsi/libata-core.c
+++ b/drivers/scsi/libata-core.c
@@ -5377,6 +5377,11 @@ EXPORT_SYMBOL_GPL(ata_bmdma_start);
 EXPORT_SYMBOL_GPL(ata_bmdma_irq_clear);
 EXPORT_SYMBOL_GPL(ata_bmdma_status);
 EXPORT_SYMBOL_GPL(ata_bmdma_stop);
+EXPORT_SYMBOL_GPL(ata_bmdma_freeze);
+EXPORT_SYMBOL_GPL(ata_bmdma_thaw);
+EXPORT_SYMBOL_GPL(ata_bmdma_drive_eh);
+EXPORT_SYMBOL_GPL(ata_bmdma_error_handler);
+EXPORT_SYMBOL_GPL(ata_bmdma_post_internal_cmd);
 EXPORT_SYMBOL_GPL(ata_port_probe);
 EXPORT_SYMBOL_GPL(ata_set_sata_spd);
 EXPORT_SYMBOL_GPL(sata_phy_reset);
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 867f1d5..6ac840f 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -664,6 +664,14 @@ extern void ata_bmdma_start (struct ata_
 extern void ata_bmdma_stop(struct ata_queued_cmd *qc);
 extern u8   ata_bmdma_status(struct ata_port *ap);
 extern void ata_bmdma_irq_clear(struct ata_port *ap);
+extern void ata_bmdma_freeze(struct ata_port *ap);
+extern void ata_bmdma_thaw(struct ata_port *ap);
+extern void ata_bmdma_drive_eh(struct ata_port *ap,
+			       ata_reset_fn_t softreset,
+			       ata_reset_fn_t hardreset,
+			       ata_postreset_fn_t postreset);
+extern void ata_bmdma_error_handler(struct ata_port *ap);
+extern void ata_bmdma_post_internal_cmd(struct ata_queued_cmd *qc);
 extern void ata_qc_complete(struct ata_queued_cmd *qc);
 extern void ata_scsi_simulate(struct ata_device *dev, struct scsi_cmnd *cmd,
 			      void (*done)(struct scsi_cmnd *));
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/11] libata-eh: implement BMDMA EH
  2006-05-11 13:21 ` [PATCH 06/11] libata-eh: implement BMDMA EH Tejun Heo
@ 2006-05-13 22:21   ` Jeff Garzik
  2006-05-13 23:41     ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:21 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Implement stock BMDMA error handling methods.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> 
> ---
> 
>  drivers/scsi/libata-bmdma.c |  144 +++++++++++++++++++++++++++++++++++++++++++
>  drivers/scsi/libata-core.c  |    5 +
>  include/linux/libata.h      |    8 ++
>  3 files changed, 157 insertions(+), 0 deletions(-)
> 
> 33c8c944f97629b555fc6a14ca49803c47f9d50b
> diff --git a/drivers/scsi/libata-bmdma.c b/drivers/scsi/libata-bmdma.c
> index 835dff0..751ec18 100644
> --- a/drivers/scsi/libata-bmdma.c
> +++ b/drivers/scsi/libata-bmdma.c
> @@ -652,6 +652,150 @@ void ata_bmdma_stop(struct ata_queued_cm
>  	ata_altstatus(ap);        /* dummy read */
>  }
>  
> +/**
> + *	ata_bmdma_freeze - Freeze BMDMA controller port
> + *	@ap: port to freeze
> + *
> + *	Freeze BMDMA controller port.
> + *
> + *	LOCKING:
> + *	Inherited from caller.
> + */
> +void ata_bmdma_freeze(struct ata_port *ap)
> +{
> +	struct ata_ioports *ioaddr = &ap->ioaddr;
> +
> +	ap->ctl |= ATA_NIEN;
> +	ap->last_ctl = ap->ctl;
> +
> +	if (ap->flags & ATA_FLAG_MMIO)
> +		writeb(ap->ctl, (void __iomem *)ioaddr->ctl_addr);
> +	else
> +		outb(ap->ctl, ioaddr->ctl_addr);
> +}
> +
> +/**
> + *	ata_bmdma_thaw - Thaw BMDMA controller port
> + *	@ap: port to thaw
> + *
> + *	Thaw BMDMA controller port.
> + *
> + *	LOCKING:
> + *	Inherited from caller.
> + */
> +void ata_bmdma_thaw(struct ata_port *ap)
> +{
> +	/* clear & re-enable interrupts */
> +	ata_chk_status(ap);
> +	ap->ops->irq_clear(ap);
> +	if (ap->ioaddr.ctl_addr)	/* FIXME: hack. create a hook instead */
> +		ata_irq_on(ap);
> +}

What's the plan for the class of PATA controllers without Device Control 
shadow registers?

There are multiple examples of such in drivers/ide...

I ACK the patch's content, but don't forget we definitely need a plan 
for PATA controllers without nIEN support.

	Jeff





^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/11] libata-eh: implement BMDMA EH
  2006-05-13 22:21   ` Jeff Garzik
@ 2006-05-13 23:41     ` Tejun Heo
  2006-05-15 13:38       ` Alan Cox
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-13 23:41 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Jeff Garzik wrote:

> 
> What's the plan for the class of PATA controllers without Device Control 
> shadow registers?
> 
> There are multiple examples of such in drivers/ide...
> 
> I ACK the patch's content, but don't forget we definitely need a plan 
> for PATA controllers without nIEN support.

I can't think of any solution other than ack'ing and clearing interrupts 
blindly while frozen.

* it only happens from the freezing point until the device/controller is 
reset.

* we can't detect spurious interrupts during that period but nothing 
else is lost.  Also, please note that most traditional IDE controllers 
tend to have their own IRQ.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/11] libata-eh: implement BMDMA EH
  2006-05-13 23:41     ` Tejun Heo
@ 2006-05-15 13:38       ` Alan Cox
  2006-05-15 13:59         ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Alan Cox @ 2006-05-15 13:38 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Sul, 2006-05-14 at 08:41 +0900, Tejun Heo wrote: 
> else is lost.  Also, please note that most traditional IDE controllers 
> tend to have their own IRQ.

Not really. You normally have one IRQ per controller for any plug in
card and that is quite often shared.



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/11] libata-eh: implement BMDMA EH
  2006-05-15 13:38       ` Alan Cox
@ 2006-05-15 13:59         ` Tejun Heo
  2006-05-15 14:43           ` Alan Cox
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-15 13:59 UTC (permalink / raw)
  To: Alan Cox; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Alan Cox wrote:
> On Sul, 2006-05-14 at 08:41 +0900, Tejun Heo wrote: 
>> else is lost.  Also, please note that most traditional IDE controllers 
>> tend to have their own IRQ.
> 
> Not really. You normally have one IRQ per controller for any plug in
> card and that is quite often shared.
> 

I see.  But even with the shared IRQ, unconditionally clearing IRQ for
short duration wouldn't cause much problem, would it?

@ Out of curiosity, how old does a controller have to be to not have the
ctl shadow register?

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 06/11] libata-eh: implement BMDMA EH
  2006-05-15 13:59         ` Tejun Heo
@ 2006-05-15 14:43           ` Alan Cox
  0 siblings, 0 replies; 48+ messages in thread
From: Alan Cox @ 2006-05-15 14:43 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jeff Garzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Llu, 2006-05-15 at 22:59 +0900, Tejun Heo wrote:
> I see.  But even with the shared IRQ, unconditionally clearing IRQ for
> short duration wouldn't cause much problem, would it?

Clearing in what sense. If you just clear it then you may well just get
another one. If you ack it on the hardware then you need to recover
anything that may be lost before the next IRQ.

Wouldn't think its a big problem with BMDMA at all. If a given channel
interrupts us then we know what state it is in and we also know there
are no other outstanding commands even possible on the device. Since you
freeze the device not the channel (at least if I read the code right for
native mode paths) this ought to be fine. Legacy mode you freeze the
channel but there is a per channel IRQ (and the per channel host needs
fixing anyway).

The same need to clean up is true btw of PIO. Several controllers have
errata if you read a control register during a PIO transfer so you must
halt the PIO block transfer first. Intel PIIX4/440MX have this problem.

> @ Out of curiosity, how old does a controller have to be to not have the
> ctl shadow register?

You'll find controllers today that don't appear to have one (eg cf
flash)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 07/11] ata_piix: convert to new EH
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (5 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 06/11] libata-eh: implement BMDMA EH Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-13 22:23   ` Jeff Garzik
  2006-05-11 13:21 ` [PATCH 08/11] sata_sil: " Tejun Heo
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

ata_piix can use stock BMDMA EH routines.  Convert to new EH.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/ata_piix.c |   10 ++++++++--
 1 files changed, 8 insertions(+), 2 deletions(-)

9a3ac8d5632a5209d3b6fb1efd94b7bd86af24a3
diff --git a/drivers/scsi/ata_piix.c b/drivers/scsi/ata_piix.c
index af1d46e..e3184a7 100644
--- a/drivers/scsi/ata_piix.c
+++ b/drivers/scsi/ata_piix.c
@@ -243,7 +243,10 @@ static const struct ata_port_operations 
 	.qc_prep		= ata_qc_prep,
 	.qc_issue		= ata_qc_issue_prot,
 
-	.eng_timeout		= ata_eng_timeout,
+	.freeze			= ata_bmdma_freeze,
+	.thaw			= ata_bmdma_thaw,
+	.error_handler		= ata_bmdma_error_handler,
+	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 
 	.irq_handler		= ata_interrupt,
 	.irq_clear		= ata_bmdma_irq_clear,
@@ -271,7 +274,10 @@ static const struct ata_port_operations 
 	.qc_prep		= ata_qc_prep,
 	.qc_issue		= ata_qc_issue_prot,
 
-	.eng_timeout		= ata_eng_timeout,
+	.freeze			= ata_bmdma_freeze,
+	.thaw			= ata_bmdma_thaw,
+	.error_handler		= ata_bmdma_error_handler,
+	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 
 	.irq_handler		= ata_interrupt,
 	.irq_clear		= ata_bmdma_irq_clear,
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/11] ata_piix: convert to new EH
  2006-05-11 13:21 ` [PATCH 07/11] ata_piix: convert to new EH Tejun Heo
@ 2006-05-13 22:23   ` Jeff Garzik
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:23 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> ata_piix can use stock BMDMA EH routines.  Convert to new EH.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>

ACK



^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (6 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 07/11] ata_piix: convert to new EH Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-11 14:22   ` Alan Cox
  2006-05-13 22:26   ` Jeff Garzik
  2006-05-11 13:21 ` [PATCH 10/11] ahci: add PIOS interim interrupt handling Tejun Heo
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

Convert sata_sil to new EH.  As these controllers have hardware
interrupt mask and are known to have screaming interrupts issues, use
hardware IRQ masking for freezing.  sil_freeze() masks interrupts for
the port and sil_thaw() unmasks them.  As ports are automatically
frozen before probing reset, there is no need to initialize interrupt
masks sil_init_onde().  Remove related code.

Other than freezing, sata_sil uses stock BMDMA EH routines.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/sata_sil.c |   49 +++++++++++++++++++++++++++++++++--------------
 1 files changed, 34 insertions(+), 15 deletions(-)

0eb051e052c5472c8e2abb0ab5caa267f6f72437
diff --git a/drivers/scsi/sata_sil.c b/drivers/scsi/sata_sil.c
index bfcece1..3e998b7 100644
--- a/drivers/scsi/sata_sil.c
+++ b/drivers/scsi/sata_sil.c
@@ -96,6 +96,8 @@ static void sil_dev_config(struct ata_po
 static u32 sil_scr_read (struct ata_port *ap, unsigned int sc_reg);
 static void sil_scr_write (struct ata_port *ap, unsigned int sc_reg, u32 val);
 static void sil_post_set_mode (struct ata_port *ap);
+static void sil_freeze(struct ata_port *ap);
+static void sil_thaw(struct ata_port *ap);
 
 
 static const struct pci_device_id sil_pci_tbl[] = {
@@ -174,7 +176,10 @@ static const struct ata_port_operations 
 	.bmdma_status		= ata_bmdma_status,
 	.qc_prep		= ata_qc_prep,
 	.qc_issue		= ata_qc_issue_prot,
-	.eng_timeout		= ata_eng_timeout,
+	.freeze			= sil_freeze,
+	.thaw			= sil_thaw,
+	.error_handler		= ata_bmdma_error_handler,
+	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
 	.irq_handler		= ata_interrupt,
 	.irq_clear		= ata_bmdma_irq_clear,
 	.scr_read		= sil_scr_read,
@@ -314,6 +319,33 @@ static void sil_scr_write (struct ata_po
 		writel(val, mmio);
 }
 
+static void sil_freeze(struct ata_port *ap)
+{
+	void __iomem *mmio_base = ap->host_set->mmio_base;
+	u32 tmp;
+
+	/* plug IRQ */
+	tmp = readl(mmio_base + SIL_SYSCFG);
+	tmp |= SIL_MASK_IDE0_INT << ap->port_no;
+	writel(tmp, mmio_base + SIL_SYSCFG);
+	readl(mmio_base + SIL_SYSCFG);	/* flush */
+}
+
+static void sil_thaw(struct ata_port *ap)
+{
+	void __iomem *mmio_base = ap->host_set->mmio_base;
+	u32 tmp;
+
+	/* clear IRQ */
+	ata_chk_status(ap);
+	ap->ops->irq_clear(ap);
+
+	/* turn on IRQ */
+	tmp = readl(mmio_base + SIL_SYSCFG);
+	tmp &= ~(SIL_MASK_IDE0_INT << ap->port_no);
+	writel(tmp, mmio_base + SIL_SYSCFG);
+}
+
 /**
  *	sil_dev_config - Apply device/host-specific errata fixups
  *	@ap: Port containing device to be examined
@@ -384,7 +416,7 @@ static int sil_init_one (struct pci_dev 
 	int rc;
 	unsigned int i;
 	int pci_dev_busy = 0;
-	u32 tmp, irq_mask;
+	u32 tmp;
 	u8 cls;
 
 	if (!printed_version++)
@@ -474,24 +506,11 @@ static int sil_init_one (struct pci_dev 
 	}
 
 	if (ent->driver_data == sil_3114) {
-		irq_mask = SIL_MASK_4PORT;
-
 		/* flip the magic "make 4 ports work" bit */
 		tmp = readl(mmio_base + sil_port[2].bmdma);
 		if ((tmp & SIL_INTR_STEERING) == 0)
 			writel(tmp | SIL_INTR_STEERING,
 			       mmio_base + sil_port[2].bmdma);
-
-	} else {
-		irq_mask = SIL_MASK_2PORT;
-	}
-
-	/* make sure IDE0/1/2/3 interrupts are not masked */
-	tmp = readl(mmio_base + SIL_SYSCFG);
-	if (tmp & irq_mask) {
-		tmp &= ~irq_mask;
-		writel(tmp, mmio_base + SIL_SYSCFG);
-		readl(mmio_base + SIL_SYSCFG);	/* flush */
 	}
 
 	/* mask all SATA phy-related interrupts */
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 13:21 ` [PATCH 08/11] sata_sil: " Tejun Heo
@ 2006-05-11 14:22   ` Alan Cox
  2006-05-11 14:39     ` Tejun Heo
  2006-05-13 22:26   ` Jeff Garzik
  1 sibling, 1 reply; 48+ messages in thread
From: Alan Cox @ 2006-05-11 14:22 UTC (permalink / raw)
  To: Tejun Heo; +Cc: jgarzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Iau, 2006-05-11 at 22:21 +0900, Tejun Heo wrote:
> Convert sata_sil to new EH.  As these controllers have hardware
> interrupt mask and are known to have screaming interrupts issues, use
> hardware IRQ masking for freezing.  sil_freeze() masks interrupts for
> the port and sil_thaw() unmasks them.  As ports are automatically
> frozen before probing reset, there is no need to initialize interrupt
> masks sil_init_onde().  Remove related code.

Word of caution. When you mask an interrupt on chip like this remember
that an IRQ may still be pending so the freeze is *not* safe for
synchronization, merely for shutting the chip up. Otherwise this patch
set looks pretty nice.

Patchsets all look good with my PATA hat on. The dev->ap stuff tidies up
the ->data_xfer methods I need to add and the other stuff fits nicely.

Alan


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 14:22   ` Alan Cox
@ 2006-05-11 14:39     ` Tejun Heo
  2006-05-11 15:46       ` Alan Cox
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 14:39 UTC (permalink / raw)
  To: Alan Cox; +Cc: jgarzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Hello, Alan.

Alan Cox wrote:
> On Iau, 2006-05-11 at 22:21 +0900, Tejun Heo wrote:
>> Convert sata_sil to new EH.  As these controllers have hardware
>> interrupt mask and are known to have screaming interrupts issues, use
>> hardware IRQ masking for freezing.  sil_freeze() masks interrupts for
>> the port and sil_thaw() unmasks them.  As ports are automatically
>> frozen before probing reset, there is no need to initialize interrupt
>> masks sil_init_onde().  Remove related code.
> 
> Word of caution. When you mask an interrupt on chip like this remember
> that an IRQ may still be pending so the freeze is *not* safe for
> synchronization, merely for shutting the chip up. Otherwise this patch
> set looks pretty nice.

I'm not sure what you mean by synchronization.  But, freezing is indeed
 for shutting the chip up.  If the port is frozen, EH doesn't touch
active qcs nor tries to issue new command.  The port is thawed/used only
after the port & device are put into known good state via successful
reset.  Is this what you're talking about?

> Patchsets all look good with my PATA hat on. The dev->ap stuff tidies up
> the ->data_xfer methods I need to add and the other stuff fits nicely.

Great.  I haven't removed @ap from methods in ata_port_operations() yet
because the change affects all LLDDs and later ata_link addition would
require another sweep all over again.  I think the change can wait until
EH stuff settles down.

Thanks for reviewing this.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 14:39     ` Tejun Heo
@ 2006-05-11 15:46       ` Alan Cox
  2006-05-11 15:45         ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Alan Cox @ 2006-05-11 15:46 UTC (permalink / raw)
  To: Tejun Heo; +Cc: jgarzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Iau, 2006-05-11 at 23:39 +0900, Tejun Heo wrote:
> I'm not sure what you mean by synchronization.  But, freezing is indeed
>  for shutting the chip up.  If the port is frozen, EH doesn't touch
> active qcs nor tries to issue new command.  The port is thawed/used only
> after the port & device are put into known good state via successful
> reset.  Is this what you're talking about?

Sort of. I wanted to point out that this can occur

	IRQ is issued on chip

	->freeze
	returns

	->intr finally arrives and is serviced

Alan


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 15:46       ` Alan Cox
@ 2006-05-11 15:45         ` Tejun Heo
  2006-05-11 16:12           ` Alan Cox
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 15:45 UTC (permalink / raw)
  To: Alan Cox; +Cc: jgarzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Alan Cox wrote:
> On Iau, 2006-05-11 at 23:39 +0900, Tejun Heo wrote:
>> I'm not sure what you mean by synchronization.  But, freezing is indeed
>>  for shutting the chip up.  If the port is frozen, EH doesn't touch
>> active qcs nor tries to issue new command.  The port is thawed/used only
>> after the port & device are put into known good state via successful
>> reset.  Is this what you're talking about?
> 
> Sort of. I wanted to point out that this can occur
> 
> 	IRQ is issued on chip
> 
> 	->freeze
> 	returns
> 
> 	->intr finally arrives and is serviced

I see.  As far as EH is concerned, this should be okay.  Once qc
ownership is transferred, irq handler cannot access the qc
(ata_qc_from_tag() prevents it) and has no way other than handling it as
a spurious interrupt or HSM violation.  But, it sounds like it could
result in screaming IRQ for controllers without IRQ pending bit.  Could
it?  If so, maybe unconditional irq clearing in ata_interrupt() should
be resurrected to handle such conditions.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 15:45         ` Tejun Heo
@ 2006-05-11 16:12           ` Alan Cox
  2006-05-11 16:10             ` Tejun Heo
  0 siblings, 1 reply; 48+ messages in thread
From: Alan Cox @ 2006-05-11 16:12 UTC (permalink / raw)
  To: Tejun Heo; +Cc: jgarzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Gwe, 2006-05-12 at 00:45 +0900, Tejun Heo wrote:
> a spurious interrupt or HSM violation.  But, it sounds like it could
> result in screaming IRQ for controllers without IRQ pending bit.  Could
> it?  If so, maybe unconditional irq clearing in ata_interrupt() should
> be resurrected to handle such conditions.

Not sure how the sil24 handles it but the PATA chips seem (its not well
documented) to drop any pending interrupt once it is masked, thus you
would take the interrupt once (or maybe 2 or 3 times in weird obscure
corner cases on a pentium II SMP box) and then no more.

Alan


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 16:12           ` Alan Cox
@ 2006-05-11 16:10             ` Tejun Heo
  2006-05-11 17:16               ` Alan Cox
  0 siblings, 1 reply; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 16:10 UTC (permalink / raw)
  To: Alan Cox; +Cc: jgarzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

Alan Cox wrote:
> On Gwe, 2006-05-12 at 00:45 +0900, Tejun Heo wrote:
>> a spurious interrupt or HSM violation.  But, it sounds like it could
>> result in screaming IRQ for controllers without IRQ pending bit.  Could
>> it?  If so, maybe unconditional irq clearing in ata_interrupt() should
>> be resurrected to handle such conditions.
> 
> Not sure how the sil24 handles it but the PATA chips seem (its not well

I'm pretty sure most modern controllers including sil24 drops IRQ once
masked.  Also, as they have IRQ pending bit, such IRQs can easily be
cleared.

> documented) to drop any pending interrupt once it is masked, thus you
> would take the interrupt once (or maybe 2 or 3 times in weird obscure
> corner cases on a pentium II SMP box) and then no more.

So, they will drop interrupt after several unhandled IRQs at most even
if the driver doesn't read TF status register to clear it, right?

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 16:10             ` Tejun Heo
@ 2006-05-11 17:16               ` Alan Cox
  0 siblings, 0 replies; 48+ messages in thread
From: Alan Cox @ 2006-05-11 17:16 UTC (permalink / raw)
  To: Tejun Heo; +Cc: jgarzik, axboe, albertcc, forrest.zhao, efalk, linux-ide

On Gwe, 2006-05-12 at 01:10 +0900, Tejun Heo wrote:
> > documented) to drop any pending interrupt once it is masked, thus you
> > would take the interrupt once (or maybe 2 or 3 times in weird obscure
> > corner cases on a pentium II SMP box) and then no more.
> 
> So, they will drop interrupt after several unhandled IRQs at most even
> if the driver doesn't read TF status register to clear it, right?

Yes. The IRQ is no longer asserted at source so will not be re-triggered


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-11 13:21 ` [PATCH 08/11] sata_sil: " Tejun Heo
  2006-05-11 14:22   ` Alan Cox
@ 2006-05-13 22:26   ` Jeff Garzik
  2006-05-13 23:43     ` Tejun Heo
  1 sibling, 1 reply; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:26 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Convert sata_sil to new EH.  As these controllers have hardware
> interrupt mask and are known to have screaming interrupts issues, use
> hardware IRQ masking for freezing.  sil_freeze() masks interrupts for
> the port and sil_thaw() unmasks them.  As ports are automatically
> frozen before probing reset, there is no need to initialize interrupt
> masks sil_init_onde().  Remove related code.
> 
> Other than freezing, sata_sil uses stock BMDMA EH routines.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>
> 
> ---
> 
>  drivers/scsi/sata_sil.c |   49 +++++++++++++++++++++++++++++++++--------------
>  1 files changed, 34 insertions(+), 15 deletions(-)
> 
> 0eb051e052c5472c8e2abb0ab5caa267f6f72437
> diff --git a/drivers/scsi/sata_sil.c b/drivers/scsi/sata_sil.c
> index bfcece1..3e998b7 100644
> --- a/drivers/scsi/sata_sil.c
> +++ b/drivers/scsi/sata_sil.c
> @@ -96,6 +96,8 @@ static void sil_dev_config(struct ata_po
>  static u32 sil_scr_read (struct ata_port *ap, unsigned int sc_reg);
>  static void sil_scr_write (struct ata_port *ap, unsigned int sc_reg, u32 val);
>  static void sil_post_set_mode (struct ata_port *ap);
> +static void sil_freeze(struct ata_port *ap);
> +static void sil_thaw(struct ata_port *ap);
>  
>  
>  static const struct pci_device_id sil_pci_tbl[] = {
> @@ -174,7 +176,10 @@ static const struct ata_port_operations 
>  	.bmdma_status		= ata_bmdma_status,
>  	.qc_prep		= ata_qc_prep,
>  	.qc_issue		= ata_qc_issue_prot,
> -	.eng_timeout		= ata_eng_timeout,
> +	.freeze			= sil_freeze,
> +	.thaw			= sil_thaw,
> +	.error_handler		= ata_bmdma_error_handler,
> +	.post_internal_cmd	= ata_bmdma_post_internal_cmd,
>  	.irq_handler		= ata_interrupt,
>  	.irq_clear		= ata_bmdma_irq_clear,
>  	.scr_read		= sil_scr_read,
> @@ -314,6 +319,33 @@ static void sil_scr_write (struct ata_po
>  		writel(val, mmio);
>  }
>  
> +static void sil_freeze(struct ata_port *ap)
> +{
> +	void __iomem *mmio_base = ap->host_set->mmio_base;
> +	u32 tmp;
> +
> +	/* plug IRQ */
> +	tmp = readl(mmio_base + SIL_SYSCFG);
> +	tmp |= SIL_MASK_IDE0_INT << ap->port_no;
> +	writel(tmp, mmio_base + SIL_SYSCFG);
> +	readl(mmio_base + SIL_SYSCFG);	/* flush */
> +}
> +
> +static void sil_thaw(struct ata_port *ap)
> +{
> +	void __iomem *mmio_base = ap->host_set->mmio_base;
> +	u32 tmp;
> +
> +	/* clear IRQ */
> +	ata_chk_status(ap);
> +	ap->ops->irq_clear(ap);

General policy:  An LLDD should directly call a function it implements, 
rather than a pointer found via one or more pointer de-refs.  Eliminates 
the pointer derefs, and gives more information to the compiler (and/or 
debugger).


> +	/* turn on IRQ */
> +	tmp = readl(mmio_base + SIL_SYSCFG);
> +	tmp &= ~(SIL_MASK_IDE0_INT << ap->port_no);
> +	writel(tmp, mmio_base + SIL_SYSCFG);
> +}
> +
>  /**
>   *	sil_dev_config - Apply device/host-specific errata fixups
>   *	@ap: Port containing device to be examined
> @@ -384,7 +416,7 @@ static int sil_init_one (struct pci_dev 
>  	int rc;
>  	unsigned int i;
>  	int pci_dev_busy = 0;
> -	u32 tmp, irq_mask;
> +	u32 tmp;
>  	u8 cls;
>  
>  	if (!printed_version++)
> @@ -474,24 +506,11 @@ static int sil_init_one (struct pci_dev 
>  	}
>  
>  	if (ent->driver_data == sil_3114) {
> -		irq_mask = SIL_MASK_4PORT;
> -
>  		/* flip the magic "make 4 ports work" bit */
>  		tmp = readl(mmio_base + sil_port[2].bmdma);
>  		if ((tmp & SIL_INTR_STEERING) == 0)
>  			writel(tmp | SIL_INTR_STEERING,
>  			       mmio_base + sil_port[2].bmdma);
> -
> -	} else {
> -		irq_mask = SIL_MASK_2PORT;
> -	}
> -
> -	/* make sure IDE0/1/2/3 interrupts are not masked */
> -	tmp = readl(mmio_base + SIL_SYSCFG);
> -	if (tmp & irq_mask) {
> -		tmp &= ~irq_mask;
> -		writel(tmp, mmio_base + SIL_SYSCFG);
> -		readl(mmio_base + SIL_SYSCFG);	/* flush */
>  	}
>  
>  	/* mask all SATA phy-related interrupts */


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/11] sata_sil: convert to new EH
  2006-05-13 22:26   ` Jeff Garzik
@ 2006-05-13 23:43     ` Tejun Heo
  0 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-13 23:43 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Jeff Garzik wrote:
> 
> General policy:  An LLDD should directly call a function it implements, 
> rather than a pointer found via one or more pointer de-refs.  Eliminates 
> the pointer derefs, and gives more information to the compiler (and/or 
> debugger).

I see.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 10/11] ahci: add PIOS interim interrupt handling
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (7 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 08/11] sata_sil: " Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-11 13:21 ` [PATCH 09/11] ahci: convert to new EH Tejun Heo
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

During multiblock PIO, multiple PIOS interrupts are generated before
qc compltion.  Current code prints unnecessary message for such cases.
This is exposed when new EH slows down attached device into PIO mode.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/ahci.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)

ad7facff3665ce6527929ec5b9c6b2a04689010c
diff --git a/drivers/scsi/ahci.c b/drivers/scsi/ahci.c
index c2b9c93..680dd5b 100644
--- a/drivers/scsi/ahci.c
+++ b/drivers/scsi/ahci.c
@@ -899,7 +899,18 @@ static void ahci_host_intr(struct ata_po
 		}
 	}
 
-	/* spurious interrupt */
+	/* hmmm... a spurious interupt */
+
+	/* ignore interim PIO setup fis interrupts */
+	if (ata_tag_valid(ap->active_tag)) {
+		struct ata_queued_cmd *qc =
+			ata_qc_from_tag(ap, ap->active_tag);
+
+		if (qc && qc->tf.protocol == ATA_PROT_PIO &&
+		    (status & PORT_IRQ_PIOS_FIS))
+			return;
+	}
+
 	if (ata_ratelimit())
 		ata_port_printk(ap, KERN_INFO, "spurious interrupt "
 				"(irq_stat 0x%x active_tag %d)\n",
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 09/11] ahci: convert to new EH
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (8 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 10/11] ahci: add PIOS interim interrupt handling Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-13 10:53   ` Tejun Heo
  2006-05-13 22:30   ` Jeff Garzik
  2006-05-11 13:21 ` [PATCH 11/11] sata_sil24: " Tejun Heo
  2006-05-13 22:34 ` [PATCHSET 03/11] new EH implementation, take 3 Jeff Garzik
  11 siblings, 2 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

Convert AHCI to new EH.  Unfortunately, ICH7 AHCI reacts badly if IRQ
mask is diddled during operation.  So, freezing is implemented by
unconditionally clearing interrupt conditions while frozen.

* Interrupts are categorized according to required action.
  e.g. Connection status or unknown FIS error requires freezing the
  port while TF or HBUS_DATA don't.

* Only CONNECT (reflects SErr.X) interrupt is taken into account not
  PHYRDY (SErr.N), as CONNECT is better cue for starting EH.

* AHCI may be invoked without any active command.  e.g. CONNECT irq
  occuring while no qc in progress still triggers EH and will reset
  the port and revalidate attached device.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/ahci.c |  222 +++++++++++++++++++++++++++++----------------------
 1 files changed, 127 insertions(+), 95 deletions(-)

9e72390810673cbad9f1a994e7b2dd897c924178
diff --git a/drivers/scsi/ahci.c b/drivers/scsi/ahci.c
index a4fb8d0..c2b9c93 100644
--- a/drivers/scsi/ahci.c
+++ b/drivers/scsi/ahci.c
@@ -71,6 +71,7 @@ enum {
 	AHCI_CMD_CLR_BUSY	= (1 << 10),
 
 	RX_FIS_D2H_REG		= 0x40,	/* offset of D2H Register FIS data */
+	RX_FIS_UNK		= 0x60, /* offset of Unknown FIS data */
 
 	board_ahci		= 0,
 	board_ahci_vt8251	= 1,
@@ -128,15 +129,16 @@ enum {
 	PORT_IRQ_PIOS_FIS	= (1 << 1), /* PIO Setup FIS rx'd */
 	PORT_IRQ_D2H_REG_FIS	= (1 << 0), /* D2H Register FIS rx'd */
 
-	PORT_IRQ_FATAL		= PORT_IRQ_TF_ERR |
-				  PORT_IRQ_HBUS_ERR |
-				  PORT_IRQ_HBUS_DATA_ERR |
-				  PORT_IRQ_IF_ERR,
-	DEF_PORT_IRQ		= PORT_IRQ_FATAL | PORT_IRQ_PHYRDY |
-				  PORT_IRQ_CONNECT | PORT_IRQ_SG_DONE |
-				  PORT_IRQ_UNK_FIS | PORT_IRQ_SDB_FIS |
-				  PORT_IRQ_DMAS_FIS | PORT_IRQ_PIOS_FIS |
-				  PORT_IRQ_D2H_REG_FIS,
+	PORT_IRQ_FREEZE		= PORT_IRQ_HBUS_ERR |
+				  PORT_IRQ_IF_ERR |
+				  PORT_IRQ_CONNECT |
+				  PORT_IRQ_UNK_FIS,
+	PORT_IRQ_ERROR		= PORT_IRQ_FREEZE |
+				  PORT_IRQ_TF_ERR |
+				  PORT_IRQ_HBUS_DATA_ERR,
+	DEF_PORT_IRQ		= PORT_IRQ_ERROR | PORT_IRQ_SG_DONE |
+				  PORT_IRQ_SDB_FIS | PORT_IRQ_DMAS_FIS |
+				  PORT_IRQ_PIOS_FIS | PORT_IRQ_D2H_REG_FIS,
 
 	/* PORT_CMD bits */
 	PORT_CMD_ATAPI		= (1 << 24), /* Device is ATAPI */
@@ -197,13 +199,13 @@ static unsigned int ahci_qc_issue(struct
 static irqreturn_t ahci_interrupt (int irq, void *dev_instance, struct pt_regs *regs);
 static int ahci_probe_reset(struct ata_port *ap, unsigned int *classes);
 static void ahci_irq_clear(struct ata_port *ap);
-static void ahci_eng_timeout(struct ata_port *ap);
 static int ahci_port_start(struct ata_port *ap);
 static void ahci_port_stop(struct ata_port *ap);
 static void ahci_tf_read(struct ata_port *ap, struct ata_taskfile *tf);
 static void ahci_qc_prep(struct ata_queued_cmd *qc);
 static u8 ahci_check_status(struct ata_port *ap);
-static inline int ahci_host_intr(struct ata_port *ap, struct ata_queued_cmd *qc);
+static void ahci_error_handler(struct ata_port *ap);
+static void ahci_post_internal_cmd(struct ata_queued_cmd *qc);
 static void ahci_remove_one (struct pci_dev *pdev);
 
 static struct scsi_host_template ahci_sht = {
@@ -237,14 +239,15 @@ static const struct ata_port_operations 
 	.qc_prep		= ahci_qc_prep,
 	.qc_issue		= ahci_qc_issue,
 
-	.eng_timeout		= ahci_eng_timeout,
-
 	.irq_handler		= ahci_interrupt,
 	.irq_clear		= ahci_irq_clear,
 
 	.scr_read		= ahci_scr_read,
 	.scr_write		= ahci_scr_write,
 
+	.error_handler		= ahci_error_handler,
+	.post_internal_cmd	= ahci_post_internal_cmd,
+
 	.port_start		= ahci_port_start,
 	.port_stop		= ahci_port_stop,
 };
@@ -681,10 +684,16 @@ static int ahci_hardreset(struct ata_por
 static void ahci_postreset(struct ata_port *ap, unsigned int *class)
 {
 	void __iomem *port_mmio = (void __iomem *) ap->ioaddr.cmd_addr;
+	unsigned long flags;
 	u32 new_tmp, tmp;
 
 	ata_std_postreset(ap, class);
 
+	/* clear stored SError */
+	spin_lock_irqsave(&ap->host_set->lock, flags);
+	ap->eh_info.serror = 0;
+	spin_unlock_irqrestore(&ap->host_set->lock, flags);
+
 	/* Make sure port's ATAPI bit is set appropriately */
 	new_tmp = tmp = readl(port_mmio + PORT_CMD);
 	if (*class == ATA_DEV_ATAPI)
@@ -789,108 +798,112 @@ static void ahci_qc_prep(struct ata_queu
 	ahci_fill_cmd_slot(pp, opts);
 }
 
-static void ahci_restart_port(struct ata_port *ap, u32 irq_stat)
+static void ahci_error_intr(struct ata_port *ap, u32 irq_stat)
 {
-	void __iomem *mmio = ap->host_set->mmio_base;
-	void __iomem *port_mmio = ahci_port_base(mmio, ap->port_no);
-	u32 tmp;
+	struct ahci_port_priv *pp = ap->private_data;
+	struct ata_eh_info *ehi = &ap->eh_info;
+	unsigned int err_mask = 0, action = 0;
+	struct ata_queued_cmd *qc;
+	u32 serror;
 
-	if ((ap->device[0].class != ATA_DEV_ATAPI) ||
-	    ((irq_stat & PORT_IRQ_TF_ERR) == 0))
-		ata_port_printk(ap, KERN_WARNING, "port reset, "
-		       "p_is %x is %x pis %x cmd %x tf %x ss %x se %x\n",
-			irq_stat,
-			readl(mmio + HOST_IRQ_STAT),
-			readl(port_mmio + PORT_IRQ_STAT),
-			readl(port_mmio + PORT_CMD),
-			readl(port_mmio + PORT_TFDATA),
-			readl(port_mmio + PORT_SCR_STAT),
-			readl(port_mmio + PORT_SCR_ERR));
+	ata_ehi_clear_desc(ehi);
 
-	/* stop DMA */
-	ahci_stop_engine(ap);
+	/* AHCI needs SError cleared; otherwise, it might lock up */
+	serror = ahci_scr_read(ap, SCR_ERROR);
+	ahci_scr_write(ap, SCR_ERROR, serror);
 
-	/* clear SATA phy error, if any */
-	tmp = readl(port_mmio + PORT_SCR_ERR);
-	writel(tmp, port_mmio + PORT_SCR_ERR);
+	/* analyze @irq_stat */
+	ata_ehi_push_desc(ehi, "irq_stat 0x%08x", irq_stat);
 
-	/* if DRQ/BSY is set, device needs to be reset.
-	 * if so, issue COMRESET
-	 */
-	tmp = readl(port_mmio + PORT_TFDATA);
-	if (tmp & (ATA_BUSY | ATA_DRQ)) {
-		writel(0x301, port_mmio + PORT_SCR_CTL);
-		readl(port_mmio + PORT_SCR_CTL); /* flush */
-		udelay(10);
-		writel(0x300, port_mmio + PORT_SCR_CTL);
-		readl(port_mmio + PORT_SCR_CTL); /* flush */
+	if (irq_stat & PORT_IRQ_TF_ERR)
+		err_mask |= AC_ERR_DEV;
+
+	if (irq_stat & (PORT_IRQ_HBUS_ERR | PORT_IRQ_HBUS_DATA_ERR)) {
+		err_mask |= AC_ERR_HOST_BUS;
+		action |= ATA_EH_SOFTRESET;
 	}
 
-	/* re-start DMA */
-	ahci_start_engine(ap);
-}
+	if (irq_stat & PORT_IRQ_IF_ERR) {
+		err_mask |= AC_ERR_ATA_BUS;
+		action |= ATA_EH_SOFTRESET;
+		ata_ehi_push_desc(ehi, ", interface fatal error");
+	}
 
-static void ahci_eng_timeout(struct ata_port *ap)
-{
-	struct ata_host_set *host_set = ap->host_set;
-	void __iomem *mmio = host_set->mmio_base;
-	void __iomem *port_mmio = ahci_port_base(mmio, ap->port_no);
-	struct ata_queued_cmd *qc;
-	unsigned long flags;
+	if (irq_stat & (PORT_IRQ_CONNECT | PORT_IRQ_PHYRDY)) {
+		err_mask |= AC_ERR_ATA_BUS;
+		action |= ATA_EH_SOFTRESET;
+		ata_ehi_push_desc(ehi, ", %s", irq_stat & PORT_IRQ_CONNECT ?
+			"connection status changed" : "PHY RDY changed");
+	}
 
-	ata_port_printk(ap, KERN_WARNING, "handling error/timeout\n");
+	if (irq_stat & PORT_IRQ_UNK_FIS) {
+		u32 *unk = (u32 *)(pp->rx_fis + RX_FIS_UNK);
 
-	spin_lock_irqsave(&host_set->lock, flags);
+		err_mask |= AC_ERR_HSM;
+		action |= ATA_EH_SOFTRESET;
+		ata_ehi_push_desc(ehi, ", unknown FIS %08x %08x %08x %08x",
+				  unk[0], unk[1], unk[2], unk[3]);
+	}
 
-	ahci_restart_port(ap, readl(port_mmio + PORT_IRQ_STAT));
-	qc = ata_qc_from_tag(ap, ap->active_tag);
-	qc->err_mask |= AC_ERR_TIMEOUT;
+	/* okay, let's hand over to EH */
+	ehi->serror |= serror;
+	ehi->action |= action;
 
-	spin_unlock_irqrestore(&host_set->lock, flags);
+	qc = ata_qc_from_tag(ap, ap->active_tag);
+	if (qc)
+		qc->err_mask |= err_mask;
+	else
+		ehi->err_mask |= err_mask;
 
-	ata_eh_qc_complete(qc);
+	if (irq_stat & PORT_IRQ_FREEZE)
+		ata_port_freeze(ap);
+	else
+		ata_port_abort(ap);
 }
 
-static inline int ahci_host_intr(struct ata_port *ap, struct ata_queued_cmd *qc)
+static void ahci_host_intr(struct ata_port *ap)
 {
 	void __iomem *mmio = ap->host_set->mmio_base;
 	void __iomem *port_mmio = ahci_port_base(mmio, ap->port_no);
-	u32 status, serr, ci;
-
-	serr = readl(port_mmio + PORT_SCR_ERR);
-	writel(serr, port_mmio + PORT_SCR_ERR);
+	struct ata_eh_info *ehi = &ap->eh_info;
+	struct ata_queued_cmd *qc;
+	u32 status, serror, ci;
 
 	status = readl(port_mmio + PORT_IRQ_STAT);
 	writel(status, port_mmio + PORT_IRQ_STAT);
 
-	ci = readl(port_mmio + PORT_CMD_ISSUE);
-	if (likely((ci & 0x1) == 0)) {
-		if (qc) {
-			WARN_ON(qc->err_mask);
-			ata_qc_complete(qc);
-			qc = NULL;
-		}
+	/* AHCI gets unhappy if IRQ mask is diddled with while the
+	 * port is active, so we cannot disable IRQ when freezing.
+	 * Clear IRQ conditions and hope screaming IRQs don't happen.
+	 */
+	if (unlikely(ap->flags & ATA_FLAG_FROZEN)) {
+		/* some AHCI errors hang the controller until SError
+		 * is cleared.  Store & clear it.
+		 */
+		serror = ahci_scr_read(ap, SCR_ERROR);
+		ahci_scr_write(ap, SCR_ERROR, serror);
+		ehi->serror |= serror;
+		return;
 	}
 
-	if (status & PORT_IRQ_FATAL) {
-		unsigned int err_mask;
-		if (status & PORT_IRQ_TF_ERR)
-			err_mask = AC_ERR_DEV;
-		else if (status & PORT_IRQ_IF_ERR)
-			err_mask = AC_ERR_ATA_BUS;
-		else
-			err_mask = AC_ERR_HOST_BUS;
-
-		/* command processing has stopped due to error; restart */
-		ahci_restart_port(ap, status);
+	if (unlikely(status & PORT_IRQ_ERROR)) {
+		ahci_error_intr(ap, status);
+		return;
+	}
 
-		if (qc) {
-			qc->err_mask |= err_mask;
+	if ((qc = ata_qc_from_tag(ap, ap->active_tag))) {
+		ci = readl(port_mmio + PORT_CMD_ISSUE);
+		if ((ci & 0x1) == 0) {
 			ata_qc_complete(qc);
+			return;
 		}
 	}
 
-	return 1;
+	/* spurious interrupt */
+	if (ata_ratelimit())
+		ata_port_printk(ap, KERN_INFO, "spurious interrupt "
+				"(irq_stat 0x%x active_tag %d)\n",
+				status, ap->active_tag);
 }
 
 static void ahci_irq_clear(struct ata_port *ap)
@@ -927,14 +940,7 @@ static irqreturn_t ahci_interrupt (int i
 
 		ap = host_set->ports[i];
 		if (ap) {
-			struct ata_queued_cmd *qc;
-			qc = ata_qc_from_tag(ap, ap->active_tag);
-			if (!ahci_host_intr(ap, qc))
-				if (ata_ratelimit())
-					dev_printk(KERN_WARNING, host_set->dev,
-					  "unhandled interrupt on port %u\n",
-					  i);
-
+			ahci_host_intr(ap);
 			VPRINTK("port %u\n", i);
 		} else {
 			VPRINTK("port %u (no irq)\n", i);
@@ -951,7 +957,7 @@ static irqreturn_t ahci_interrupt (int i
 		handled = 1;
 	}
 
-        spin_unlock(&host_set->lock);
+	spin_unlock(&host_set->lock);
 
 	VPRINTK("EXIT\n");
 
@@ -969,6 +975,32 @@ static unsigned int ahci_qc_issue(struct
 	return 0;
 }
 
+static void ahci_error_handler(struct ata_port *ap)
+{
+	if (!(ap->flags & ATA_FLAG_FROZEN)) {
+		/* restart engine */
+		ahci_stop_engine(ap);
+		ahci_start_engine(ap);
+	}
+
+	/* perform recovery */
+	ata_do_eh(ap, ahci_softreset, ahci_hardreset, ahci_postreset);
+}
+
+static void ahci_post_internal_cmd(struct ata_queued_cmd *qc)
+{
+	struct ata_port *ap = qc->ap;
+
+	if (qc->flags & ATA_QCFLAG_FAILED)
+		qc->err_mask |= AC_ERR_OTHER;
+
+	if (qc->err_mask) {
+		/* make DMA engine forget about the failed command */
+		ahci_stop_engine(ap);
+		ahci_start_engine(ap);
+	}
+}
+
 static void ahci_setup_port(struct ata_ioports *port, unsigned long base,
 			    unsigned int port_idx)
 {
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 09/11] ahci: convert to new EH
  2006-05-11 13:21 ` [PATCH 09/11] ahci: convert to new EH Tejun Heo
@ 2006-05-13 10:53   ` Tejun Heo
  2006-05-13 22:30   ` Jeff Garzik
  1 sibling, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-13 10:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Convert AHCI to new EH.  Unfortunately, ICH7 AHCI reacts badly if IRQ
> mask is diddled during operation.  So, freezing is implemented by
> unconditionally clearing interrupt conditions while frozen.

I've been testing AHCI EH again for PM and it seems I might have 
confused device problem and HBA problem when implementing AHCI EH the 
first time.  I'm still verifying things bug IRQ diddling doesn't seems 
to cause any problem.

Please don't pull this patchset till things are verified.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 09/11] ahci: convert to new EH
  2006-05-11 13:21 ` [PATCH 09/11] ahci: convert to new EH Tejun Heo
  2006-05-13 10:53   ` Tejun Heo
@ 2006-05-13 22:30   ` Jeff Garzik
  2006-05-13 23:49     ` Tejun Heo
  1 sibling, 1 reply; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:30 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> Convert AHCI to new EH.  Unfortunately, ICH7 AHCI reacts badly if IRQ
> mask is diddled during operation.  So, freezing is implemented by
> unconditionally clearing interrupt conditions while frozen.

For standard BMDMA hardware, you -absolutely must- stop DMA, before 
doing -anything- else, including EH/recovery.

This sort of scenario will be quite common.  I would recommend doing so 
on sata_sil, sata_promise, ...


> * Interrupts are categorized according to required action.
>   e.g. Connection status or unknown FIS error requires freezing the
>   port while TF or HBUS_DATA don't.
> 
> * Only CONNECT (reflects SErr.X) interrupt is taken into account not
>   PHYRDY (SErr.N), as CONNECT is better cue for starting EH.
> 
> * AHCI may be invoked without any active command.  e.g. CONNECT irq
>   occuring while no qc in progress still triggers EH and will reset
>   the port and revalidate attached device.
> 
> Signed-off-by: Tejun Heo <htejun@gmail.com>

ACK


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 09/11] ahci: convert to new EH
  2006-05-13 22:30   ` Jeff Garzik
@ 2006-05-13 23:49     ` Tejun Heo
  0 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-13 23:49 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Jeff Garzik wrote:
> Tejun Heo wrote:
>> Convert AHCI to new EH.  Unfortunately, ICH7 AHCI reacts badly if IRQ
>> mask is diddled during operation.  So, freezing is implemented by
>> unconditionally clearing interrupt conditions while frozen.
> 
> For standard BMDMA hardware, you -absolutely must- stop DMA, before 
> doing -anything- else, including EH/recovery.
> 
> This sort of scenario will be quite common.  I would recommend doing so 
> on sata_sil, sata_promise, ...
>

AHCI problem seems to be my honest mistake.  I misinterpreted device 
failures as HBA failures.  For ahci and sata_sil24, making them forget 
about pending commands take time, so it's awkward to do while freezing. 
  And, as they are pretty modern controllers, I expect them to do the 
right thing when IRQ mask is diddled with.

For more traditional BMDMA controllers, I agree with you.  I'll turn off 
BMDMA in bmdma_freeze() before setting nIEN.  But, for sata_sil, I wanna 
  leave it as it is.  The code is tested on sil3112 and 3114 and both 
work fine.  It's not always possible but I wanna avoid adding code which 
isn't necessary to LLDDs.  After a while, nobody would remember why the 
code is there and people will assume sata_sil needs it, which isn't true.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 11/11] sata_sil24: convert to new EH
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (9 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 09/11] ahci: convert to new EH Tejun Heo
@ 2006-05-11 13:21 ` Tejun Heo
  2006-05-13 22:34 ` [PATCHSET 03/11] new EH implementation, take 3 Jeff Garzik
  11 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-11 13:21 UTC (permalink / raw)
  To: jgarzik, alan, axboe, albertcc, forrest.zhao, efalk, linux-ide; +Cc: Tejun Heo

Convert sata_sil24 to new EH.

* When port is frozen, IRQ for the port is masked.

* sil24_softreset() doesn't need to mangle with IRQ mask anymore.
  libata ensures that the port is frozen during reset.

* Only turn on interrupts which are handled by interrupt handler and
  EH.  As we don't handle SDB notify yet, turn it off. DEV_XCHG and
  UNK_FIS are handled by EH and thus turned on.

* sil24_softreset() usually fails to recover the port after DEV_XCHG.
  ATA_PORT_HARDRESET is used as recovery action for DEV_XCHG.

* sil24 may be invoked without any active command.  e.g. DEV_XCHG irq
  occuring while no qc in progress still triggers EH and will reset
  the port and revalidate attached device.

Signed-off-by: Tejun Heo <htejun@gmail.com>

---

 drivers/scsi/sata_sil24.c |  330 ++++++++++++++++++++++++++-------------------
 1 files changed, 188 insertions(+), 142 deletions(-)

5b5d5120add6debc3e80842b94464678aeee4036
diff --git a/drivers/scsi/sata_sil24.c b/drivers/scsi/sata_sil24.c
index 6144f7c..adbb5a6 100644
--- a/drivers/scsi/sata_sil24.c
+++ b/drivers/scsi/sata_sil24.c
@@ -156,6 +156,9 @@ enum {
 	PORT_IRQ_HANDSHAKE	= (1 << 10), /* handshake error threshold */
 	PORT_IRQ_SDB_NOTIFY	= (1 << 11), /* SDB notify received */
 
+	DEF_PORT_IRQ		= PORT_IRQ_COMPLETE | PORT_IRQ_ERROR |
+				  PORT_IRQ_DEV_XCHG | PORT_IRQ_UNK_FIS,
+
 	/* bits[27:16] are unmasked (raw) */
 	PORT_IRQ_RAW_SHIFT	= 16,
 	PORT_IRQ_MASKED_MASK	= 0x7ff,
@@ -242,6 +245,58 @@ union sil24_cmd_block {
 	struct sil24_atapi_block atapi;
 };
 
+static struct sil24_cerr_info {
+	unsigned int err_mask, action;
+	const char *desc;
+} sil24_cerr_db[] = {
+	[0]			= { AC_ERR_DEV, ATA_EH_REVALIDATE,
+				    "device error" },
+	[PORT_CERR_DEV]		= { AC_ERR_DEV, ATA_EH_REVALIDATE,
+				    "device error via D2H FIS" },
+	[PORT_CERR_SDB]		= { AC_ERR_DEV, ATA_EH_REVALIDATE,
+				    "device error via SDB FIS" },
+	[PORT_CERR_DATA]	= { AC_ERR_ATA_BUS, ATA_EH_SOFTRESET,
+				    "error in data FIS" },
+	[PORT_CERR_SEND]	= { AC_ERR_ATA_BUS, ATA_EH_SOFTRESET,
+				    "failed to transmit command FIS" },
+	[PORT_CERR_INCONSISTENT] = { AC_ERR_HSM, ATA_EH_SOFTRESET,
+				     "protocol mismatch" },
+	[PORT_CERR_DIRECTION]	= { AC_ERR_HSM, ATA_EH_SOFTRESET,
+				    "data directon mismatch" },
+	[PORT_CERR_UNDERRUN]	= { AC_ERR_HSM, ATA_EH_SOFTRESET,
+				    "ran out of SGEs while writing" },
+	[PORT_CERR_OVERRUN]	= { AC_ERR_HSM, ATA_EH_SOFTRESET,
+				    "ran out of SGEs while reading" },
+	[PORT_CERR_PKT_PROT]	= { AC_ERR_HSM, ATA_EH_SOFTRESET,
+				    "invalid data directon for ATAPI CDB" },
+	[PORT_CERR_SGT_BOUNDARY] = { AC_ERR_SYSTEM, ATA_EH_SOFTRESET,
+				     "SGT no on qword boundary" },
+	[PORT_CERR_SGT_TGTABRT]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI target abort while fetching SGT" },
+	[PORT_CERR_SGT_MSTABRT]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI master abort while fetching SGT" },
+	[PORT_CERR_SGT_PCIPERR]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI parity error while fetching SGT" },
+	[PORT_CERR_CMD_BOUNDARY] = { AC_ERR_SYSTEM, ATA_EH_SOFTRESET,
+				     "PRB not on qword boundary" },
+	[PORT_CERR_CMD_TGTABRT]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI target abort while fetching PRB" },
+	[PORT_CERR_CMD_MSTABRT]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI master abort while fetching PRB" },
+	[PORT_CERR_CMD_PCIPERR]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI parity error while fetching PRB" },
+	[PORT_CERR_XFR_UNDEF]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "undefined error while transferring data" },
+	[PORT_CERR_XFR_TGTABRT]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI target abort while transferring data" },
+	[PORT_CERR_XFR_MSTABRT]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI master abort while transferring data" },
+	[PORT_CERR_XFR_PCIPERR]	= { AC_ERR_HOST_BUS, ATA_EH_SOFTRESET,
+				    "PCI parity error while transferring data" },
+	[PORT_CERR_SENDSERVICE]	= { AC_ERR_HSM, ATA_EH_SOFTRESET,
+				    "FIS received while sending service FIS" },
+};
+
 /*
  * ap->private_data
  *
@@ -269,8 +324,11 @@ static int sil24_probe_reset(struct ata_
 static void sil24_qc_prep(struct ata_queued_cmd *qc);
 static unsigned int sil24_qc_issue(struct ata_queued_cmd *qc);
 static void sil24_irq_clear(struct ata_port *ap);
-static void sil24_eng_timeout(struct ata_port *ap);
 static irqreturn_t sil24_interrupt(int irq, void *dev_instance, struct pt_regs *regs);
+static void sil24_freeze(struct ata_port *ap);
+static void sil24_thaw(struct ata_port *ap);
+static void sil24_error_handler(struct ata_port *ap);
+static void sil24_post_internal_cmd(struct ata_queued_cmd *qc);
 static int sil24_port_start(struct ata_port *ap);
 static void sil24_port_stop(struct ata_port *ap);
 static void sil24_host_stop(struct ata_host_set *host_set);
@@ -325,14 +383,17 @@ static const struct ata_port_operations 
 	.qc_prep		= sil24_qc_prep,
 	.qc_issue		= sil24_qc_issue,
 
-	.eng_timeout		= sil24_eng_timeout,
-
 	.irq_handler		= sil24_interrupt,
 	.irq_clear		= sil24_irq_clear,
 
 	.scr_read		= sil24_scr_read,
 	.scr_write		= sil24_scr_write,
 
+	.freeze			= sil24_freeze,
+	.thaw			= sil24_thaw,
+	.error_handler		= sil24_error_handler,
+	.post_internal_cmd	= sil24_post_internal_cmd,
+
 	.port_start		= sil24_port_start,
 	.port_stop		= sil24_port_stop,
 	.host_stop		= sil24_host_stop,
@@ -459,7 +520,7 @@ static int sil24_softreset(struct ata_po
 	struct sil24_port_priv *pp = ap->private_data;
 	struct sil24_prb *prb = &pp->cmd_block[0].ata.prb;
 	dma_addr_t paddr = pp->cmd_block_dma;
-	u32 mask, irq_enable, irq_stat;
+	u32 mask, irq_stat;
 	const char *reason;
 
 	DPRINTK("ENTER\n");
@@ -470,10 +531,6 @@ static int sil24_softreset(struct ata_po
 		goto out;
 	}
 
-	/* temporarily turn off IRQs during SRST */
-	irq_enable = readl(port + PORT_IRQ_ENABLE_SET);
-	writel(irq_enable, port + PORT_IRQ_ENABLE_CLR);
-
 	/* put the port into known state */
 	if (sil24_init_port(ap)) {
 		reason ="port not ready";
@@ -494,9 +551,6 @@ static int sil24_softreset(struct ata_po
 	writel(irq_stat, port + PORT_IRQ_STAT); /* clear IRQs */
 	irq_stat >>= PORT_IRQ_RAW_SHIFT;
 
-	/* restore IRQs */
-	writel(irq_enable, port + PORT_IRQ_ENABLE_SET);
-
 	if (!(irq_stat & PORT_IRQ_COMPLETE)) {
 		if (irq_stat & PORT_IRQ_ERROR)
 			reason = "SRST command error";
@@ -655,158 +709,134 @@ static void sil24_irq_clear(struct ata_p
 	/* unused */
 }
 
-static int __sil24_restart_controller(void __iomem *port)
+static void sil24_freeze(struct ata_port *ap)
 {
-	u32 tmp;
-	int cnt;
-
-	writel(PORT_CS_INIT, port + PORT_CTRL_STAT);
-
-	/* Max ~10ms */
-	for (cnt = 0; cnt < 10000; cnt++) {
-		tmp = readl(port + PORT_CTRL_STAT);
-		if (tmp & PORT_CS_RDY)
-			return 0;
-		udelay(1);
-	}
-
-	return -1;
-}
+	void __iomem *port = (void __iomem *)ap->ioaddr.cmd_addr;
 
-static void sil24_restart_controller(struct ata_port *ap)
-{
-	if (__sil24_restart_controller((void __iomem *)ap->ioaddr.cmd_addr))
-		printk(KERN_ERR DRV_NAME
-		       " ata%u: failed to restart controller\n", ap->id);
+	/* Port-wide IRQ mask in HOST_CTRL doesn't really work, clear
+	 * PORT_IRQ_ENABLE instead.
+	 */
+	writel(0xffff, port + PORT_IRQ_ENABLE_CLR);
 }
 
-static int __sil24_reset_controller(void __iomem *port)
+static void sil24_thaw(struct ata_port *ap)
 {
-	int cnt;
+	void __iomem *port = (void __iomem *)ap->ioaddr.cmd_addr;
 	u32 tmp;
 
-	/* Reset controller state.  Is this correct? */
-	writel(PORT_CS_DEV_RST, port + PORT_CTRL_STAT);
-	readl(port + PORT_CTRL_STAT);	/* sync */
-
-	/* Max ~100ms */
-	for (cnt = 0; cnt < 1000; cnt++) {
-		udelay(100);
-		tmp = readl(port + PORT_CTRL_STAT);
-		if (!(tmp & PORT_CS_DEV_RST))
-			break;
-	}
-
-	if (tmp & PORT_CS_DEV_RST)
-		return -1;
+	/* clear IRQ */
+	tmp = readl(port + PORT_IRQ_STAT);
+	writel(tmp, port + PORT_IRQ_STAT);
 
-	if (tmp & PORT_CS_RDY)
-		return 0;
-
-	return __sil24_restart_controller(port);
+	/* turn IRQ back on */
+	writel(DEF_PORT_IRQ, port + PORT_IRQ_ENABLE_SET);
 }
 
-static void sil24_reset_controller(struct ata_port *ap)
+static void sil24_error_intr(struct ata_port *ap)
 {
-	printk(KERN_NOTICE DRV_NAME
-	       " ata%u: resetting controller...\n", ap->id);
-	if (__sil24_reset_controller((void __iomem *)ap->ioaddr.cmd_addr))
-                printk(KERN_ERR DRV_NAME
-                       " ata%u: failed to reset controller\n", ap->id);
-}
-
-static void sil24_eng_timeout(struct ata_port *ap)
-{
-	struct ata_queued_cmd *qc;
-
-	qc = ata_qc_from_tag(ap, ap->active_tag);
-
-	ata_port_printk(ap, KERN_ERR, "command timeout\n");
-	qc->err_mask |= AC_ERR_TIMEOUT;
-	ata_eh_qc_complete(qc);
-
-	sil24_reset_controller(ap);
-}
-
-static void sil24_error_intr(struct ata_port *ap, u32 slot_stat)
-{
-	struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->active_tag);
-	struct sil24_port_priv *pp = ap->private_data;
 	void __iomem *port = (void __iomem *)ap->ioaddr.cmd_addr;
-	u32 irq_stat, cmd_err, sstatus, serror;
-	unsigned int err_mask;
+	struct ata_eh_info *ehi = &ap->eh_info;
+	int freeze = 0;
+	u32 irq_stat;
 
+	/* on error, we need to clear IRQ explicitly */
 	irq_stat = readl(port + PORT_IRQ_STAT);
-	writel(irq_stat, port + PORT_IRQ_STAT);		/* clear irq */
+	writel(irq_stat, port + PORT_IRQ_STAT);
 
-	if (!(irq_stat & PORT_IRQ_ERROR)) {
-		/* ignore non-completion, non-error irqs for now */
-		printk(KERN_WARNING DRV_NAME
-		       "ata%u: non-error exception irq (irq_stat %x)\n",
-		       ap->id, irq_stat);
-		return;
-	}
+	/* first, analyze and record host port events */
+	ata_ehi_clear_desc(ehi);
 
-	cmd_err = readl(port + PORT_CMD_ERR);
-	sstatus = readl(port + PORT_SSTATUS);
-	serror = readl(port + PORT_SERROR);
-	if (serror)
-		writel(serror, port + PORT_SERROR);
+	ata_ehi_push_desc(ehi, "irq_stat 0x%08x", irq_stat);
 
-	/*
-	 * Don't log ATAPI device errors.  They're supposed to happen
-	 * and any serious errors will be logged using sense data by
-	 * the SCSI layer.
-	 */
-	if (ap->device[0].class != ATA_DEV_ATAPI || cmd_err > PORT_CERR_SDB)
-		printk("ata%u: error interrupt on port%d\n"
-		       "  stat=0x%x irq=0x%x cmd_err=%d sstatus=0x%x serror=0x%x\n",
-		       ap->id, ap->port_no, slot_stat, irq_stat, cmd_err, sstatus, serror);
-
-	if (cmd_err == PORT_CERR_DEV || cmd_err == PORT_CERR_SDB) {
-		/*
-		 * Device is reporting error, tf registers are valid.
-		 */
-		sil24_update_tf(ap);
-		err_mask = ac_err_mask(pp->tf.command);
-		sil24_restart_controller(ap);
-	} else {
-		/*
-		 * Other errors.  libata currently doesn't have any
-		 * mechanism to report these errors.  Just turn on
-		 * ATA_ERR.
+	if (irq_stat & PORT_IRQ_DEV_XCHG) {
+		ehi->err_mask |= AC_ERR_ATA_BUS;
+		/* sil24 doesn't recover very well from phy
+		 * disconnection with a softreset.  Force hardreset.
 		 */
-		err_mask = AC_ERR_OTHER;
-		sil24_reset_controller(ap);
-	}
+		ehi->action |= ATA_EH_HARDRESET;
+		ata_ehi_push_desc(ehi, ", device_exchanged");
+		freeze = 1;
+	}
+
+	if (irq_stat & PORT_IRQ_UNK_FIS) {
+		ehi->err_mask |= AC_ERR_HSM;
+		ehi->action |= ATA_EH_SOFTRESET;
+		ata_ehi_push_desc(ehi , ", unknown FIS");
+		freeze = 1;
+	}
+
+	/* deal with command error */
+	if (irq_stat & PORT_IRQ_ERROR) {
+		struct sil24_cerr_info *ci = NULL;
+		unsigned int err_mask = 0, action = 0;
+		struct ata_queued_cmd *qc;
+		u32 cerr;
+
+		/* analyze CMD_ERR */
+		cerr = readl(port + PORT_CMD_ERR);
+		if (cerr < ARRAY_SIZE(sil24_cerr_db))
+			ci = &sil24_cerr_db[cerr];
+
+		if (ci && ci->desc) {
+			err_mask |= ci->err_mask;
+			action |= ci->action;
+			ata_ehi_push_desc(ehi, ", %s", ci->desc);
+		} else {
+			err_mask |= AC_ERR_OTHER;
+			action |= ATA_EH_SOFTRESET;
+			ata_ehi_push_desc(ehi, ", unknown command error %d",
+					  cerr);
+		}
 
-	if (qc) {
-		qc->err_mask |= err_mask;
-		ata_qc_complete(qc);
+		/* record error info */
+		qc = ata_qc_from_tag(ap, ap->active_tag);
+		if (qc) {
+			int tag = qc->tag;
+			if (unlikely(ata_tag_internal(tag)))
+				tag = 0;
+			sil24_update_tf(ap);
+			qc->err_mask |= err_mask;
+		} else
+			ehi->err_mask |= err_mask;
+
+		ehi->action |= action;
 	}
+
+	/* freeze or abort */
+	if (freeze)
+		ata_port_freeze(ap);
+	else
+		ata_port_abort(ap);
 }
 
 static inline void sil24_host_intr(struct ata_port *ap)
 {
-	struct ata_queued_cmd *qc = ata_qc_from_tag(ap, ap->active_tag);
 	void __iomem *port = (void __iomem *)ap->ioaddr.cmd_addr;
+	struct ata_queued_cmd *qc;
 	u32 slot_stat;
 
 	slot_stat = readl(port + PORT_SLOT_STAT);
-	if (!(slot_stat & HOST_SSTAT_ATTN)) {
-		struct sil24_port_priv *pp = ap->private_data;
 
-		if (ap->flags & SIL24_FLAG_PCIX_IRQ_WOC)
-			writel(PORT_IRQ_COMPLETE, port + PORT_IRQ_STAT);
+	if (unlikely(slot_stat & HOST_SSTAT_ATTN)) {
+		sil24_error_intr(ap);
+		return;
+	}
 
-		if (qc) {
-			if (qc->flags & ATA_QCFLAG_RESULT_TF)
-				sil24_update_tf(ap);
-			qc->err_mask |= ac_err_mask(pp->tf.command);
-			ata_qc_complete(qc);
-		}
-	} else
-		sil24_error_intr(ap, slot_stat);
+	if (ap->flags & SIL24_FLAG_PCIX_IRQ_WOC)
+		writel(PORT_IRQ_COMPLETE, port + PORT_IRQ_STAT);
+
+	qc = ata_qc_from_tag(ap, ap->active_tag);
+	if (qc) {
+		if (qc->flags & ATA_QCFLAG_RESULT_TF)
+			sil24_update_tf(ap);
+		ata_qc_complete(qc);
+		return;
+	}
+
+	if (ata_ratelimit())
+		ata_port_printk(ap, KERN_INFO, "spurious interrupt "
+			"(slot_stat 0x%x active_tag %d)\n",
+			slot_stat, ap->active_tag);
 }
 
 static irqreturn_t sil24_interrupt(int irq, void *dev_instance, struct pt_regs *regs)
@@ -846,6 +876,31 @@ static irqreturn_t sil24_interrupt(int i
 	return IRQ_RETVAL(handled);
 }
 
+static void sil24_error_handler(struct ata_port *ap)
+{
+	struct ata_eh_context *ehc = &ap->eh_context;
+
+	if (sil24_init_port(ap)) {
+		ata_eh_freeze_port(ap);
+		ehc->i.action |= ATA_EH_HARDRESET;
+	}
+
+	/* perform recovery */
+	ata_do_eh(ap, sil24_softreset, sil24_hardreset, ata_std_postreset);
+}
+
+static void sil24_post_internal_cmd(struct ata_queued_cmd *qc)
+{
+	struct ata_port *ap = qc->ap;
+
+	if (qc->flags & ATA_QCFLAG_FAILED)
+		qc->err_mask |= AC_ERR_OTHER;
+
+	/* make DMA engine forget about the failed command */
+	if (qc->err_mask)
+		sil24_init_port(ap);
+}
+
 static inline void sil24_cblk_free(struct sil24_port_priv *pp, struct device *dev)
 {
 	const size_t cb_size = sizeof(*pp->cmd_block);
@@ -1058,15 +1113,6 @@ static int sil24_init_one(struct pci_dev
 		/* Always use 64bit activation */
 		writel(PORT_CS_32BIT_ACTV, port + PORT_CTRL_CLR);
 
-		/* Configure interrupts */
-		writel(0xffff, port + PORT_IRQ_ENABLE_CLR);
-		writel(PORT_IRQ_COMPLETE | PORT_IRQ_ERROR |
-		       PORT_IRQ_SDB_NOTIFY, port + PORT_IRQ_ENABLE_SET);
-
-		/* Clear interrupts */
-		writel(0x0fff0fff, port + PORT_IRQ_STAT);
-		writel(PORT_CS_IRQ_WOC, port + PORT_CTRL_CLR);
-
 		/* Clear port multiplier enable and resume bits */
 		writel(PORT_CS_PM_EN | PORT_CS_RESUME, port + PORT_CTRL_CLR);
 	}
-- 
1.2.4



^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCHSET 03/11] new EH implementation, take 3
  2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
                   ` (10 preceding siblings ...)
  2006-05-11 13:21 ` [PATCH 11/11] sata_sil24: " Tejun Heo
@ 2006-05-13 22:34 ` Jeff Garzik
  2006-05-13 23:58   ` Tejun Heo
  11 siblings, 1 reply; 48+ messages in thread
From: Jeff Garzik @ 2006-05-13 22:34 UTC (permalink / raw)
  To: Tejun Heo; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Tejun Heo wrote:
> This is part of patchset series described in [T].
> 
> This is the third take of new-EH-implementation patchset.  Changes
> from the last take[L] are
[...]
> [T] http://article.gmane.org/gmane.linux.ide/9957
> [L] http://article.gmane.org/gmane.linux.ide/9540
> [1] http://article.gmane.org/gmane.linux.ide/9959
> [2] http://article.gmane.org/gmane.linux.ide/9984

Aside from one minor requested change, this patchset is ACK'd, but with 
a few comments and worries noted.

Once this goes in we need to move all the drivers over as soon as 
possible, and get them all re-tested.

	Jeff



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCHSET 03/11] new EH implementation, take 3
  2006-05-13 22:34 ` [PATCHSET 03/11] new EH implementation, take 3 Jeff Garzik
@ 2006-05-13 23:58   ` Tejun Heo
  0 siblings, 0 replies; 48+ messages in thread
From: Tejun Heo @ 2006-05-13 23:58 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: alan, axboe, albertcc, forrest.zhao, efalk, linux-ide

Jeff Garzik wrote:
> Tejun Heo wrote:
>> This is part of patchset series described in [T].
>>
>> This is the third take of new-EH-implementation patchset.  Changes
>> from the last take[L] are
> [...]
>> [T] http://article.gmane.org/gmane.linux.ide/9957
>> [L] http://article.gmane.org/gmane.linux.ide/9540
>> [1] http://article.gmane.org/gmane.linux.ide/9959
>> [2] http://article.gmane.org/gmane.linux.ide/9984
> 
> Aside from one minor requested change, this patchset is ACK'd, but with 
> a few comments and worries noted.
> 
> Once this goes in we need to move all the drivers over as soon as 
> possible, and get them all re-tested.
> 

I will be a lot of work.  As new EH uses features unused before, it 
sometimes exposes new hardware problems (things like sata_sil24 port 
wide IRQ plugging doesn't really work).  Those problems take time to 
diagnose and work around.  Also, different strains of a controller might 
show different symptoms.  So, we need a lot of testing.

I agree that drivers should be converted ASAP but honestly don't know 
how fast those can be done; however, libata will continue to work with 
all the old-EH drivers as well as before, so we have some headroom.

-- 
tejun

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2006-05-15 18:22 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-11 13:21 [PATCHSET 03/11] new EH implementation, take 3 Tejun Heo
2006-05-11 13:21 ` [PATCH 05/11] libata-eh: implement new EH Tejun Heo
2006-05-13 22:19   ` Jeff Garzik
2006-05-11 13:21 ` [PATCH 04/11] libata-eh: implement ata_eh_info and ata_eh_context Tejun Heo
2006-05-11 13:21 ` [PATCH 01/11] libata-eh: add ATA and libata flags for new EH Tejun Heo
2006-05-13 22:15   ` Jeff Garzik
2006-05-11 13:21 ` [PATCH 02/11] libata-eh: implement ata_ering Tejun Heo
2006-05-13 22:16   ` Jeff Garzik
2006-05-13 23:36     ` Tejun Heo
2006-05-14  1:05       ` Jeff Garzik
2006-05-14  1:20         ` Tejun Heo
2006-05-14  1:32           ` Jeff Garzik
2006-05-14  1:38             ` Tejun Heo
2006-05-15 13:36             ` Alan Cox
2006-05-15 14:00               ` Tejun Heo
2006-05-15 14:25                 ` Tejun Heo
2006-05-15 14:50                   ` Alan Cox
2006-05-15 14:57                     ` Tejun Heo
2006-05-15 15:19                       ` Alan Cox
2006-05-15 15:19                         ` Tejun Heo
2006-05-15 18:22                           ` Jeff Garzik
2006-05-11 13:21 ` [PATCH 03/11] libata-eh: add per-dev ata_ering Tejun Heo
2006-05-11 13:21 ` [PATCH 06/11] libata-eh: implement BMDMA EH Tejun Heo
2006-05-13 22:21   ` Jeff Garzik
2006-05-13 23:41     ` Tejun Heo
2006-05-15 13:38       ` Alan Cox
2006-05-15 13:59         ` Tejun Heo
2006-05-15 14:43           ` Alan Cox
2006-05-11 13:21 ` [PATCH 07/11] ata_piix: convert to new EH Tejun Heo
2006-05-13 22:23   ` Jeff Garzik
2006-05-11 13:21 ` [PATCH 08/11] sata_sil: " Tejun Heo
2006-05-11 14:22   ` Alan Cox
2006-05-11 14:39     ` Tejun Heo
2006-05-11 15:46       ` Alan Cox
2006-05-11 15:45         ` Tejun Heo
2006-05-11 16:12           ` Alan Cox
2006-05-11 16:10             ` Tejun Heo
2006-05-11 17:16               ` Alan Cox
2006-05-13 22:26   ` Jeff Garzik
2006-05-13 23:43     ` Tejun Heo
2006-05-11 13:21 ` [PATCH 10/11] ahci: add PIOS interim interrupt handling Tejun Heo
2006-05-11 13:21 ` [PATCH 09/11] ahci: convert to new EH Tejun Heo
2006-05-13 10:53   ` Tejun Heo
2006-05-13 22:30   ` Jeff Garzik
2006-05-13 23:49     ` Tejun Heo
2006-05-11 13:21 ` [PATCH 11/11] sata_sil24: " Tejun Heo
2006-05-13 22:34 ` [PATCHSET 03/11] new EH implementation, take 3 Jeff Garzik
2006-05-13 23:58   ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).