[PATCH v8 00/13] libsas error handling + discovery v8

* [PATCH v8 00/13] libsas error handling + discovery v8
@ 2012-02-10  8:44 Dan Williams
  2012-02-10  8:44 ` [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race Dan Williams
                   ` (12 more replies)
  0 siblings, 13 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:44 UTC
  To: linux-scsi; +Cc: linux-ide

Changes since v7: http://marc.info/?l=linux-scsi&m=132739159230095&w=2

1/ Dropped "libsas: feed the scsi_block_when_processing_errors() meter"
   [1].  Patches 1-3 are just resends of the patches affected by that
   rebase.

2/ Fixed up the phy identify data that is exported to userspace.
   sas_phy/phyX/device_type and sas_phy/phyX/target_port_protocols are
   now populated more reliably.  Patches 4-5.

3/ User-disabled ssp phys are no longer inadvertently re-enabled by
   error recovery (they need an explicit reset from userspace).  This
   enables scripted hotplug testing.  Patches 6-8.

4/ Fixed up the eh_strategy_handlers to prevent them being called from
   outside eh context, added an abort handler, and made sure all resets
   are notified to the lldd.  Patches 9-12.

5/ A small diet for sas_task to move slow path infrastructure out of the
   fast path use case.  Patch 13.

[1]: http://marc.info/?l=linux-scsi&m=132677728817896&w=2

---

[PATCH 01/13] libsas: close scsi_remove_target() vs libata-eh race
[PATCH 02/13] libsas: improve debug statements
[PATCH 03/13] libsas: async ata scanning
[PATCH 04/13] libsas: set attached device type and target protocols for local phys
[PATCH 05/13] libsas: fixup target_port_protocols for expanders that don't report sata
[PATCH 06/13] libsas: don't recover end devices attached to disabled phys
[PATCH 07/13] scsi_transport_sas: 'enable' phys on reset
[PATCH 08/13] libsas: libsas.force_hard_reset module parameter
[PATCH 09/13] libsas: enforce eh strategy handlers only in eh context
[PATCH 10/13] libsas: add sas_eh_abort_handler
[PATCH 11/13] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler
[PATCH 12/13] isci: use sas eh strategy handlers
[PATCH 13/13] libsas: trim sas_task of slow path infrastructure

The following changes since commit d65b4e98d7ea3038b767b70fe8be959b2913f16d:

  Linux 3.3-rc3 (2012-02-08 19:21:53 -0800)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci.git libsas-eh-reworks-v8

Dan Williams (58):
      libsas: remove unused ata_task_resp fields
      libsas: kill sas_slave_destroy
      libsas: fix domain_device leak
      libsas: fix leak of dev->sata_dev.identify_[packet_]device
      libsas: replace event locks with atomic bitops
      libsas: convert ha->state to flags
      libsas: introduce sas_drain_work()
      libsas: remove ata_port.lock management duties from lldds
      libsas: convert dev->gone to flags
      libsas: prevent domain rediscovery competing with ata error handling
      libsas: use ->set_dmamode to notify lldds of NCQ parameters
      libsas: kill invocation of scsi_eh_finish_cmd from sas_ata_task_done
      libsas: close error handling vs sas_ata_task_done() race
      libsas: prevent double completion of scmds from eh
      libsas: fix timeout vs completion race
      libsas: let libata handle command timeouts
      libsas: defer SAS_TASK_NEED_DEV_RESET commands to libata
      libsas: use libata-eh-reset for sata rediscovery fis transmit failures
      libsas: perform sas-transport resets in shost->workq context
      libsas: execute transport link resets with libata-eh via host workqueue
      libsas: sas_phy_enable via transport_sas_phy_reset
      libsas: async ata-eh
      libsas: poll for ata device readiness after reset
      libsas: don't mark expanders as gone when a child device is removed
      libsas: check for 'gone' expanders in smp_execute_task()
      libsas: fix sas_find_local_phy(), take phy references
      libsas: don't recover 'gone' devices in sas_ata_hard_reset()
      isci: kill iphy->isci_port lookups
      isci: kill isci_port->status
      isci: fix interpretation of "hard" reset
      isci: stop interpreting ->lldd_lu_reset() as an ata soft-reset
      isci: ->lldd_ata_check_ready handler
      isci: remove bus and reset handlers
      isci: remove IDEV_EH hack to disable "discovery-time" ata resets
      libsas: pre-clean commands that won the eh vs completion race
      libsas: mark all domain devices gone if root port disappears
      libsas: close scsi_remove_target() vs libata-eh race
      libsas: fix mixed topology recovery
      libsas: route local link resets through ata-eh
      libsas: fix sas_unregister_ports vs sas_drain_work
      libsas: improve debug statements
      libsas: fix sas port naming
      libsas: let libata recover links that fail to transmit initial sig-fis
      libsas: delete device on sas address changed
      libsas: restore scan order
      libsas: async ata scanning
      libsas: fix lifetime of SAS_HA_FROZEN
      libsas: revert ata srst
      libsas: set attached device type and target protocols for local phys
      libsas: fixup target_port_protocols for expanders that don't report sata
      libsas: don't recover end devices attached to disabled phys
      scsi_transport_sas: 'enable' phys on reset
      libsas: libsas.force_hard_reset module parameter
      libsas: enforce eh strategy handlers only in eh context
      libsas: add sas_eh_abort_handler
      libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler
      isci: use sas eh strategy handlers
      libsas: trim sas_task of slow path infrastructure

Jeff Skirvin (2):
      libsas: Remove redundant phy state notification calls.
      libsas: add mutex for SMP task execution

Maciej Trela (1):
      libsas: kill spurious sas_put_device

 Documentation/kernel-parameters.txt |    6 +
 Documentation/scsi/libsas.txt       |   15 -
 drivers/ata/libata-core.c           |   34 +-
 drivers/ata/libata-eh.c             |    1 +
 drivers/ata/libata-scsi.c           |   13 +
 drivers/ata/libata.h                |    2 +-
 drivers/scsi/aic94xx/aic94xx.h      |    2 +
 drivers/scsi/aic94xx/aic94xx_dev.c  |   38 +-
 drivers/scsi/aic94xx/aic94xx_init.c |    6 +-
 drivers/scsi/aic94xx/aic94xx_tmf.c  |   11 +-
 drivers/scsi/isci/host.c            |    8 +-
 drivers/scsi/isci/host.h            |   19 +-
 drivers/scsi/isci/init.c            |   17 +-
 drivers/scsi/isci/phy.c             |   18 +-
 drivers/scsi/isci/phy.h             |    1 -
 drivers/scsi/isci/port.c            |  217 +++++-----
 drivers/scsi/isci/port.h            |   11 +-
 drivers/scsi/isci/remote_device.c   |   32 +-
 drivers/scsi/isci/remote_device.h   |    7 +-
 drivers/scsi/isci/request.c         |  198 +--------
 drivers/scsi/isci/request.h         |    9 +-
 drivers/scsi/isci/task.c            |  158 ++------
 drivers/scsi/isci/task.h            |   40 --
 drivers/scsi/libsas/sas_ata.c       |  827 +++++++++++++++++------------------
 drivers/scsi/libsas/sas_discover.c  |  216 ++++++++--
 drivers/scsi/libsas/sas_event.c     |   96 ++++-
 drivers/scsi/libsas/sas_expander.c  |  360 ++++++++++------
 drivers/scsi/libsas/sas_host_smp.c  |   11 +-
 drivers/scsi/libsas/sas_init.c      |  243 +++++++++--
 drivers/scsi/libsas/sas_internal.h  |   97 +++--
 drivers/scsi/libsas/sas_phy.c       |   12 +-
 drivers/scsi/libsas/sas_port.c      |   32 +-
 drivers/scsi/libsas/sas_scsi_host.c |  421 +++++++++----------
 drivers/scsi/mvsas/mv_init.c        |    2 -
 drivers/scsi/mvsas/mv_sas.c         |   31 +-
 drivers/scsi/pm8001/pm8001_init.c   |    2 -
 drivers/scsi/pm8001/pm8001_sas.c    |   59 ++--
 drivers/scsi/scsi_transport_sas.c   |   60 +++-
 include/linux/libata.h              |    2 +
 include/scsi/libsas.h               |   86 +++--
 include/scsi/sas.h                  |    4 +-
 include/scsi/sas_ata.h              |   44 ++-
 include/scsi/scsi_transport_sas.h   |   12 +-
 43 files changed, 1859 insertions(+), 1621 deletions(-)

* [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
@ 2012-02-10  8:44 ` Dan Williams
  2012-02-10 18:32   ` Jeff Garzik
  2012-02-10 19:09   ` Dan Williams
  2012-02-10  8:44 ` [PATCH v8 02/13] libsas: improve debug statements Dan Williams
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:44 UTC
  To: linux-scsi; +Cc: linux-ide, Marcin Tomczak

The ata_port lifetime in libata follows the host.  In libsas it follows
the scsi_target.  Once scsi_remove_device() has caused all commands to be
completed, scsi_remove_target() can immediately proceed to freeing the
ata_port, causing bug reports like:

[  848.393333] BUG: spinlock bad magic on CPU#4, kworker/u:2/5107
[  848.400262] general protection fault: 0000 [#1] SMP
[  848.406244] CPU 4
[  848.408310] Modules linked in: nls_utf8 ipv6 uinput i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca sg sd_mod sr_mod cdrom ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
[  848.432060]
[  848.434137] Pid: 5107, comm: kworker/u:2 Not tainted 3.2.0-isci+ #8 Intel Corporation S2600CP/S2600CP
[  848.445310] RIP: 0010:[<ffffffff8126a68c>]  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
[  848.454787] RSP: 0018:ffff8807f868dca0  EFLAGS: 00010002
[  848.461137] RAX: 0000000000000048 RBX: ffff8807fe86a630 RCX: ffffffff817d0be0
[  848.469520] RDX: 0000000000000000 RSI: ffffffff814af1cf RDI: 0000000000000002
[  848.477959] RBP: ffff8807f868dcb0 R08: 00000000ffffffff R09: 000000006b6b6b6b
[  848.486327] R10: 000000000003fb8c R11: ffffffff81a19448 R12: 6b6b6b6b6b6b6b6b
[  848.494699] R13: ffff8808027dc520 R14: 0000000000000000 R15: 000000000000001e
[  848.503067] FS:  0000000000000000(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000
[  848.512899] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  848.519710] CR2: 00007ff77d001000 CR3: 00000007f7a5d000 CR4: 00000000000406e0
[  848.528072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  848.536446] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  848.544831] Process kworker/u:2 (pid: 5107, threadinfo ffff8807f868c000, task ffff8807ff348000)
[  848.555327] Stack:
[  848.557959]  ffff8807fe86a630 ffff8807fe86a630 ffff8807f868dcd0 ffffffff8126a6e0
[  848.567072]  ffffffff817c142f ffff8807fe86a630 ffff8807f868dcf0 ffffffff8126a703
[  848.576190]  ffff8808027dc520 0000000000000286 ffff8807f868dd10 ffffffff814af1bb
[  848.585281] Call Trace:
[  848.588409]  [<ffffffff8126a6e0>] spin_bug+0x26/0x28
[  848.594357]  [<ffffffff8126a703>] do_raw_spin_unlock+0x21/0x88
[  848.601283]  [<ffffffff814af1bb>] _raw_spin_unlock_irqrestore+0x2c/0x65
[  848.609089]  [<ffffffffa001c103>] ata_scsi_port_error_handler+0x548/0x557 [libata]
[  848.618331]  [<ffffffff81061813>] ? async_schedule+0x17/0x17
[  848.625060]  [<ffffffffa004f30f>] async_sas_ata_eh+0x45/0x69 [libsas]
[  848.632655]  [<ffffffff810618aa>] async_run_entry_fn+0x97/0x125
[  848.639670]  [<ffffffff81057439>] process_one_work+0x207/0x38d
[  848.646577]  [<ffffffff8105738c>] ? process_one_work+0x15a/0x38d
[  848.653681]  [<ffffffff810576f7>] worker_thread+0x138/0x21c
[  848.660305]  [<ffffffff810575bf>] ? process_one_work+0x38d/0x38d
[  848.667493]  [<ffffffff8105b098>] kthread+0x9d/0xa5
[  848.673382]  [<ffffffff8106e1bd>] ? trace_hardirqs_on_caller+0x12f/0x166
[  848.681304]  [<ffffffff814b7704>] kernel_thread_helper+0x4/0x10
[  848.688324]  [<ffffffff814af534>] ? retint_restore_args+0x13/0x13
[  848.695530]  [<ffffffff8105affb>] ? __init_kthread_worker+0x5b/0x5b
[  848.702929]  [<ffffffff814b7700>] ? gs_change+0x13/0x13
[  848.709155] Code: 00 00 48 8d 88 38 04 00 00 44 8b 80 84 02 00 00 31 c0 e8 cf 1b 24 00 41 83 c8 ff 44 8b 4b 08 48 c7 c1 e0 0b 7d 81 4d 85 e4 74 10 <45> 8b 84 24 84 02 00 00 49 8d 8c 24 38 04 00 00 8b 53 04 48 89
[  848.732467] RIP  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
[  848.738905]  RSP <ffff8807f868dca0>
[  848.743743] ---[ end trace 143161646eee8caa ]---

...so arrange for the ata_port to have the same end of life as the domain
device.
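
To illustrate the lifetime rule, here is a minimal userspace sketch, not
the libsas code.  The names demo_dev, demo_port, demo_dev_get() and
demo_dev_put() are hypothetical stand-ins for domain_device, ata_port,
kref_get() and sas_put_device(), and a plain counter replaces the atomic
kref.  The sketch assumes the handler drops its reference when it
returns; the hunk in this patch only shows the get.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_MAGIC 0xdead4eadu

/* Hypothetical stand-in for struct ata_port. */
struct demo_port {
	unsigned int lock_magic;
};

/* Hypothetical stand-in for struct domain_device. */
struct demo_dev {
	int refcount;		/* plays the role of dev->kref (not atomic here) */
	struct demo_port *port;	/* plays the role of dev->sata_dev.ap */
};

static int demo_ports_freed;

static struct demo_dev *demo_dev_alloc(void)
{
	struct demo_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->port = calloc(1, sizeof(*dev->port));
	if (!dev->port) {
		free(dev);
		return NULL;
	}
	dev->port->lock_magic = DEMO_MAGIC;
	dev->refcount = 1;	/* reference owned by the "target" */
	return dev;
}

static void demo_dev_get(struct demo_dev *dev)
{
	dev->refcount++;
}

/* The port shares the device's end of life: both go on the final put. */
static void demo_dev_put(struct demo_dev *dev)
{
	if (--dev->refcount > 0)
		return;
	free(dev->port);
	demo_ports_freed++;
	free(dev);
}

/*
 * Error handler that pins the device for its whole run.  racing_remove()
 * models target removal dropping its reference while eh is in flight.
 * Returns true if the port was still intact after that removal.
 */
static bool demo_error_handler(struct demo_dev *dev,
			       void (*racing_remove)(struct demo_dev *))
{
	bool port_ok;

	demo_dev_get(dev);
	racing_remove(dev);
	port_ok = dev->port->lock_magic == DEMO_MAGIC;
	demo_dev_put(dev);	/* last reference: port and device freed here */
	return port_ok;
}
```

Calling demo_error_handler(dev, demo_dev_put) on a freshly allocated
device models the race: the port stays valid until the handler's own put,
and it is freed exactly once.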

Reported-by: Marcin Tomczak <marcin.tomczak@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_ata.c       |    4 ++++
 drivers/scsi/libsas/sas_discover.c  |    5 +++++
 drivers/scsi/libsas/sas_scsi_host.c |    3 ---
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 37a9e73..852b1b1 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -661,6 +661,10 @@ static void async_sas_ata_eh(void *data, async_cookie_t cookie)
 	struct ata_port *ap = dev->sata_dev.ap;
 	struct sas_ha_struct *ha = dev->port->ha;
 
+	/* hold a reference over eh since we may be racing with final
+	 * remove once all commands are completed
+	 */
+	kref_get(&dev->kref);
 	ata_port_printk(ap, KERN_DEBUG, "sas eh calling libata port error handler");
 	ata_scsi_port_error_handler(ha->core.shost, ap);
 }
diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index 3ef7741..9367101 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -242,6 +242,11 @@ void sas_free_device(struct kref *kref)
 	if (dev->dev_type == EDGE_DEV || dev->dev_type == FANOUT_DEV)
 		kfree(dev->ex_dev.ex_phy);
 
+	if (dev_is_sata(dev) && dev->sata_dev.ap) {
+		ata_sas_port_destroy(dev->sata_dev.ap);
+		dev->sata_dev.ap = NULL;
+	}
+
 	kfree(dev);
 }
 
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 731c892..b563ff2 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -1028,9 +1028,6 @@ void sas_target_destroy(struct scsi_target *starget)
 	if (!found_dev)
 		return;
 
-	if (dev_is_sata(found_dev))
-		ata_sas_port_destroy(found_dev->sata_dev.ap);
-
 	starget->hostdata = NULL;
 	sas_put_device(found_dev);
 }


* [PATCH v8 02/13] libsas: improve debug statements
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
  2012-02-10  8:44 ` [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race Dan Williams
@ 2012-02-10  8:44 ` Dan Williams
  2012-02-10  8:44 ` [PATCH v8 03/13] libsas: async ata scanning Dan Williams
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:44 UTC
  To: linux-scsi; +Cc: linux-ide, Douglas Gilbert

It's difficult to determine which domain_device is triggering error recovery,
so convert messages like:

  sas: ex 5001b4da000e703f phy08:T attached: 5001b4da000e7028
  sas: ex 5001b4da000e703f phy09:T attached: 5001b4da000e7029
  ...
  ata7: sas eh calling libata port error handler
  ata8: sas eh calling libata port error handler

...into:

  sas: ex 5001517e85cfefff phy05:T:9 attached: 5001517e85cfefe5 (stp)
  sas: ex 5001517e3b0af0bf phy11:T:8 attached: 5001517e3b0af0ab (stp)
  ...
  sas: ata7: end_device-21:1: dev error handler
  sas: ata8: end_device-20:0:5: dev error handler

which shows the attached link rate and device type, and associates a
domain_device with its ata_port id so that these messages can be
correlated with those emitted from libata-eh.

As Doug notes, we can also take the opportunity to clarify expander phy
routing capabilities.
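
The prefixing idea can be tried outside the kernel.  The sketch below is
a userspace analogue only: demo_ata_snprintk() is a hypothetical name,
and it formats into a caller-supplied buffer with snprintf()/vsnprintf()
instead of using printk() with %pV as the patch does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "sas: ata<id>: <device name>: " followed by
 * the caller's formatted message into buf.  Returns the number of
 * characters the full message needs (as snprintf does), or a negative
 * value on a formatting error.
 */
static int demo_ata_snprintk(char *buf, size_t len, unsigned int print_id,
			     const char *dev_name, const char *fmt, ...)
{
	va_list args;
	int n, m;

	n = snprintf(buf, len, "sas: ata%u: %s: ", print_id, dev_name);
	if (n < 0 || (size_t)n >= len)
		return n;	/* error, or prefix alone filled the buffer */

	va_start(args, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, args);
	va_end(args);

	return m < 0 ? m : n + m;
}
```

Every message then carries both the ata port id and the device name, so
a line from this helper can be matched against libata's own "ataN"
output.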

Cc: Douglas Gilbert <dgilbert@interlog.com>
[dgilbert@interlog.com: clarify table2table with 'U']
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_ata.c      |   43 ++++++++++++++++-----
 drivers/scsi/libsas/sas_expander.c |   74 +++++++++++++++++++++++++-----------
 2 files changed, 85 insertions(+), 32 deletions(-)

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 4f5e147..82868d6 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -317,6 +317,28 @@ static int local_ata_check_ready(struct ata_link *link)
 	}
 }
 
+static int sas_ata_printk(const char *level, const struct domain_device *ddev,
+			  const char *fmt, ...)
+{
+	struct ata_port *ap = ddev->sata_dev.ap;
+	struct device *dev = &ddev->rphy->dev;
+	struct va_format vaf;
+	va_list args;
+	int r;
+
+	va_start(args, fmt);
+
+	vaf.fmt = fmt;
+	vaf.va = &args;
+
+	r = printk("%ssas: ata%u: %s: %pV",
+		   level, ap->print_id, dev_name(dev), &vaf);
+
+	va_end(args);
+
+	return r;
+}
+
 static int sas_ata_hard_reset(struct ata_link *link, unsigned int *class,
 			      unsigned long deadline)
 {
@@ -333,7 +355,7 @@ static int sas_ata_hard_reset(struct ata_link *link, unsigned int *class,
 	res = i->dft->lldd_I_T_nexus_reset(dev);
 
 	if (res != TMF_RESP_FUNC_COMPLETE)
-		SAS_DPRINTK("%s: Unable to reset ata device?\n", __func__);
+		sas_ata_printk(KERN_DEBUG, dev, "Unable to reset ata device?\n");
 
 	phy = sas_get_local_phy(dev);
 	if (scsi_is_sas_phy_local(phy))
@@ -344,7 +366,7 @@ static int sas_ata_hard_reset(struct ata_link *link, unsigned int *class,
 
 	ret = ata_wait_after_reset(link, deadline, check_ready);
 	if (ret && ret != -EAGAIN)
-		ata_link_err(link, "COMRESET failed (errno=%d)\n", ret);
+		sas_ata_printk(KERN_ERR, dev, "reset failed (errno=%d)\n", ret);
 
 	/* XXX: if the class changes during the reset the upper layer
 	 * should be informed, if the device has gone away we assume
@@ -665,7 +687,7 @@ static void async_sas_ata_eh(void *data, async_cookie_t cookie)
 	 * remove once all commands are completed
 	 */
 	kref_get(&dev->kref);
-	ata_port_printk(ap, KERN_DEBUG, "sas eh calling libata port error handler");
+	sas_ata_printk(KERN_DEBUG, dev, "dev error handler\n");
 	ata_scsi_port_error_handler(ha->core.shost, ap);
 }
 
@@ -702,26 +724,27 @@ void sas_ata_eh(struct Scsi_Host *shost, struct list_head *work_q,
 		struct list_head *done_q)
 {
 	struct scsi_cmnd *cmd, *n;
-	struct ata_port *ap;
+	struct domain_device *eh_dev;
 
 	do {
 		LIST_HEAD(sata_q);
-
-		ap = NULL;
+		eh_dev = NULL;
 
 		list_for_each_entry_safe(cmd, n, work_q, eh_entry) {
 			struct domain_device *ddev = cmd_to_domain_dev(cmd);
 
 			if (!dev_is_sata(ddev) || TO_SAS_TASK(cmd))
 				continue;
-			if (ap && ap != ddev->sata_dev.ap)
+			if (eh_dev && eh_dev != ddev)
 				continue;
-			ap = ddev->sata_dev.ap;
+			eh_dev = ddev;
 			list_move(&cmd->eh_entry, &sata_q);
 		}
 
 		if (!list_empty(&sata_q)) {
-			ata_port_printk(ap, KERN_DEBUG, "sas eh calling libata cmd error handler\n");
+			struct ata_port *ap = eh_dev->sata_dev.ap;
+
+			sas_ata_printk(KERN_DEBUG, eh_dev, "cmd error handler\n");
 			ata_scsi_cmd_error_handler(shost, ap, &sata_q);
 			/*
 			 * ata's error handler may leave the cmd on the list
@@ -737,7 +760,7 @@ void sas_ata_eh(struct Scsi_Host *shost, struct list_head *work_q,
 			while (!list_empty(&sata_q))
 				list_del_init(sata_q.next);
 		}
-	} while (ap);
+	} while (eh_dev);
 }
 
 void sas_ata_schedule_reset(struct domain_device *dev)
diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 68a80a0..4b2ecd3 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -166,7 +166,22 @@ static inline void *alloc_smp_resp(int size)
 	return kzalloc(size, GFP_KERNEL);
 }
 
-/* ---------- Expander configuration ---------- */
+static char sas_route_char(struct domain_device *dev, struct ex_phy *phy)
+{
+	switch (phy->routing_attr) {
+	case TABLE_ROUTING:
+		if (dev->ex_dev.t2t_supp)
+			return 'U';
+		else
+			return 'T';
+	case DIRECT_ROUTING:
+		return 'D';
+	case SUBTRACTIVE_ROUTING:
+		return 'S';
+	default:
+		return '?';
+	}
+}
 
 static void sas_set_ex_phy(struct domain_device *dev, int phy_id,
 			   void *disc_resp)
@@ -176,9 +191,10 @@ static void sas_set_ex_phy(struct domain_device *dev, int phy_id,
 	struct smp_resp *resp = disc_resp;
 	struct discover_resp *dr = &resp->disc;
 	struct sas_rphy *rphy = dev->rphy;
-	int rediscover = (phy->phy != NULL);
+	bool new_phy = !phy->phy;
+	char *type;
 
-	if (!rediscover) {
+	if (new_phy) {
 		phy->phy = sas_phy_alloc(&rphy->dev, phy_id);
 
 		/* FIXME: error_handling */
@@ -223,20 +239,41 @@ static void sas_set_ex_phy(struct domain_device *dev, int phy_id,
 	phy->phy->maximum_linkrate = dr->pmax_linkrate;
 	phy->phy->negotiated_linkrate = phy->linkrate;
 
-	if (!rediscover)
+	if (new_phy)
 		if (sas_phy_add(phy->phy)) {
 			sas_phy_free(phy->phy);
 			return;
 		}
 
-	SAS_DPRINTK("ex %016llx phy%02d:%c attached: %016llx\n",
-		    SAS_ADDR(dev->sas_addr), phy->phy_id,
-		    phy->routing_attr == TABLE_ROUTING ? 'T' :
-		    phy->routing_attr == DIRECT_ROUTING ? 'D' :
-		    phy->routing_attr == SUBTRACTIVE_ROUTING ? 'S' : '?',
-		    SAS_ADDR(phy->attached_sas_addr));
+	switch (phy->attached_dev_type) {
+	case NO_DEVICE:
+		type = "no device";
+		break;
+	case SAS_END_DEV:
+		if (phy->attached_iproto) {
+			if (phy->attached_tproto)
+				type = "host+target";
+			else
+				type = "host";
+		} else {
+			if (dr->attached_sata_dev)
+				type = "stp";
+			else
+				type = "ssp";
+		}
+		break;
+	case EDGE_DEV:
+	case FANOUT_DEV:
+		type = "smp";
+		break;
+	default:
+		type = "unknown";
+	}
 
-	return;
+	SAS_DPRINTK("ex %016llx phy%02d:%c:%X attached: %016llx (%s)\n",
+		    SAS_ADDR(dev->sas_addr), phy->phy_id,
+		    sas_route_char(dev, phy), phy->linkrate,
+		    SAS_ADDR(phy->attached_sas_addr), type);
 }
 
 /* check if we have an existing attached ata device on this expander phy */
@@ -1176,32 +1213,25 @@ static void sas_print_parent_topology_bug(struct domain_device *child,
 						 struct ex_phy *parent_phy,
 						 struct ex_phy *child_phy)
 {
-	static const char ra_char[] = {
-		[DIRECT_ROUTING] = 'D',
-		[SUBTRACTIVE_ROUTING] = 'S',
-		[TABLE_ROUTING] = 'T',
-	};
 	static const char *ex_type[] = {
 		[EDGE_DEV] = "edge",
 		[FANOUT_DEV] = "fanout",
 	};
 	struct domain_device *parent = child->parent;
 
-	sas_printk("%s ex %016llx (T2T supp:%d) phy 0x%x <--> %s ex %016llx "
-		   "(T2T supp:%d) phy 0x%x has %c:%c routing link!\n",
+	sas_printk("%s ex %016llx phy 0x%x <--> %s ex %016llx "
+		   "phy 0x%x has %c:%c routing link!\n",
 
 		   ex_type[parent->dev_type],
 		   SAS_ADDR(parent->sas_addr),
-		   parent->ex_dev.t2t_supp,
 		   parent_phy->phy_id,
 
 		   ex_type[child->dev_type],
 		   SAS_ADDR(child->sas_addr),
-		   child->ex_dev.t2t_supp,
 		   child_phy->phy_id,
 
-		   ra_char[parent_phy->routing_attr],
-		   ra_char[child_phy->routing_attr]);
+		   sas_route_char(parent, parent_phy),
+		   sas_route_char(child, child_phy));
 }
 
 static int sas_check_eeds(struct domain_device *child,


* [PATCH v8 03/13] libsas: async ata scanning
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
  2012-02-10  8:44 ` [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race Dan Williams
  2012-02-10  8:44 ` [PATCH v8 02/13] libsas: improve debug statements Dan Williams
@ 2012-02-10  8:44 ` Dan Williams
  2012-02-10 18:34   ` Jeff Garzik
  2012-02-10  8:45 ` [PATCH v8 04/13] libsas: set attached device type and target protocols for local phys Dan Williams
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:44 UTC
  To: linux-scsi; +Cc: linux-ide, Jack Wang, Luben Tuikov, Xiangliang Yu

libsas ata error handling is already async but this does not help the
scan case.  Move initial link recovery out from under host->scan_mutex,
and delay synchronization with eh until after all port probe/recovery
work has been queued.

Device order still matches scan order because sas_rphy_add() is still
called in order of domain discovery.

Since we now scan the domain list when invoking libata-eh, we need to be
careful to check for fully initialized ata ports.
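
The "fully initialized" check is a publish/consume pattern.  As a rough
userspace analogue, assuming C11 atomics in place of the kernel's
smp_wmb()/smp_rmb() pairing, it could look like the sketch below.  All
demo_* names are hypothetical, and the release/acquire mapping is an
approximation of the barriers rather than an exact equivalent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ata_port. */
struct demo_port {
	void *private_data;
	int cable_type;
};

/* Hypothetical stand-in for struct domain_device. */
struct demo_dev {
	bool is_sata;
	_Atomic(struct demo_port *) ap;	/* NULL until the port is set up */
};

static void demo_dev_init(struct demo_dev *dev, bool is_sata)
{
	dev->is_sata = is_sata;
	atomic_init(&dev->ap, NULL);
}

/*
 * Producer: fill in the port first, then publish the pointer.  The
 * release store keeps the field writes ordered before the pointer
 * becomes visible.
 */
static void demo_publish_port(struct demo_dev *dev, struct demo_port *ap)
{
	ap->private_data = dev;
	ap->cable_type = 1;
	atomic_store_explicit(&dev->ap, ap, memory_order_release);
}

/*
 * Consumer: returns the port only for sata devices whose port has been
 * published.  The acquire load orders later reads of the port's fields
 * after the pointer load.
 */
static struct demo_port *demo_eh_port(struct demo_dev *dev)
{
	if (!dev->is_sata)
		return NULL;
	return atomic_load_explicit(&dev->ap, memory_order_acquire);
}
```

A walker over the domain list can skip any device for which
demo_eh_port() returns NULL, which covers both non-sata devices and sata
devices whose port is not yet published.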

Cc: Xiangliang Yu <yuxiangl@marvell.com>
Cc: Luben Tuikov <ltuikov@yahoo.com>
Acked-by: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/ata/libata-core.c           |   34 +++++++++-------
 drivers/ata/libata-scsi.c           |   13 ++++++
 drivers/ata/libata.h                |    1 
 drivers/scsi/aic94xx/aic94xx_init.c |    1 
 drivers/scsi/isci/init.c            |    1 
 drivers/scsi/libsas/sas_ata.c       |   74 ++++++++++++++++++++++++++++++-----
 drivers/scsi/libsas/sas_discover.c  |   22 +++++-----
 drivers/scsi/libsas/sas_internal.h  |    9 ++++
 drivers/scsi/libsas/sas_scsi_host.c |   18 ---------
 drivers/scsi/mvsas/mv_init.c        |    1 
 drivers/scsi/pm8001/pm8001_init.c   |    1 
 include/linux/libata.h              |    1 
 include/scsi/libsas.h               |    1 
 include/scsi/sas_ata.h              |   12 +++---
 14 files changed, 123 insertions(+), 66 deletions(-)

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index c04ad68..1654c94 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5874,29 +5874,31 @@ void ata_host_init(struct ata_host *host, struct device *dev,
 	host->ops = ops;
 }
 
-int ata_port_probe(struct ata_port *ap)
+void __ata_port_probe(struct ata_port *ap)
 {
-	int rc = 0;
+	struct ata_eh_info *ehi = &ap->link.eh_info;
+	unsigned long flags;
 
-	/* probe */
-	if (ap->ops->error_handler) {
-		struct ata_eh_info *ehi = &ap->link.eh_info;
-		unsigned long flags;
+	/* kick EH for boot probing */
+	spin_lock_irqsave(ap->lock, flags);
 
-		/* kick EH for boot probing */
-		spin_lock_irqsave(ap->lock, flags);
+	ehi->probe_mask |= ATA_ALL_DEVICES;
+	ehi->action |= ATA_EH_RESET;
+	ehi->flags |= ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET;
 
-		ehi->probe_mask |= ATA_ALL_DEVICES;
-		ehi->action |= ATA_EH_RESET;
-		ehi->flags |= ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET;
+	ap->pflags &= ~ATA_PFLAG_INITIALIZING;
+	ap->pflags |= ATA_PFLAG_LOADING;
+	ata_port_schedule_eh(ap);
 
-		ap->pflags &= ~ATA_PFLAG_INITIALIZING;
-		ap->pflags |= ATA_PFLAG_LOADING;
-		ata_port_schedule_eh(ap);
+	spin_unlock_irqrestore(ap->lock, flags);
+}
 
-		spin_unlock_irqrestore(ap->lock, flags);
+int ata_port_probe(struct ata_port *ap)
+{
+	int rc = 0;
 
-		/* wait for EH to finish */
+	if (ap->ops->error_handler) {
+		__ata_port_probe(ap);
 		ata_port_wait_eh(ap);
 	} else {
 		DPRINTK("ata%u: bus probe begin\n", ap->print_id);
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 2a5412e..faf8730 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -3837,6 +3837,19 @@ void ata_sas_port_stop(struct ata_port *ap)
 }
 EXPORT_SYMBOL_GPL(ata_sas_port_stop);
 
+int ata_sas_async_port_init(struct ata_port *ap)
+{
+	int rc = ap->ops->port_start(ap);
+
+	if (!rc) {
+		ap->print_id = ata_print_id++;
+		__ata_port_probe(ap);
+	}
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(ata_sas_async_port_init);
+
 /**
  *	ata_sas_port_init - Initialize a SATA device
  *	@ap: SATA port to initialize
diff --git a/drivers/ata/libata.h b/drivers/ata/libata.h
index 78c356d..2a2aa51 100644
--- a/drivers/ata/libata.h
+++ b/drivers/ata/libata.h
@@ -104,6 +104,7 @@ extern int ata_cmd_ioctl(struct scsi_device *scsidev, void __user *arg);
 extern struct ata_port *ata_port_alloc(struct ata_host *host);
 extern const char *sata_spd_string(unsigned int spd);
 extern int ata_port_probe(struct ata_port *ap);
+extern void __ata_port_probe(struct ata_port *ap);
 
 /* libata-acpi.c */
 #ifdef CONFIG_ATA_ACPI
diff --git a/drivers/scsi/aic94xx/aic94xx_init.c b/drivers/scsi/aic94xx/aic94xx_init.c
index eea988a..ff80552 100644
--- a/drivers/scsi/aic94xx/aic94xx_init.c
+++ b/drivers/scsi/aic94xx/aic94xx_init.c
@@ -81,7 +81,6 @@ static struct scsi_host_template aic94xx_sht = {
 	.use_clustering		= ENABLE_CLUSTERING,
 	.eh_device_reset_handler	= sas_eh_device_reset_handler,
 	.eh_bus_reset_handler	= sas_eh_bus_reset_handler,
-	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 };
diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index 437f76b..9a28270 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -157,7 +157,6 @@ static struct scsi_host_template isci_sht = {
 	.sg_tablesize			= SG_ALL,
 	.max_sectors			= SCSI_DEFAULT_MAX_SECTORS,
 	.use_clustering			= ENABLE_CLUSTERING,
-	.slave_alloc			= sas_slave_alloc,
 	.target_destroy			= sas_target_destroy,
 	.ioctl				= sas_ioctl,
 	.shost_attrs			= isci_host_attrs,
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 94880f2..43f6ba7 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -585,11 +585,10 @@ static struct ata_port_info sata_port_info = {
 	.port_ops = &sas_sata_ops
 };
 
-int sas_ata_init_host_and_port(struct domain_device *found_dev,
-			       struct scsi_target *starget)
+int sas_ata_init_host_and_port(struct domain_device *found_dev)
 {
-	struct Scsi_Host *shost = dev_to_shost(&starget->dev);
-	struct sas_ha_struct *ha = SHOST_TO_SAS_HA(shost);
+	struct sas_ha_struct *ha = found_dev->port->ha;
+	struct Scsi_Host *shost = ha->core.shost;
 	struct ata_port *ap;
 
 	ata_host_init(&found_dev->sata_dev.ata_host,
@@ -607,6 +606,8 @@ int sas_ata_init_host_and_port(struct domain_device *found_dev,
 	ap->private_data = found_dev;
 	ap->cbl = ATA_CBL_SATA;
 	ap->scsi_host = shost;
+	/* publish initialized ata port */
+	smp_wmb();
 	found_dev->sata_dev.ap = ap;
 
 	return 0;
@@ -683,6 +684,38 @@ static void sas_get_ata_command_set(struct domain_device *dev)
 		dev->sata_dev.command_set = ATAPI_COMMAND_SET;
 }
 
+void sas_probe_sata(struct asd_sas_port *port)
+{
+	struct domain_device *dev, *n;
+	int err;
+
+	mutex_lock(&port->ha->disco_mutex);
+	list_for_each_entry_safe(dev, n, &port->disco_list, disco_list_node) {
+		if (!dev_is_sata(dev))
+			continue;
+
+		err = sas_ata_init_host_and_port(dev);
+		if (err)
+			sas_fail_probe(dev, __func__, err);
+		else
+			ata_sas_async_port_init(dev->sata_dev.ap);
+	}
+	mutex_unlock(&port->ha->disco_mutex);
+
+	list_for_each_entry_safe(dev, n, &port->disco_list, disco_list_node) {
+		if (!dev_is_sata(dev))
+			continue;
+
+		sas_ata_wait_eh(dev);
+
+		/* if libata could not bring the link up, don't surface
+		 * the device
+		 */
+		if (ata_dev_disabled(sas_to_ata_dev(dev)))
+			sas_fail_probe(dev, __func__, -ENODEV);
+	}
+}
+
 /**
  * sas_discover_sata -- discover an STP/SATA domain device
  * @dev: pointer to struct domain_device of interest
@@ -723,11 +756,23 @@ static void async_sas_ata_eh(void *data, async_cookie_t cookie)
 	ata_scsi_port_error_handler(ha->core.shost, ap);
 }
 
+static bool sas_ata_dev_eh_valid(struct domain_device *dev)
+{
+	struct ata_port *ap;
+
+	if (!dev_is_sata(dev))
+		return false;
+	ap = dev->sata_dev.ap;
+	/* consume fully initialized ata ports */
+	smp_rmb();
+	return !!ap;
+}
+
 void sas_ata_strategy_handler(struct Scsi_Host *shost)
 {
-	struct scsi_device *sdev;
 	struct sas_ha_struct *sas_ha = SHOST_TO_SAS_HA(shost);
 	LIST_HEAD(async);
+	int i;
 
 	/* it's ok to defer revalidation events during ata eh, these
 	 * disks are in one of three states:
@@ -739,14 +784,21 @@ void sas_ata_strategy_handler(struct Scsi_Host *shost)
 	 */
 	sas_disable_revalidation(sas_ha);
 
-	shost_for_each_device(sdev, shost) {
-		struct domain_device *ddev = sdev_to_domain_dev(sdev);
-
-		if (!dev_is_sata(ddev))
-			continue;
+	spin_lock_irq(&sas_ha->phy_port_lock);
+	for (i = 0; i < sas_ha->num_phys; i++) {
+		struct asd_sas_port *port = sas_ha->sas_port[i];
+		struct domain_device *dev;
 
-		async_schedule_domain(async_sas_ata_eh, ddev, &async);
+		spin_lock(&port->dev_list_lock);
+		list_for_each_entry(dev, &port->dev_list, dev_list_node) {
+			if (!sas_ata_dev_eh_valid(dev))
+				continue;
+			async_schedule_domain(async_sas_ata_eh, dev, &async);
+		}
+		spin_unlock(&port->dev_list_lock);
 	}
+	spin_unlock_irq(&sas_ha->phy_port_lock);
+
 	async_synchronize_full_domain(&async);
 
 	sas_enable_revalidation(sas_ha);
diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index f60b602..ed3f8c0 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -207,22 +207,22 @@ static void sas_probe_devices(struct work_struct *work)
 
 	clear_bit(DISCE_PROBE, &port->disc.pending);
 
-	list_for_each_entry_safe(dev, n, &port->disco_list, disco_list_node) {
-		int err;
-
+	/* devices must be domain members before link recovery and probe */
+	list_for_each_entry(dev, &port->disco_list, disco_list_node) {
 		spin_lock_irq(&port->dev_list_lock);
 		list_add_tail(&dev->dev_list_node, &port->dev_list);
 		spin_unlock_irq(&port->dev_list_lock);
+	}
 
-		err = sas_rphy_add(dev->rphy);
+	sas_probe_sata(port);
 
-		if (err) {
-			SAS_DPRINTK("%s: for %s device %16llx returned %d\n",
-				    __func__, dev->parent ? "exp-attached" :
-							    "direct-attached",
-				    SAS_ADDR(dev->sas_addr), err);
-			sas_unregister_dev(port, dev);
-		} else
+	list_for_each_entry_safe(dev, n, &port->disco_list, disco_list_node) {
+		int err;
+
+		err = sas_rphy_add(dev->rphy);
+		if (err)
+			sas_fail_probe(dev, __func__, err);
+		else
 			list_del_init(&dev->disco_list_node);
 	}
 }
diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
index e028d7a..d0d9bf1 100644
--- a/drivers/scsi/libsas/sas_internal.h
+++ b/drivers/scsi/libsas/sas_internal.h
@@ -113,6 +113,15 @@ static inline int sas_smp_host_handler(struct Scsi_Host *shost,
 }
 #endif
 
+static inline void sas_fail_probe(struct domain_device *dev, const char *func, int err)
+{
+	SAS_DPRINTK("%s: for %s device %16llx returned %d\n",
+		    func, dev->parent ? "exp-attached" :
+					    "direct-attached",
+		    SAS_ADDR(dev->sas_addr), err);
+	sas_unregister_dev(dev->port, dev);
+}
+
 static inline void sas_fill_in_rphy(struct domain_device *dev,
 				    struct sas_rphy *rphy)
 {
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index e58ca50..3701ff7 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -762,17 +762,10 @@ int sas_target_alloc(struct scsi_target *starget)
 {
 	struct sas_rphy *rphy = dev_to_rphy(starget->dev.parent);
 	struct domain_device *found_dev = sas_find_dev_by_rphy(rphy);
-	int res;
 
 	if (!found_dev)
 		return -ENODEV;
 
-	if (dev_is_sata(found_dev)) {
-		res = sas_ata_init_host_and_port(found_dev, starget);
-		if (res)
-			return res;
-	}
-
 	kref_get(&found_dev->kref);
 	starget->hostdata = found_dev;
 	return 0;
@@ -1012,16 +1005,6 @@ void sas_task_abort(struct sas_task *task)
 	}
 }
 
-int sas_slave_alloc(struct scsi_device *scsi_dev)
-{
-	struct domain_device *dev = sdev_to_domain_dev(scsi_dev);
-
-	if (dev_is_sata(dev))
-		return ata_sas_port_init(dev->sata_dev.ap);
-
-	return 0;
-}
-
 void sas_target_destroy(struct scsi_target *starget)
 {
 	struct domain_device *found_dev = starget->hostdata;
@@ -1082,6 +1065,5 @@ EXPORT_SYMBOL_GPL(sas_task_abort);
 EXPORT_SYMBOL_GPL(sas_phy_reset);
 EXPORT_SYMBOL_GPL(sas_eh_device_reset_handler);
 EXPORT_SYMBOL_GPL(sas_eh_bus_reset_handler);
-EXPORT_SYMBOL_GPL(sas_slave_alloc);
 EXPORT_SYMBOL_GPL(sas_target_destroy);
 EXPORT_SYMBOL_GPL(sas_ioctl);
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c
index d45878b..cc59dff 100644
--- a/drivers/scsi/mvsas/mv_init.c
+++ b/drivers/scsi/mvsas/mv_init.c
@@ -73,7 +73,6 @@ static struct scsi_host_template mvs_sht = {
 	.use_clustering		= ENABLE_CLUSTERING,
 	.eh_device_reset_handler = sas_eh_device_reset_handler,
 	.eh_bus_reset_handler	= sas_eh_bus_reset_handler,
-	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 	.shost_attrs		= mvst_host_attrs,
diff --git a/drivers/scsi/pm8001/pm8001_init.c b/drivers/scsi/pm8001/pm8001_init.c
index bd165ea..36efaa7 100644
--- a/drivers/scsi/pm8001/pm8001_init.c
+++ b/drivers/scsi/pm8001/pm8001_init.c
@@ -75,7 +75,6 @@ static struct scsi_host_template pm8001_sht = {
 	.use_clustering		= ENABLE_CLUSTERING,
 	.eh_device_reset_handler = sas_eh_device_reset_handler,
 	.eh_bus_reset_handler	= sas_eh_bus_reset_handler,
-	.slave_alloc		= sas_slave_alloc,
 	.target_destroy		= sas_target_destroy,
 	.ioctl			= sas_ioctl,
 	.shost_attrs		= pm8001_host_attrs,
diff --git a/include/linux/libata.h b/include/linux/libata.h
index aa42704..42378d6 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -996,6 +996,7 @@ extern int ata_sas_scsi_ioctl(struct ata_port *ap, struct scsi_device *dev,
 extern void ata_sas_port_destroy(struct ata_port *);
 extern struct ata_port *ata_sas_port_alloc(struct ata_host *,
 					   struct ata_port_info *, struct Scsi_Host *);
+extern int ata_sas_async_port_init(struct ata_port *);
 extern int ata_sas_port_init(struct ata_port *);
 extern int ata_sas_port_start(struct ata_port *ap);
 extern void ata_sas_port_stop(struct ata_port *ap);
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 4a42be3..20153d5 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -646,7 +646,6 @@ int sas_phy_reset(struct sas_phy *phy, int hard_reset);
 int sas_queue_up(struct sas_task *task);
 extern int sas_queuecommand(struct Scsi_Host * ,struct scsi_cmnd *);
 extern int sas_target_alloc(struct scsi_target *);
-extern int sas_slave_alloc(struct scsi_device *);
 extern int sas_slave_configure(struct scsi_device *);
 extern int sas_change_queue_depth(struct scsi_device *, int new_depth,
 				  int reason);
diff --git a/include/scsi/sas_ata.h b/include/scsi/sas_ata.h
index 1556eff..cdccd2e 100644
--- a/include/scsi/sas_ata.h
+++ b/include/scsi/sas_ata.h
@@ -37,15 +37,14 @@ static inline int dev_is_sata(struct domain_device *dev)
 }
 
 int sas_get_ata_info(struct domain_device *dev, struct ex_phy *phy);
-int sas_ata_init_host_and_port(struct domain_device *found_dev,
-			       struct scsi_target *starget);
-
+int sas_ata_init_host_and_port(struct domain_device *found_dev);
 void sas_ata_task_abort(struct sas_task *task);
 void sas_ata_strategy_handler(struct Scsi_Host *shost);
 void sas_ata_eh(struct Scsi_Host *shost, struct list_head *work_q,
 		struct list_head *done_q);
 void sas_ata_schedule_reset(struct domain_device *dev);
 void sas_ata_wait_eh(struct domain_device *dev);
+void sas_probe_sata(struct asd_sas_port *port);
 #else
 
 
@@ -53,8 +52,7 @@ static inline int dev_is_sata(struct domain_device *dev)
 {
 	return 0;
 }
-static inline int sas_ata_init_host_and_port(struct domain_device *found_dev,
-			       struct scsi_target *starget)
+static inline int sas_ata_init_host_and_port(struct domain_device *found_dev)
 {
 	return 0;
 }
@@ -79,6 +77,10 @@ static inline void sas_ata_wait_eh(struct domain_device *dev)
 {
 }
 
+static inline void sas_probe_sata(struct asd_sas_port *port)
+{
+}
+
 static inline int sas_get_ata_info(struct domain_device *dev, struct ex_phy *phy)
 {
 	return 0;



* [PATCH v8 04/13] libsas: set attached device type and target protocols for local phys
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (2 preceding siblings ...)
  2012-02-10  8:44 ` [PATCH v8 03/13] libsas: async ata scanning Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-10  8:45 ` [PATCH v8 05/13] libsas: fixup target_port_protocols for expanders that don't report sata Dan Williams
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide

Before:
$ cat /sys/class/sas_phy/phy-6\:3/device_type
none
$ cat /sys/class/sas_phy/phy-6\:3/target_port_protocols
none

After:
$ cat /sys/class/sas_phy/phy-6\:3/device_type
end device
$ cat /sys/class/sas_phy/phy-6\:3/target_port_protocols
sata
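
The change in output above follows from filling in the phy identify data
whenever a phy joins or leaves a port.  A minimal userspace sketch of that
mapping, assuming simplified stand-in types (every model_* name and
constant below is hypothetical, not a kernel definition):

```c
#include <stddef.h>

/* Stand-in device types; the kernel uses its own enum. */
enum model_dev_type {
	MODEL_PHY_UNUSED,
	MODEL_END_DEVICE,
	MODEL_EDGE_EXPANDER,
	MODEL_SATA_DEV,
};

/* Stand-in protocol bits, chosen only for illustration. */
#define MODEL_PROTO_SATA 0x01
#define MODEL_PROTO_SSP  0x08

struct model_identify {
	enum model_dev_type device_type;
	unsigned int target_port_protocols;
};

struct model_domain_device {
	enum model_dev_type dev_type;
	unsigned int tproto;
};

static int model_dev_is_sata(const struct model_domain_device *dev)
{
	return dev->dev_type == MODEL_SATA_DEV;
}

/* Mirror the attached device into the phy identify data.  A NULL
 * device means the phy has left its port, so the fields are reset.
 */
static void model_phy_set_target(struct model_identify *id,
				 const struct model_domain_device *dev)
{
	if (dev) {
		/* sata devices are reported to userspace as end devices */
		if (model_dev_is_sata(dev))
			id->device_type = MODEL_END_DEVICE;
		else
			id->device_type = dev->dev_type;
		id->target_port_protocols = dev->tproto;
	} else {
		id->device_type = MODEL_PHY_UNUSED;
		id->target_port_protocols = 0;
	}
}
```

Calling model_phy_set_target() with a sata device and then with NULL walks
the identify data through the "After" and "Before" states respectively.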

Also downgrade the phy_list_lock to _irq instead of _irqsave since
libsas will never call sas_get_port_device with interrupts disabled.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_discover.c |   12 ++++++++----
 drivers/scsi/libsas/sas_internal.h |   17 +++++++++++++++++
 drivers/scsi/libsas/sas_port.c     |    2 ++
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index ed3f8c0..7a3ae48 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -69,7 +69,6 @@ void sas_init_dev(struct domain_device *dev)
  */
 static int sas_get_port_device(struct asd_sas_port *port)
 {
-	unsigned long flags;
 	struct asd_sas_phy *phy;
 	struct sas_rphy *rphy;
 	struct domain_device *dev;
@@ -78,9 +77,9 @@ static int sas_get_port_device(struct asd_sas_port *port)
 	if (!dev)
 		return -ENOMEM;
 
-	spin_lock_irqsave(&port->phy_list_lock, flags);
+	spin_lock_irq(&port->phy_list_lock);
 	if (list_empty(&port->phy_list)) {
-		spin_unlock_irqrestore(&port->phy_list_lock, flags);
+		spin_unlock_irq(&port->phy_list_lock);
 		sas_put_device(dev);
 		return -ENODEV;
 	}
@@ -89,7 +88,7 @@ static int sas_get_port_device(struct asd_sas_port *port)
 	memcpy(dev->frame_rcvd, phy->frame_rcvd, min(sizeof(dev->frame_rcvd),
 					     (size_t)phy->frame_rcvd_size));
 	spin_unlock(&phy->frame_rcvd_lock);
-	spin_unlock_irqrestore(&port->phy_list_lock, flags);
+	spin_unlock_irq(&port->phy_list_lock);
 
 	if (dev->frame_rcvd[0] == 0x34 && port->oob_mode == SATA_OOB_MODE) {
 		struct dev_to_host_fis *fis =
@@ -134,6 +133,11 @@ static int sas_get_port_device(struct asd_sas_port *port)
 		sas_put_device(dev);
 		return -ENODEV;
 	}
+
+	spin_lock_irq(&port->phy_list_lock);
+	list_for_each_entry(phy, &port->phy_list, port_phy_el)
+		sas_phy_set_target(phy, dev);
+	spin_unlock_irq(&port->phy_list_lock);
 	rphy->identify.phy_identifier = phy->phy->identify.phy_identifier;
 	memcpy(dev->sas_addr, port->attached_sas_addr, SAS_ADDR_SIZE);
 	sas_fill_in_rphy(dev, rphy);
diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
index d0d9bf1..f05c638 100644
--- a/drivers/scsi/libsas/sas_internal.h
+++ b/drivers/scsi/libsas/sas_internal.h
@@ -30,6 +30,7 @@
 #include <scsi/scsi_host.h>
 #include <scsi/scsi_transport_sas.h>
 #include <scsi/libsas.h>
+#include <scsi/sas_ata.h>
 
 #define sas_printk(fmt, ...) printk(KERN_NOTICE "sas: " fmt, ## __VA_ARGS__)
 
@@ -147,6 +148,22 @@ static inline void sas_fill_in_rphy(struct domain_device *dev,
 	}
 }
 
+static inline void sas_phy_set_target(struct asd_sas_phy *p, struct domain_device *dev)
+{
+	struct sas_phy *phy = p->phy;
+
+	if (dev) {
+		if (dev_is_sata(dev))
+			phy->identify.device_type = SAS_END_DEVICE;
+		else
+			phy->identify.device_type = dev->dev_type;
+		phy->identify.target_port_protocols = dev->tproto;
+	} else {
+		phy->identify.device_type = SAS_PHY_UNUSED;
+		phy->identify.target_port_protocols = 0;
+	}
+}
+
 static inline void sas_add_parent_port(struct domain_device *dev, int phy_id)
 {
 	struct expander_device *ex = &dev->ex_dev;
diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
index 29f8fd3..eb19c01 100644
--- a/drivers/scsi/libsas/sas_port.c
+++ b/drivers/scsi/libsas/sas_port.c
@@ -104,6 +104,7 @@ static void sas_form_port(struct asd_sas_phy *phy)
 
 	/* add the phy to the port */
 	list_add_tail(&phy->port_phy_el, &port->phy_list);
+	sas_phy_set_target(phy, port->port_dev);
 	phy->port = port;
 	port->num_phys++;
 	port->phy_mask |= (1U << phy->id);
@@ -182,6 +183,7 @@ void sas_deform_port(struct asd_sas_phy *phy, int gone)
 	spin_lock(&port->phy_list_lock);
 
 	list_del_init(&phy->port_phy_el);
+	sas_phy_set_target(phy, NULL);
 	phy->port = NULL;
 	port->num_phys--;
 	port->phy_mask &= ~(1U << phy->id);



* [PATCH v8 05/13] libsas: fixup target_port_protocols for expanders that don't report sata
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (3 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 04/13] libsas: set attached device type and target protocols for local phys Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-10  8:45 ` [PATCH v8 06/13] libsas: don't recover end devices attached to disabled phys Dan Williams
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide

If discovery returns 0 for target_port_protocols but shows an attached
sata device, just report SAS_PROTOCOL_SATA in the identify data so
userspace can reliably search for sata devices in the domain.
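
A minimal userspace sketch of the fallback, assuming a simplified discover
response (the model_* names and the protocol bit value are hypothetical
stand-ins, not the kernel definitions):

```c
/* Stand-in protocol bit, chosen only for illustration. */
#define MODEL_PROTO_SATA 0x01

/* The two discover response fields the fallback looks at. */
struct model_discover_resp {
	unsigned int attached_tproto;	/* protocols the expander reported */
	int attached_sata_dev;		/* expander saw a sata device */
};

/* Return the target protocols to export to userspace: trust the
 * expander unless it reports nothing while a sata device is attached.
 */
static unsigned int model_reported_tproto(const struct model_discover_resp *dr)
{
	if (!dr->attached_tproto && dr->attached_sata_dev)
		return MODEL_PROTO_SATA;
	return dr->attached_tproto;
}
```

A non-zero attached_tproto is always passed through unchanged, so the
fallback only affects expanders that report nothing at all.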

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_expander.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 14e3244..05acd9e 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -251,6 +251,8 @@ static void sas_set_ex_phy(struct domain_device *dev, int phy_id, void *rsp)
 	phy->phy->identify.device_type = dr->attached_dev_type;
 	phy->phy->identify.initiator_port_protocols = phy->attached_iproto;
 	phy->phy->identify.target_port_protocols = phy->attached_tproto;
+	if (!phy->attached_tproto && dr->attached_sata_dev)
+		phy->phy->identify.target_port_protocols = SAS_PROTOCOL_SATA;
 	phy->phy->identify.phy_identifier = phy_id;
 	phy->phy->minimum_linkrate_hw = dr->hmin_linkrate;
 	phy->phy->maximum_linkrate_hw = dr->hmax_linkrate;



* [PATCH v8 06/13] libsas: don't recover end devices attached to disabled phys
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (4 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 05/13] libsas: fixup target_port_protocols for expanders that don't report sata Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-10  8:45 ` [PATCH v8 07/13] scsi_transport_sas: 'enable' phys on reset Dan Williams
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Luben Tuikov

If userspace has decided to disable a phy the kernel should honor that
and not inadvertently re-enable the phy via error recovery.  This is
more straightforward in the sata case where link recovery (via
libata-eh) is separate from sas_task cancelling in libsas-eh.  Teach
libsas to accept -ENODEV as a successful response from I_T_nexus_reset
('successful' in terms of not escalating further).

This is a more comprehensive fix than "libsas: don't recover 'gone'
devices in sas_ata_hard_reset()", as it is no longer sata-specific.

aic94xx does check the return value from sas_phy_reset() so if the phy
is disabled we proceed with clearing the I_T_nexus.
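
The intended behavior can be sketched in userspace as follows, assuming a
simplified phy model (the model_* names and response values are
hypothetical stand-ins for the kernel ones):

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_TMF_COMPLETE 0	/* stand-in for TMF_RESP_FUNC_COMPLETE */
#define MODEL_TMF_FAILED   5	/* stand-in for a failing tmf response */

struct model_phy {
	bool enabled;		/* cleared when userspace disables the phy */
	int resets_issued;	/* counts resets that reached "hardware" */
};

/* Refuse to touch a phy that userspace disabled. */
static int model_phy_reset(struct model_phy *phy)
{
	if (!phy->enabled)
		return -ENODEV;
	phy->resets_issued++;
	return MODEL_TMF_COMPLETE;
}

/* Error recovery treats -ENODEV like a completed reset: there is
 * nothing further to escalate to for a phy that is administratively
 * down.
 */
static bool model_recovery_done(int res)
{
	return res == MODEL_TMF_COMPLETE || res == -ENODEV;
}
```

With a disabled phy, model_phy_reset() issues no reset and
model_recovery_done() still reports that escalation can stop.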

Cc: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/aic94xx/aic94xx_tmf.c  |    2 +-
 drivers/scsi/libsas/sas_ata.c       |    5 ++---
 drivers/scsi/libsas/sas_init.c      |    3 +++
 drivers/scsi/libsas/sas_scsi_host.c |    3 ++-
 4 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/aic94xx/aic94xx_tmf.c b/drivers/scsi/aic94xx/aic94xx_tmf.c
index 50b914f..cf90409 100644
--- a/drivers/scsi/aic94xx/aic94xx_tmf.c
+++ b/drivers/scsi/aic94xx/aic94xx_tmf.c
@@ -192,7 +192,7 @@ int asd_I_T_nexus_reset(struct domain_device *dev)
 	ASD_DPRINTK("sending %s reset to %s\n",
 		    reset_type ? "hard" : "soft", dev_name(&phy->dev));
 	res = sas_phy_reset(phy, reset_type);
-	if (res == TMF_RESP_FUNC_COMPLETE) {
+	if (res == TMF_RESP_FUNC_COMPLETE || res == -ENODEV) {
 		/* wait for the maximum settle time */
 		msleep(500);
 		/* clear all outstanding commands (keep nexus suspended) */
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index befba2c..1b1d4b7 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -407,10 +407,9 @@ static int sas_ata_hard_reset(struct ata_link *link, unsigned int *class,
 	struct domain_device *dev = ap->private_data;
 	struct sas_internal *i = dev_to_sas_internal(dev);
 
-	if (test_bit(SAS_DEV_GONE, &dev->state))
-		return -ENODEV;
-
 	res = i->dft->lldd_I_T_nexus_reset(dev);
+	if (res == -ENODEV)
+		return res;
 
 	if (res != TMF_RESP_FUNC_COMPLETE)
 		sas_ata_printk(KERN_DEBUG, dev, "Unable to reset ata device?\n");
diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
index 09c14ca..120bff6 100644
--- a/drivers/scsi/libsas/sas_init.c
+++ b/drivers/scsi/libsas/sas_init.c
@@ -298,6 +298,9 @@ int sas_phy_reset(struct sas_phy *phy, int hard_reset)
 	int ret;
 	enum phy_func reset_type;
 
+	if (!phy->enabled)
+		return -ENODEV;
+
 	if (hard_reset)
 		reset_type = PHY_FUNC_HARD_RESET;
 	else
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index fd32913..f0b9b7b 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -607,7 +607,8 @@ static void sas_eh_handle_sas_errors(struct Scsi_Host *shost, struct list_head *
 			SAS_DPRINTK("task 0x%p is not at LU: I_T recover\n",
 				    task);
 			tmf_resp = sas_recover_I_T(task->dev);
-			if (tmf_resp == TMF_RESP_FUNC_COMPLETE) {
+			if (tmf_resp == TMF_RESP_FUNC_COMPLETE ||
+			    tmf_resp == -ENODEV) {
 				struct domain_device *dev = task->dev;
 				SAS_DPRINTK("I_T %016llx recovered\n",
 					    SAS_ADDR(task->dev->sas_addr));



* [PATCH v8 07/13] scsi_transport_sas: 'enable' phys on reset
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (5 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 06/13] libsas: don't recover end devices attached to disabled phys Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-10  8:45 ` [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter Dan Williams
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide

If userspace requests a phy reset, treat that as a request for the phy
to be enabled since that is the effect on hardware.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/scsi_transport_sas.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index 7d69a25..f7565fc 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -615,6 +615,7 @@ do_sas_phy_reset(struct device *dev, size_t count, int hard_reset)
 	error = i->f->phy_reset(phy, hard_reset);
 	if (error)
 		return error;
+	phy->enabled = 1;
 	return count;
 };
 



* [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (6 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 07/13] scsi_transport_sas: 'enable' phys on reset Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-29 21:55   ` James Bottomley
  2012-02-10  8:45 ` [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context Dan Williams
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Jack Wang, Luben Tuikov, Xiangliang Yu

It is possible for a host to get "locked out" from talking to sata
devices in the domain if, for example, its sas address changes but the
expander topology has existing affiliations with the old address.  If
the system is booted, userspace can write to
/sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation; however,
if this condition exists for the root device, the module parameter can be
used to promote all ata resets to hard resets.

After the system is booted this state can be cleared via
/sys/module/libsas/parameters/force_hard_reset
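
A minimal userspace sketch of how the parameter changes the reset type,
assuming a plain global in place of the module parameter (the model_*
names are hypothetical):

```c
#include <stdbool.h>

enum model_phy_func {
	MODEL_LINK_RESET,	/* ordinary link reset */
	MODEL_HARD_RESET,	/* also clears sata affiliations */
};

/* Stand-in for the writable module parameter. */
static bool model_force_hard_reset;

/* Pick the reset type: the caller's request wins, and the override
 * promotes every remaining link reset to a hard reset.
 */
static enum model_phy_func model_reset_type(bool hard_reset)
{
	if (hard_reset || model_force_hard_reset)
		return MODEL_HARD_RESET;
	return MODEL_LINK_RESET;
}
```

Clearing model_force_hard_reset again restores link resets for callers
that did not ask for a hard reset, which is the effect of clearing the
parameter after boot.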

Cc: Xiangliang Yu <yuxiangl@marvell.com>
Cc: Luben Tuikov <ltuikov@yahoo.com>
Cc: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 Documentation/kernel-parameters.txt |    6 ++++++
 drivers/scsi/libsas/sas_init.c      |    6 +++++-
 2 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 81c287f..ffefa3b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1283,6 +1283,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			If there are multiple matching configurations changing
 			the same attribute, the last one is used.
 
+	libsas.force_hard_reset=
+			[LIBSAS] Clear SATA affiliations with every reset, for
+			cases where affiliation errors are causing boot
+			failures, otherwise use sysfs hard_reset interface to
+			clear individual phys.
+
 	memblock=debug	[KNL] Enable memblock debug messages.
 
 	load_ramdisk=	[RAM] List of ramdisks to load from floppy
diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
index 120bff6..2fc23d3 100644
--- a/drivers/scsi/libsas/sas_init.c
+++ b/drivers/scsi/libsas/sas_init.c
@@ -293,6 +293,10 @@ static int sas_phy_enable(struct sas_phy *phy, int enable)
 	return ret;
 }
 
+static bool force_hard_reset;
+module_param(force_hard_reset, bool, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(force_hard_reset, "clear sata affiliations on every reset");
+
 int sas_phy_reset(struct sas_phy *phy, int hard_reset)
 {
 	int ret;
@@ -301,7 +305,7 @@ int sas_phy_reset(struct sas_phy *phy, int hard_reset)
 	if (!phy->enabled)
 		return -ENODEV;
 
-	if (hard_reset)
+	if (hard_reset || force_hard_reset)
 		reset_type = PHY_FUNC_HARD_RESET;
 	else
 		reset_type = PHY_FUNC_LINK_RESET;



* [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (7 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-29 22:05   ` James Bottomley
  2012-02-10  8:45 ` [PATCH v8 10/13] libsas: add sas_eh_abort_handler Dan Williams
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide

The strategy handlers may be called in places that are problematic for
libsas (i.e. sata resets outside of domain revalidation filtering /
libata link recovery), or problematic for userspace (non-blocking ioctl
to sleeping reset functions).  However, these routines are also called
for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
as long as we are running in the host's error handler.
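
The context check can be sketched in userspace like this, assuming the
calling task is passed in explicitly rather than read from 'current' (the
model_* names and return codes are hypothetical stand-ins):

```c
#include <stddef.h>

#define MODEL_SUCCESS 0x2002	/* stand-in for the midlayer SUCCESS */
#define MODEL_FAILED  0x2003	/* stand-in for the midlayer FAILED */

struct model_task {
	const char *comm;	/* task name, for illustration only */
};

struct model_host {
	struct model_task *ehandler;	/* the host's error handler task */
};

/* Only the host's own error handler may run the strategy handler;
 * any other caller (an ioctl, for example) is turned away before the
 * reset is attempted.
 */
static int model_eh_device_reset(const struct model_host *host,
				 const struct model_task *caller)
{
	if (caller != host->ehandler)
		return MODEL_FAILED;

	/* a real handler would issue the reset here */
	return MODEL_SUCCESS;
}
```

The comparison is by task identity, so a caller that merely shares a name
with the error handler is still rejected.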

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
 1 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index f0b9b7b..1cabedc 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -463,11 +463,14 @@ EXPORT_SYMBOL_GPL(sas_get_local_phy);
 /* Attempt to send a LUN reset message to a device */
 int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
 {
-	struct domain_device *dev = cmd_to_domain_dev(cmd);
-	struct sas_internal *i =
-		to_sas_internal(dev->port->ha->core.shost->transportt);
-	struct scsi_lun lun;
 	int res;
+	struct scsi_lun lun;
+	struct Scsi_Host *host = cmd->device->host;
+	struct domain_device *dev = cmd_to_domain_dev(cmd);
+	struct sas_internal *i = to_sas_internal(host->transportt);
+
+	if (current != host->ehandler)
+		return FAILED;
 
 	int_to_scsilun(cmd->device->lun, &lun);
 
@@ -486,8 +489,12 @@ int sas_eh_bus_reset_handler(struct scsi_cmnd *cmd)
 {
 	struct domain_device *dev = cmd_to_domain_dev(cmd);
 	struct sas_phy *phy = sas_get_local_phy(dev);
+	struct Scsi_Host *host = cmd->device->host;
 	int res;
 
+	if (current != host->ehandler)
+		return FAILED;
+
 	res = sas_phy_reset(phy, 1);
 	if (res)
 		SAS_DPRINTK("Bus reset of %s failed 0x%x\n",



* [PATCH v8 10/13] libsas: add sas_eh_abort_handler
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (8 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-10  8:45 ` [PATCH v8 11/13] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler Dan Williams
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide

When recovering failed eh-cmnds let the lldd attempt an abort via
scsi_abort_eh_cmnd before escalating.
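
A minimal userspace sketch of the response mapping, assuming the lldd
abort hook is an optional function pointer (the model_* names and numeric
values are hypothetical stand-ins, and the eh context check is omitted):

```c
#include <stddef.h>

#define MODEL_SUCCESS      0x2002	/* stand-in for SUCCESS */
#define MODEL_FAILED       0x2003	/* stand-in for FAILED */
#define MODEL_TMF_COMPLETE 0x00		/* stand-in tmf responses */
#define MODEL_TMF_FAILED   0x05
#define MODEL_TMF_SUCC     0x08

struct model_task {
	int tag;
};

struct model_lldd_ops {
	/* optional: a driver may leave this NULL */
	int (*abort_task)(struct model_task *task);
};

/* Let the driver try an abort; anything other than a "succeeded" or
 * "complete" response makes the midlayer escalate.
 */
static int model_eh_abort(const struct model_lldd_ops *ops,
			  struct model_task *task)
{
	int res;

	if (!ops->abort_task)
		return MODEL_FAILED;

	res = ops->abort_task(task);
	if (res == MODEL_TMF_SUCC || res == MODEL_TMF_COMPLETE)
		return MODEL_SUCCESS;

	return MODEL_FAILED;
}

/* Two toy driver hooks for exercising the mapping. */
static int model_abort_ok(struct model_task *task)
{
	(void)task;
	return MODEL_TMF_COMPLETE;
}

static int model_abort_fail(struct model_task *task)
{
	(void)task;
	return MODEL_TMF_FAILED;
}
```

A driver without an abort hook is treated the same as a failed abort, so
recovery proceeds to the next escalation step in either case.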

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_scsi_host.c |   21 +++++++++++++++++++++
 include/scsi/libsas.h               |    1 +
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 1cabedc..82f7532 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -460,6 +460,27 @@ struct sas_phy *sas_get_local_phy(struct domain_device *dev)
 }
 EXPORT_SYMBOL_GPL(sas_get_local_phy);
 
+int sas_eh_abort_handler(struct scsi_cmnd *cmd)
+{
+	int res;
+	struct sas_task *task = TO_SAS_TASK(cmd);
+	struct Scsi_Host *host = cmd->device->host;
+	struct sas_internal *i = to_sas_internal(host->transportt);
+
+	if (current != host->ehandler)
+		return FAILED;
+
+	if (!i->dft->lldd_abort_task)
+		return FAILED;
+
+	res = i->dft->lldd_abort_task(task);
+	if (res == TMF_RESP_FUNC_SUCC || res == TMF_RESP_FUNC_COMPLETE)
+		return SUCCESS;
+
+	return FAILED;
+}
+EXPORT_SYMBOL_GPL(sas_eh_abort_handler);
+
 /* Attempt to send a LUN reset message to a device */
 int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
 {
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 5f5ed1b..484bc52 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -674,6 +674,7 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *);
 void sas_init_dev(struct domain_device *);
 
 void sas_task_abort(struct sas_task *);
+int sas_eh_abort_handler(struct scsi_cmnd *cmd);
 int sas_eh_device_reset_handler(struct scsi_cmnd *cmd);
 int sas_eh_bus_reset_handler(struct scsi_cmnd *cmd);
 



* [PATCH v8 11/13] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (9 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 10/13] libsas: add sas_eh_abort_handler Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-10  8:45 ` [PATCH v8 12/13] isci: use sas eh strategy handlers Dan Williams
  2012-02-10  8:45 ` [PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure Dan Williams
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Jack Wang, Luben Tuikov, Xiangliang Yu

sas_eh_bus_reset_handler() amounts to sas_phy_reset() without
notification of the reset to the lldd.  If this is triggered from
eh-cmnd recovery there may be sas_tasks for the lldd to terminate, so
->lldd_I_T_nexus_reset is warranted.

Cc: Xiangliang Yu <yuxiangl@marvell.com>
Cc: Luben Tuikov <ltuikov@yahoo.com>
Cc: Jack Wang <jack_wang@usish.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_scsi_host.c |   19 ++++++++-----------
 1 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 82f7532..86ffd8f 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -505,25 +505,22 @@ int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
 	return FAILED;
 }
 
-/* Attempt to send a phy (bus) reset */
 int sas_eh_bus_reset_handler(struct scsi_cmnd *cmd)
 {
-	struct domain_device *dev = cmd_to_domain_dev(cmd);
-	struct sas_phy *phy = sas_get_local_phy(dev);
-	struct Scsi_Host *host = cmd->device->host;
 	int res;
+	struct Scsi_Host *host = cmd->device->host;
+	struct domain_device *dev = cmd_to_domain_dev(cmd);
+	struct sas_internal *i = to_sas_internal(host->transportt);
 
 	if (current != host->ehandler)
 		return FAILED;
 
-	res = sas_phy_reset(phy, 1);
-	if (res)
-		SAS_DPRINTK("Bus reset of %s failed 0x%x\n",
-			    kobject_name(&phy->dev.kobj),
-			    res);
-	sas_put_local_phy(phy);
+	if (!i->dft->lldd_I_T_nexus_reset)
+		return FAILED;
 
-	if (res == TMF_RESP_FUNC_SUCC || res == TMF_RESP_FUNC_COMPLETE)
+	res = i->dft->lldd_I_T_nexus_reset(dev);
+	if (res == TMF_RESP_FUNC_SUCC || res == TMF_RESP_FUNC_COMPLETE ||
+	    res == -ENODEV)
 		return SUCCESS;
 
 	return FAILED;



* [PATCH v8 12/13] isci: use sas eh strategy handlers
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (10 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 11/13] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-10  8:45 ` [PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure Dan Williams
  12 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide

...now that the strategy handlers guarantee eh context and notify
the driver of bus reset.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/isci/init.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/isci/init.c b/drivers/scsi/isci/init.c
index 9a28270..87ac0f6 100644
--- a/drivers/scsi/isci/init.c
+++ b/drivers/scsi/isci/init.c
@@ -157,6 +157,9 @@ static struct scsi_host_template isci_sht = {
 	.sg_tablesize			= SG_ALL,
 	.max_sectors			= SCSI_DEFAULT_MAX_SECTORS,
 	.use_clustering			= ENABLE_CLUSTERING,
+	.eh_abort_handler		= sas_eh_abort_handler,
+	.eh_device_reset_handler        = sas_eh_device_reset_handler,
+	.eh_bus_reset_handler           = sas_eh_bus_reset_handler,
 	.target_destroy			= sas_target_destroy,
 	.ioctl				= sas_ioctl,
 	.shost_attrs			= isci_host_attrs,



* [PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure
  2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
                   ` (11 preceding siblings ...)
  2012-02-10  8:45 ` [PATCH v8 12/13] isci: use sas eh strategy handlers Dan Williams
@ 2012-02-10  8:45 ` Dan Williams
  2012-02-15  1:09   ` Jack Wang
  12 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2012-02-10  8:45 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Christoph Hellwig

The timer and the completion are only used for slow path tasks (smp, and
lldd tmfs), yet we incur the allocation space and cpu setup time for
every fast path task.

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 drivers/scsi/libsas/sas_expander.c  |   20 ++++++++++----------
 drivers/scsi/libsas/sas_init.c      |   23 +++++++++++++++++++++--
 drivers/scsi/libsas/sas_scsi_host.c |    8 ++++++--
 drivers/scsi/mvsas/mv_sas.c         |   20 ++++++++++----------
 drivers/scsi/pm8001/pm8001_sas.c    |   30 +++++++++++++++---------------
 include/scsi/libsas.h               |   14 +++++++++-----
 6 files changed, 71 insertions(+), 44 deletions(-)
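
For readers following along, the split described above can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
the toy_* names are hypothetical stand-ins, calloc()/free() replace the
kmem_cache and kmalloc() calls used in the patch, and the timer and
completion are reduced to dummy structs.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's timer_list and completion. */
struct toy_timer { unsigned long expires; int armed; };
struct toy_completion { int done; };

/* Slow-path-only state, allocated on demand. */
struct toy_task_slow {
	struct toy_timer timer;
	struct toy_completion completion;
};

/* Fast-path task: it only carries a pointer to the optional slow state. */
struct toy_task {
	int state_flags;
	struct toy_task_slow *slow_task;	/* NULL for fast path tasks */
};

struct toy_task *toy_alloc_task(void)
{
	/* calloc() zeroes the task, so slow_task starts out NULL */
	return calloc(1, sizeof(struct toy_task));
}

struct toy_task *toy_alloc_slow_task(void)
{
	struct toy_task *task = toy_alloc_task();
	struct toy_task_slow *slow = calloc(1, sizeof(*slow));

	if (!task || !slow) {
		free(task);	/* free(NULL) is a no-op */
		free(slow);
		return NULL;
	}

	task->slow_task = slow;
	return task;
}

void toy_free_task(struct toy_task *task)
{
	if (task) {
		free(task->slow_task);	/* NULL for fast path tasks */
		free(task);
	}
}
```

The point of the layout is that a fast path task pays for one pointer
instead of an embedded timer and completion, and a single free routine
can serve both kinds of task because freeing a NULL slow_task is harmless.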

diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
index 05acd9e..0ab3796 100644
--- a/drivers/scsi/libsas/sas_expander.c
+++ b/drivers/scsi/libsas/sas_expander.c
@@ -51,14 +51,14 @@ static void smp_task_timedout(unsigned long _task)
 		task->task_state_flags |= SAS_TASK_STATE_ABORTED;
 	spin_unlock_irqrestore(&task->task_state_lock, flags);
 
-	complete(&task->completion);
+	complete(&task->slow_task->completion);
 }
 
 static void smp_task_done(struct sas_task *task)
 {
-	if (!del_timer(&task->timer))
+	if (!del_timer(&task->slow_task->timer))
 		return;
-	complete(&task->completion);
+	complete(&task->slow_task->completion);
 }
 
 /* Give it some long enough timeout. In seconds. */
@@ -79,7 +79,7 @@ static int smp_execute_task(struct domain_device *dev, void *req, int req_size,
 			break;
 		}
 
-		task = sas_alloc_task(GFP_KERNEL);
+		task = sas_alloc_slow_task(GFP_KERNEL);
 		if (!task) {
 			res = -ENOMEM;
 			break;
@@ -91,20 +91,20 @@ static int smp_execute_task(struct domain_device *dev, void *req, int req_size,
 
 		task->task_done = smp_task_done;
 
-		task->timer.data = (unsigned long) task;
-		task->timer.function = smp_task_timedout;
-		task->timer.expires = jiffies + SMP_TIMEOUT*HZ;
-		add_timer(&task->timer);
+		task->slow_task->timer.data = (unsigned long) task;
+		task->slow_task->timer.function = smp_task_timedout;
+		task->slow_task->timer.expires = jiffies + SMP_TIMEOUT*HZ;
+		add_timer(&task->slow_task->timer);
 
 		res = i->dft->lldd_execute_task(task, 1, GFP_KERNEL);
 
 		if (res) {
-			del_timer(&task->timer);
+			del_timer(&task->slow_task->timer);
 			SAS_DPRINTK("executing SMP task failed:%d\n", res);
 			break;
 		}
 
-		wait_for_completion(&task->completion);
+		wait_for_completion(&task->slow_task->completion);
 		res = -ECOMM;
 		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
 			SAS_DPRINTK("smp task timed out or aborted\n");
diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
index 2fc23d3..57e7ac9 100644
--- a/drivers/scsi/libsas/sas_init.c
+++ b/drivers/scsi/libsas/sas_init.c
@@ -48,18 +48,37 @@ struct sas_task *sas_alloc_task(gfp_t flags)
 		INIT_LIST_HEAD(&task->list);
 		spin_lock_init(&task->task_state_lock);
 		task->task_state_flags = SAS_TASK_STATE_PENDING;
-		init_timer(&task->timer);
-		init_completion(&task->completion);
 	}
 
 	return task;
 }
 EXPORT_SYMBOL_GPL(sas_alloc_task);
 
+struct sas_task *sas_alloc_slow_task(gfp_t flags)
+{
+	struct sas_task *task = sas_alloc_task(flags);
+	struct sas_task_slow *slow = kmalloc(sizeof(*slow), flags);
+
+	if (!task || !slow) {
+		if (task)
+			kmem_cache_free(sas_task_cache, task);
+		kfree(slow);
+		return NULL;
+	}
+
+	task->slow_task = slow;
+	init_timer(&slow->timer);
+	init_completion(&slow->completion);
+
+	return task;
+}
+EXPORT_SYMBOL_GPL(sas_alloc_slow_task);
+
 void sas_free_task(struct sas_task *task)
 {
 	if (task) {
 		BUG_ON(!list_empty(&task->list));
+		kfree(task->slow_task);
 		kmem_cache_free(sas_task_cache, task);
 	}
 }
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 86ffd8f..9716f2e 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -1013,9 +1013,13 @@ void sas_task_abort(struct sas_task *task)
 
 	/* Escape for libsas internal commands */
 	if (!sc) {
-		if (!del_timer(&task->timer))
+		struct sas_task_slow *slow = task->slow_task;
+
+		if (!slow)
+			return;
+		if (!del_timer(&slow->timer))
 			return;
-		task->timer.function(task->timer.data);
+		slow->timer.function(slow->timer.data);
 		return;
 	}
 
diff --git a/drivers/scsi/mvsas/mv_sas.c b/drivers/scsi/mvsas/mv_sas.c
index b68a653..d0462b8 100644
--- a/drivers/scsi/mvsas/mv_sas.c
+++ b/drivers/scsi/mvsas/mv_sas.c
@@ -1365,9 +1365,9 @@ void mvs_dev_gone(struct domain_device *dev)
 
 static void mvs_task_done(struct sas_task *task)
 {
-	if (!del_timer(&task->timer))
+	if (!del_timer(&task->slow_task->timer))
 		return;
-	complete(&task->completion);
+	complete(&task->slow_task->completion);
 }
 
 static void mvs_tmf_timedout(unsigned long data)
@@ -1375,7 +1375,7 @@ static void mvs_tmf_timedout(unsigned long data)
 	struct sas_task *task = (struct sas_task *)data;
 
 	task->task_state_flags |= SAS_TASK_STATE_ABORTED;
-	complete(&task->completion);
+	complete(&task->slow_task->completion);
 }
 
 #define MVS_TASK_TIMEOUT 20
@@ -1386,7 +1386,7 @@ static int mvs_exec_internal_tmf_task(struct domain_device *dev,
 	struct sas_task *task = NULL;
 
 	for (retry = 0; retry < 3; retry++) {
-		task = sas_alloc_task(GFP_KERNEL);
+		task = sas_alloc_slow_task(GFP_KERNEL);
 		if (!task)
 			return -ENOMEM;
 
@@ -1396,20 +1396,20 @@ static int mvs_exec_internal_tmf_task(struct domain_device *dev,
 		memcpy(&task->ssp_task, parameter, para_len);
 		task->task_done = mvs_task_done;
 
-		task->timer.data = (unsigned long) task;
-		task->timer.function = mvs_tmf_timedout;
-		task->timer.expires = jiffies + MVS_TASK_TIMEOUT*HZ;
-		add_timer(&task->timer);
+		task->slow_task->timer.data = (unsigned long) task;
+		task->slow_task->timer.function = mvs_tmf_timedout;
+		task->slow_task->timer.expires = jiffies + MVS_TASK_TIMEOUT*HZ;
+		add_timer(&task->slow_task->timer);
 
 		res = mvs_task_exec(task, 1, GFP_KERNEL, NULL, 1, tmf);
 
 		if (res) {
-			del_timer(&task->timer);
+			del_timer(&task->slow_task->timer);
 			mv_printk("executing internel task failed:%d\n", res);
 			goto ex_err;
 		}
 
-		wait_for_completion(&task->completion);
+		wait_for_completion(&task->slow_task->completion);
 		res = TMF_RESP_FUNC_FAILED;
 		/* Even TMF timed out, return direct. */
 		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
index b111018..ad36f1a 100644
--- a/drivers/scsi/pm8001/pm8001_sas.c
+++ b/drivers/scsi/pm8001/pm8001_sas.c
@@ -627,9 +627,9 @@ int pm8001_dev_found(struct domain_device *dev)
 
 static void pm8001_task_done(struct sas_task *task)
 {
-	if (!del_timer(&task->timer))
+	if (!del_timer(&task->slow_task->timer))
 		return;
-	complete(&task->completion);
+	complete(&task->slow_task->completion);
 }
 
 static void pm8001_tmf_timedout(unsigned long data)
@@ -637,7 +637,7 @@ static void pm8001_tmf_timedout(unsigned long data)
 	struct sas_task *task = (struct sas_task *)data;
 
 	task->task_state_flags |= SAS_TASK_STATE_ABORTED;
-	complete(&task->completion);
+	complete(&task->slow_task->completion);
 }
 
 #define PM8001_TASK_TIMEOUT 20
@@ -668,21 +668,21 @@ static int pm8001_exec_internal_tmf_task(struct domain_device *dev,
 		task->task_proto = dev->tproto;
 		memcpy(&task->ssp_task, parameter, para_len);
 		task->task_done = pm8001_task_done;
-		task->timer.data = (unsigned long)task;
-		task->timer.function = pm8001_tmf_timedout;
-		task->timer.expires = jiffies + PM8001_TASK_TIMEOUT*HZ;
-		add_timer(&task->timer);
+		task->slow_task->timer.data = (unsigned long)task;
+		task->slow_task->timer.function = pm8001_tmf_timedout;
+		task->slow_task->timer.expires = jiffies + PM8001_TASK_TIMEOUT*HZ;
+		add_timer(&task->slow_task->timer);
 
 		res = pm8001_task_exec(task, 1, GFP_KERNEL, 1, tmf);
 
 		if (res) {
-			del_timer(&task->timer);
+			del_timer(&task->slow_task->timer);
 			PM8001_FAIL_DBG(pm8001_ha,
 				pm8001_printk("Executing internal task "
 				"failed\n"));
 			goto ex_err;
 		}
-		wait_for_completion(&task->completion);
+		wait_for_completion(&task->slow_task->completion);
 		res = -TMF_RESP_FUNC_FAILED;
 		/* Even TMF timed out, return direct. */
 		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
@@ -749,10 +749,10 @@ pm8001_exec_internal_task_abort(struct pm8001_hba_info *pm8001_ha,
 		task->dev = dev;
 		task->task_proto = dev->tproto;
 		task->task_done = pm8001_task_done;
-		task->timer.data = (unsigned long)task;
-		task->timer.function = pm8001_tmf_timedout;
-		task->timer.expires = jiffies + PM8001_TASK_TIMEOUT * HZ;
-		add_timer(&task->timer);
+		task->slow_task->timer.data = (unsigned long)task;
+		task->slow_task->timer.function = pm8001_tmf_timedout;
+		task->slow_task->timer.expires = jiffies + PM8001_TASK_TIMEOUT * HZ;
+		add_timer(&task->slow_task->timer);
 
 		res = pm8001_tag_alloc(pm8001_ha, &ccb_tag);
 		if (res)
@@ -766,13 +766,13 @@ pm8001_exec_internal_task_abort(struct pm8001_hba_info *pm8001_ha,
 			pm8001_dev, flag, task_tag, ccb_tag);
 
 		if (res) {
-			del_timer(&task->timer);
+			del_timer(&task->slow_task->timer);
 			PM8001_FAIL_DBG(pm8001_ha,
 				pm8001_printk("Executing internal task "
 				"failed\n"));
 			goto ex_err;
 		}
-		wait_for_completion(&task->completion);
+		wait_for_completion(&task->slow_task->completion);
 		res = TMF_RESP_FUNC_FAILED;
 		/* Even TMF timed out, return direct. */
 		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 484bc52..4e84ef3 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -568,10 +568,6 @@ struct sas_task {
 
 	enum   sas_protocol      task_proto;
 
-	/* Used by the discovery code. */
-	struct timer_list     timer;
-	struct completion     completion;
-
 	union {
 		struct sas_ata_task ata_task;
 		struct sas_smp_task smp_task;
@@ -588,8 +584,15 @@ struct sas_task {
 
 	void   *lldd_task;	  /* for use by LLDDs */
 	void   *uldd_task;
+	struct sas_task_slow *slow_task;
+};
 
-	struct work_struct abort_work;
+struct sas_task_slow {
+	/* standard/extra infrastructure for slow path commands (SMP and
+	 * internal lldd commands
+	 */
+	struct timer_list     timer;
+	struct completion     completion;
 };
 
 #define SAS_TASK_STATE_PENDING      1
@@ -599,6 +602,7 @@ struct sas_task {
 #define SAS_TASK_AT_INITIATOR       16
 
 extern struct sas_task *sas_alloc_task(gfp_t flags);
+extern struct sas_task *sas_alloc_slow_task(gfp_t flags);
 extern void sas_free_task(struct sas_task *task);
 
 struct sas_domain_function_template {


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race
  2012-02-10  8:44 ` [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race Dan Williams
@ 2012-02-10 18:32   ` Jeff Garzik
  2012-02-10 19:09   ` Dan Williams
  1 sibling, 0 replies; 29+ messages in thread
From: Jeff Garzik @ 2012-02-10 18:32 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-scsi, linux-ide, Marcin Tomczak

On 02/10/2012 03:44 AM, Dan Williams wrote:
> ata_port lifetime in libata follows the host.  In libsas it follows the
> scsi_target.  Once scsi_remove_device() has caused all commands to be
> completed it allows scsi_remove_target() to immediately proceed to
> freeing the ata_port causing bug reports like:
>
> [  848.393333] BUG: spinlock bad magic on CPU#4, kworker/u:2/5107
> [  848.400262] general protection fault: 0000 [#1] SMP
> [  848.406244] CPU 4
> [  848.408310] Modules linked in: nls_utf8 ipv6 uinput i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma dca sg sd_mod sr_mod cdrom ahci libahci isci libsas libata scsi_transport_sas [last unloaded: scsi_wait_scan]
> [  848.432060]
> [  848.434137] Pid: 5107, comm: kworker/u:2 Not tainted 3.2.0-isci+ #8 Intel Corporation S2600CP/S2600CP
> [  848.445310] RIP: 0010:[<ffffffff8126a68c>]  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
> [  848.454787] RSP: 0018:ffff8807f868dca0  EFLAGS: 00010002
> [  848.461137] RAX: 0000000000000048 RBX: ffff8807fe86a630 RCX: ffffffff817d0be0
> [  848.469520] RDX: 0000000000000000 RSI: ffffffff814af1cf RDI: 0000000000000002
> [  848.477959] RBP: ffff8807f868dcb0 R08: 00000000ffffffff R09: 000000006b6b6b6b
> [  848.486327] R10: 000000000003fb8c R11: ffffffff81a19448 R12: 6b6b6b6b6b6b6b6b
> [  848.494699] R13: ffff8808027dc520 R14: 0000000000000000 R15: 000000000000001e
> [  848.503067] FS:  0000000000000000(0000) GS:ffff88083fd00000(0000) knlGS:0000000000000000
> [  848.512899] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [  848.519710] CR2: 00007ff77d001000 CR3: 00000007f7a5d000 CR4: 00000000000406e0
> [  848.528072] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  848.536446] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  848.544831] Process kworker/u:2 (pid: 5107, threadinfo ffff8807f868c000, task ffff8807ff348000)
> [  848.555327] Stack:
> [  848.557959]  ffff8807fe86a630 ffff8807fe86a630 ffff8807f868dcd0 ffffffff8126a6e0
> [  848.567072]  ffffffff817c142f ffff8807fe86a630 ffff8807f868dcf0 ffffffff8126a703
> [  848.576190]  ffff8808027dc520 0000000000000286 ffff8807f868dd10 ffffffff814af1bb
> [  848.585281] Call Trace:
> [  848.588409]  [<ffffffff8126a6e0>] spin_bug+0x26/0x28
> [  848.594357]  [<ffffffff8126a703>] do_raw_spin_unlock+0x21/0x88
> [  848.601283]  [<ffffffff814af1bb>] _raw_spin_unlock_irqrestore+0x2c/0x65
> [  848.609089]  [<ffffffffa001c103>] ata_scsi_port_error_handler+0x548/0x557 [libata]
> [  848.618331]  [<ffffffff81061813>] ? async_schedule+0x17/0x17
> [  848.625060]  [<ffffffffa004f30f>] async_sas_ata_eh+0x45/0x69 [libsas]
> [  848.632655]  [<ffffffff810618aa>] async_run_entry_fn+0x97/0x125
> [  848.639670]  [<ffffffff81057439>] process_one_work+0x207/0x38d
> [  848.646577]  [<ffffffff8105738c>] ? process_one_work+0x15a/0x38d
> [  848.653681]  [<ffffffff810576f7>] worker_thread+0x138/0x21c
> [  848.660305]  [<ffffffff810575bf>] ? process_one_work+0x38d/0x38d
> [  848.667493]  [<ffffffff8105b098>] kthread+0x9d/0xa5
> [  848.673382]  [<ffffffff8106e1bd>] ? trace_hardirqs_on_caller+0x12f/0x166
> [  848.681304]  [<ffffffff814b7704>] kernel_thread_helper+0x4/0x10
> [  848.688324]  [<ffffffff814af534>] ? retint_restore_args+0x13/0x13
> [  848.695530]  [<ffffffff8105affb>] ? __init_kthread_worker+0x5b/0x5b
> [  848.702929]  [<ffffffff814b7700>] ? gs_change+0x13/0x13
> [  848.709155] Code: 00 00 48 8d 88 38 04 00 00 44 8b 80 84 02 00 00 31 c0 e8 cf 1b 24 00 41 83 c8 ff 44 8b 4b 08 48 c7 c1 e0 0b 7d 81 4d 85 e4 74 10 <45> 8b 84 24 84 02 00 00 49 8d 8c 24 38 04 00 00 8b 53 04 48 89
> [  848.732467] RIP  [<ffffffff8126a68c>] spin_dump+0x5e/0x8c
> [  848.738905]  RSP <ffff8807f868dca0>
> [  848.743743] ---[ end trace 143161646eee8caa ]---
>
> ...so arrange for the ata_port to have the same end of life as the domain
> device.
>
> Reported-by: Marcin Tomczak <marcin.tomczak@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>

Acked-by: Jeff Garzik <jgarzik@redhat.com>




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v8 03/13] libsas: async ata scanning
  2012-02-10  8:44 ` [PATCH v8 03/13] libsas: async ata scanning Dan Williams
@ 2012-02-10 18:34   ` Jeff Garzik
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Garzik @ 2012-02-10 18:34 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-scsi, linux-ide, Jack Wang, Luben Tuikov, Xiangliang Yu

On 02/10/2012 03:44 AM, Dan Williams wrote:
> libsas ata error handling is already async but this does not help the
> scan case.  Move initial link recovery out from under host->scan_mutex,
> and delay synchronization with eh until after all port probe/recovery
> work has been queued.
>
> Device ordering is maintained with scan order by still calling
> sas_rphy_add() in order of domain discovery.
>
> Since we now scan the domain list when invoking libata-eh we need to be
> careful to check for fully initialized ata ports.
>
> Cc: Xiangliang Yu <yuxiangl@marvell.com>
> Cc: Luben Tuikov <ltuikov@yahoo.com>
> Acked-by: Jack Wang <jack_wang@usish.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>   drivers/ata/libata-core.c           |   34 +++++++++-------
>   drivers/ata/libata-scsi.c           |   13 ++++++
>   drivers/ata/libata.h                |    1
>   drivers/scsi/aic94xx/aic94xx_init.c |    1
>   drivers/scsi/isci/init.c            |    1
>   drivers/scsi/libsas/sas_ata.c       |   74 ++++++++++++++++++++++++++++++-----
>   drivers/scsi/libsas/sas_discover.c  |   22 +++++-----
>   drivers/scsi/libsas/sas_internal.h  |    9 ++++
>   drivers/scsi/libsas/sas_scsi_host.c |   18 ---------
>   drivers/scsi/mvsas/mv_init.c        |    1
>   drivers/scsi/pm8001/pm8001_init.c   |    1
>   include/linux/libata.h              |    1
>   include/scsi/libsas.h               |    1
>   include/scsi/sas_ata.h              |   12 +++---
>   14 files changed, 123 insertions(+), 66 deletions(-)

Acked-by: Jeff Garzik <jgarzik@redhat.com>




^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race
  2012-02-10  8:44 ` [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race Dan Williams
  2012-02-10 18:32   ` Jeff Garzik
@ 2012-02-10 19:09   ` Dan Williams
  1 sibling, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-02-10 19:09 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-ide, Marcin Tomczak

On Fri, Feb 10, 2012 at 12:44 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
> index 37a9e73..852b1b1 100644
> --- a/drivers/scsi/libsas/sas_ata.c
> +++ b/drivers/scsi/libsas/sas_ata.c
> @@ -661,6 +661,10 @@ static void async_sas_ata_eh(void *data, async_cookie_t cookie)
>        struct ata_port *ap = dev->sata_dev.ap;
>        struct sas_ha_struct *ha = dev->port->ha;
>
> +       /* hold a reference over eh since we may be racing with final
> +        * remove once all commands are completed
> +        */
> +       kref_get(&dev->kref);
>        ata_port_printk(ap, KERN_DEBUG, "sas eh calling libata port error handler");
>        ata_scsi_port_error_handler(ha->core.shost, ap);

The last rebase ate the matching call to "sas_put_device()" that needs
to be here.
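
To make the pairing concrete, here is a minimal userspace sketch of the
get/put pattern. It is not the libsas code: toy_dev, toy_get() and
toy_put() are hypothetical stand-ins for the kref-counted domain device
and for kref_get()/sas_put_device(), and placing the put right after the
error handler returns is an assumption based on the note above.

```c
/* Hypothetical stand-in for a kref-counted domain device. */
struct toy_dev {
	int refs;	/* reference count, 1 when created */
	int released;	/* set when the last reference is dropped */
};

void toy_get(struct toy_dev *dev)
{
	dev->refs++;
}

void toy_put(struct toy_dev *dev)
{
	if (--dev->refs == 0)
		dev->released = 1;	/* a real release callback would free here */
}

/* Stand-in for the error handler, which uses dev for its whole run. */
static void toy_error_handler(struct toy_dev *dev)
{
	(void)dev;
}

void toy_async_eh(struct toy_dev *dev)
{
	toy_get(dev);		/* hold a reference over eh */
	toy_error_handler(dev);
	toy_put(dev);		/* matching put; without it the count never drops */
}
```

With the put in place the count returns to its starting value once eh
finishes, so the final remove can still release the device.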

^ permalink raw reply	[flat|nested] 29+ messages in thread

* RE: [PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure
  2012-02-10  8:45 ` [PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure Dan Williams
@ 2012-02-15  1:09   ` Jack Wang
  0 siblings, 0 replies; 29+ messages in thread
From: Jack Wang @ 2012-02-15  1:09 UTC (permalink / raw)
  To: 'Dan Williams', linux-scsi; +Cc: linux-ide, 'Christoph Hellwig'

Dear Dan,

You seem to have forgotten to change sas_alloc_task to sas_alloc_slow_task in pm8001.
I'll post a patch to fix this later.
Thanks for your work.

Jack
[PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure
> 
> The timer and the completion are only used for slow path tasks (smp, and
> lldd tmfs), yet we incur the allocation space and cpu setup time for
> every fast path task.
> 
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/scsi/libsas/sas_expander.c  |   20 ++++++++++----------
>  drivers/scsi/libsas/sas_init.c      |   23 +++++++++++++++++++++--
>  drivers/scsi/libsas/sas_scsi_host.c |    8 ++++++--
>  drivers/scsi/mvsas/mv_sas.c         |   20 ++++++++++----------
>  drivers/scsi/pm8001/pm8001_sas.c    |   30 +++++++++++++++---------------
>  include/scsi/libsas.h               |   14 +++++++++-----
>  6 files changed, 71 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
> index 05acd9e..0ab3796 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -51,14 +51,14 @@ static void smp_task_timedout(unsigned long _task)
>  		task->task_state_flags |= SAS_TASK_STATE_ABORTED;
>  	spin_unlock_irqrestore(&task->task_state_lock, flags);
> 
> -	complete(&task->completion);
> +	complete(&task->slow_task->completion);
>  }
> 
>  static void smp_task_done(struct sas_task *task)
>  {
> -	if (!del_timer(&task->timer))
> +	if (!del_timer(&task->slow_task->timer))
>  		return;
> -	complete(&task->completion);
> +	complete(&task->slow_task->completion);
>  }
> 
>  /* Give it some long enough timeout. In seconds. */
> @@ -79,7 +79,7 @@ static int smp_execute_task(struct domain_device *dev, void *req, int req_size,
>  			break;
>  		}
> 
> -		task = sas_alloc_task(GFP_KERNEL);
> +		task = sas_alloc_slow_task(GFP_KERNEL);
>  		if (!task) {
>  			res = -ENOMEM;
>  			break;
> @@ -91,20 +91,20 @@ static int smp_execute_task(struct domain_device *dev, void *req, int req_size,
> 
>  		task->task_done = smp_task_done;
> 
> -		task->timer.data = (unsigned long) task;
> -		task->timer.function = smp_task_timedout;
> -		task->timer.expires = jiffies + SMP_TIMEOUT*HZ;
> -		add_timer(&task->timer);
> +		task->slow_task->timer.data = (unsigned long) task;
> +		task->slow_task->timer.function = smp_task_timedout;
> +		task->slow_task->timer.expires = jiffies + SMP_TIMEOUT*HZ;
> +		add_timer(&task->slow_task->timer);
> 
>  		res = i->dft->lldd_execute_task(task, 1, GFP_KERNEL);
> 
>  		if (res) {
> -			del_timer(&task->timer);
> +			del_timer(&task->slow_task->timer);
>  			SAS_DPRINTK("executing SMP task failed:%d\n", res);
>  			break;
>  		}
> 
> -		wait_for_completion(&task->completion);
> +		wait_for_completion(&task->slow_task->completion);
>  		res = -ECOMM;
>  		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
>  			SAS_DPRINTK("smp task timed out or aborted\n");
> diff --git a/drivers/scsi/libsas/sas_init.c b/drivers/scsi/libsas/sas_init.c
> index 2fc23d3..57e7ac9 100644
> --- a/drivers/scsi/libsas/sas_init.c
> +++ b/drivers/scsi/libsas/sas_init.c
> @@ -48,18 +48,37 @@ struct sas_task *sas_alloc_task(gfp_t flags)
>  		INIT_LIST_HEAD(&task->list);
>  		spin_lock_init(&task->task_state_lock);
>  		task->task_state_flags = SAS_TASK_STATE_PENDING;
> -		init_timer(&task->timer);
> -		init_completion(&task->completion);
>  	}
> 
>  	return task;
>  }
>  EXPORT_SYMBOL_GPL(sas_alloc_task);
> 
> +struct sas_task *sas_alloc_slow_task(gfp_t flags)
> +{
> +	struct sas_task *task = sas_alloc_task(flags);
> +	struct sas_task_slow *slow = kmalloc(sizeof(*slow), flags);
> +
> +	if (!task || !slow) {
> +		if (task)
> +			kmem_cache_free(sas_task_cache, task);
> +		kfree(slow);
> +		return NULL;
> +	}
> +
> +	task->slow_task = slow;
> +	init_timer(&slow->timer);
> +	init_completion(&slow->completion);
> +
> +	return task;
> +}
> +EXPORT_SYMBOL_GPL(sas_alloc_slow_task);
> +
>  void sas_free_task(struct sas_task *task)
>  {
>  	if (task) {
>  		BUG_ON(!list_empty(&task->list));
> +		kfree(task->slow_task);
>  		kmem_cache_free(sas_task_cache, task);
>  	}
>  }
> diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
> index 86ffd8f..9716f2e 100644
> --- a/drivers/scsi/libsas/sas_scsi_host.c
> +++ b/drivers/scsi/libsas/sas_scsi_host.c
> @@ -1013,9 +1013,13 @@ void sas_task_abort(struct sas_task *task)
> 
>  	/* Escape for libsas internal commands */
>  	if (!sc) {
> -		if (!del_timer(&task->timer))
> +		struct sas_task_slow *slow = task->slow_task;
> +
> +		if (!slow)
> +			return;
> +		if (!del_timer(&slow->timer))
>  			return;
> -		task->timer.function(task->timer.data);
> +		slow->timer.function(slow->timer.data);
>  		return;
>  	}
> 
> diff --git a/drivers/scsi/mvsas/mv_sas.c b/drivers/scsi/mvsas/mv_sas.c
> index b68a653..d0462b8 100644
> --- a/drivers/scsi/mvsas/mv_sas.c
> +++ b/drivers/scsi/mvsas/mv_sas.c
> @@ -1365,9 +1365,9 @@ void mvs_dev_gone(struct domain_device *dev)
> 
>  static void mvs_task_done(struct sas_task *task)
>  {
> -	if (!del_timer(&task->timer))
> +	if (!del_timer(&task->slow_task->timer))
>  		return;
> -	complete(&task->completion);
> +	complete(&task->slow_task->completion);
>  }
> 
>  static void mvs_tmf_timedout(unsigned long data)
> @@ -1375,7 +1375,7 @@ static void mvs_tmf_timedout(unsigned long data)
>  	struct sas_task *task = (struct sas_task *)data;
> 
>  	task->task_state_flags |= SAS_TASK_STATE_ABORTED;
> -	complete(&task->completion);
> +	complete(&task->slow_task->completion);
>  }
> 
>  #define MVS_TASK_TIMEOUT 20
> @@ -1386,7 +1386,7 @@ static int mvs_exec_internal_tmf_task(struct domain_device *dev,
>  	struct sas_task *task = NULL;
> 
>  	for (retry = 0; retry < 3; retry++) {
> -		task = sas_alloc_task(GFP_KERNEL);
> +		task = sas_alloc_slow_task(GFP_KERNEL);
>  		if (!task)
>  			return -ENOMEM;
> 
> @@ -1396,20 +1396,20 @@ static int mvs_exec_internal_tmf_task(struct domain_device *dev,
>  		memcpy(&task->ssp_task, parameter, para_len);
>  		task->task_done = mvs_task_done;
> 
> -		task->timer.data = (unsigned long) task;
> -		task->timer.function = mvs_tmf_timedout;
> -		task->timer.expires = jiffies + MVS_TASK_TIMEOUT*HZ;
> -		add_timer(&task->timer);
> +		task->slow_task->timer.data = (unsigned long) task;
> +		task->slow_task->timer.function = mvs_tmf_timedout;
> +		task->slow_task->timer.expires = jiffies + MVS_TASK_TIMEOUT*HZ;
> +		add_timer(&task->slow_task->timer);
> 
>  		res = mvs_task_exec(task, 1, GFP_KERNEL, NULL, 1, tmf);
> 
>  		if (res) {
> -			del_timer(&task->timer);
> +			del_timer(&task->slow_task->timer);
>  			mv_printk("executing internel task failed:%d\n", res);
>  			goto ex_err;
>  		}
> 
> -		wait_for_completion(&task->completion);
> +		wait_for_completion(&task->slow_task->completion);
>  		res = TMF_RESP_FUNC_FAILED;
>  		/* Even TMF timed out, return direct. */
>  		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
> diff --git a/drivers/scsi/pm8001/pm8001_sas.c b/drivers/scsi/pm8001/pm8001_sas.c
> index b111018..ad36f1a 100644
> --- a/drivers/scsi/pm8001/pm8001_sas.c
> +++ b/drivers/scsi/pm8001/pm8001_sas.c
> @@ -627,9 +627,9 @@ int pm8001_dev_found(struct domain_device *dev)
> 
>  static void pm8001_task_done(struct sas_task *task)
>  {
> -	if (!del_timer(&task->timer))
> +	if (!del_timer(&task->slow_task->timer))
>  		return;
> -	complete(&task->completion);
> +	complete(&task->slow_task->completion);
>  }
> 
>  static void pm8001_tmf_timedout(unsigned long data)
> @@ -637,7 +637,7 @@ static void pm8001_tmf_timedout(unsigned long data)
>  	struct sas_task *task = (struct sas_task *)data;
> 
>  	task->task_state_flags |= SAS_TASK_STATE_ABORTED;
> -	complete(&task->completion);
> +	complete(&task->slow_task->completion);
>  }
> 
>  #define PM8001_TASK_TIMEOUT 20
> @@ -668,21 +668,21 @@ static int pm8001_exec_internal_tmf_task(struct domain_device *dev,
>  		task->task_proto = dev->tproto;
>  		memcpy(&task->ssp_task, parameter, para_len);
>  		task->task_done = pm8001_task_done;
> -		task->timer.data = (unsigned long)task;
> -		task->timer.function = pm8001_tmf_timedout;
> -		task->timer.expires = jiffies + PM8001_TASK_TIMEOUT*HZ;
> -		add_timer(&task->timer);
> +		task->slow_task->timer.data = (unsigned long)task;
> +		task->slow_task->timer.function = pm8001_tmf_timedout;
> +		task->slow_task->timer.expires = jiffies + PM8001_TASK_TIMEOUT*HZ;
> +		add_timer(&task->slow_task->timer);
> 
>  		res = pm8001_task_exec(task, 1, GFP_KERNEL, 1, tmf);
> 
>  		if (res) {
> -			del_timer(&task->timer);
> +			del_timer(&task->slow_task->timer);
>  			PM8001_FAIL_DBG(pm8001_ha,
>  				pm8001_printk("Executing internal task "
>  				"failed\n"));
>  			goto ex_err;
>  		}
> -		wait_for_completion(&task->completion);
> +		wait_for_completion(&task->slow_task->completion);
>  		res = -TMF_RESP_FUNC_FAILED;
>  		/* Even TMF timed out, return direct. */
>  		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
> @@ -749,10 +749,10 @@ pm8001_exec_internal_task_abort(struct pm8001_hba_info *pm8001_ha,
>  		task->dev = dev;
>  		task->task_proto = dev->tproto;
>  		task->task_done = pm8001_task_done;
> -		task->timer.data = (unsigned long)task;
> -		task->timer.function = pm8001_tmf_timedout;
> -		task->timer.expires = jiffies + PM8001_TASK_TIMEOUT * HZ;
> -		add_timer(&task->timer);
> +		task->slow_task->timer.data = (unsigned long)task;
> +		task->slow_task->timer.function = pm8001_tmf_timedout;
> +		task->slow_task->timer.expires = jiffies + PM8001_TASK_TIMEOUT * HZ;
> +		add_timer(&task->slow_task->timer);
> 
>  		res = pm8001_tag_alloc(pm8001_ha, &ccb_tag);
>  		if (res)
> @@ -766,13 +766,13 @@ pm8001_exec_internal_task_abort(struct pm8001_hba_info *pm8001_ha,
>  			pm8001_dev, flag, task_tag, ccb_tag);
> 
>  		if (res) {
> -			del_timer(&task->timer);
> +			del_timer(&task->slow_task->timer);
>  			PM8001_FAIL_DBG(pm8001_ha,
>  				pm8001_printk("Executing internal task "
>  				"failed\n"));
>  			goto ex_err;
>  		}
> -		wait_for_completion(&task->completion);
> +		wait_for_completion(&task->slow_task->completion);
>  		res = TMF_RESP_FUNC_FAILED;
>  		/* Even TMF timed out, return direct. */
>  		if ((task->task_state_flags & SAS_TASK_STATE_ABORTED)) {
> diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
> index 484bc52..4e84ef3 100644
> --- a/include/scsi/libsas.h
> +++ b/include/scsi/libsas.h
> @@ -568,10 +568,6 @@ struct sas_task {
> 
>  	enum   sas_protocol      task_proto;
> 
> -	/* Used by the discovery code. */
> -	struct timer_list     timer;
> -	struct completion     completion;
> -
>  	union {
>  		struct sas_ata_task ata_task;
>  		struct sas_smp_task smp_task;
> @@ -588,8 +584,15 @@ struct sas_task {
> 
>  	void   *lldd_task;	  /* for use by LLDDs */
>  	void   *uldd_task;
> +	struct sas_task_slow *slow_task;
> +};
> 
> -	struct work_struct abort_work;
> +struct sas_task_slow {
> +	/* standard/extra infrastructure for slow path commands (SMP and
> +	 * internal lldd commands
> +	 */
> +	struct timer_list     timer;
> +	struct completion     completion;
>  };
> 
>  #define SAS_TASK_STATE_PENDING      1
> @@ -599,6 +602,7 @@ struct sas_task {
>  #define SAS_TASK_AT_INITIATOR       16
> 
>  extern struct sas_task *sas_alloc_task(gfp_t flags);
> +extern struct sas_task *sas_alloc_slow_task(gfp_t flags);
>  extern void sas_free_task(struct sas_task *task);
> 
>  struct sas_domain_function_template {
> 


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-02-10  8:45 ` [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter Dan Williams
@ 2012-02-29 21:55   ` James Bottomley
  2012-02-29 22:40     ` Douglas Gilbert
  2012-02-29 23:22     ` Dan Williams
  0 siblings, 2 replies; 29+ messages in thread
From: James Bottomley @ 2012-02-29 21:55 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-scsi, linux-ide, Jack Wang, Luben Tuikov, Xiangliang Yu

On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
> It is possible for a host to get "locked out" from talking to sata
> devices in the domain if, for example, its sas address changes but the
> expander topology has existing affiliations with the old address.  If
> the system is booted userspace can write to
> /sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation, however
> if this condition exists for the root device the module parameter can be
> used to promote all ata resets to hard resets.

I don't quite understand this.  Are you saying we can't (or shouldn't)
execute 

/sys/class/sas_phy/<phy-X>/hard_reset

on the root device for some reason?

> After the system is booted this state can be cleared via
> /sys/module/libsas/parameters/force_hard_reset

I really don't think a module parameter for this is such a good idea ...
it effectively promotes all soft resets to being hard ones, which can
have a lot of unintended consequences.

James




* Re: [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context
  2012-02-10  8:45 ` [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context Dan Williams
@ 2012-02-29 22:05   ` James Bottomley
  2012-03-01  0:28     ` Dan Williams
  0 siblings, 1 reply; 29+ messages in thread
From: James Bottomley @ 2012-02-29 22:05 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-scsi, linux-ide

On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
> The strategy handlers may be called in places that are problematic for
> libsas (i.e. sata resets outside of domain revalidation filtering /
> libata link recovery), or problematic for userspace (non-blocking ioctl
> to sleeping reset functions).  However, these routines are also called
> for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
> as long as we are running in the host's error handler.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
>  drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
>  1 files changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
> index f0b9b7b..1cabedc 100644
> --- a/drivers/scsi/libsas/sas_scsi_host.c
> +++ b/drivers/scsi/libsas/sas_scsi_host.c
> @@ -463,11 +463,14 @@ EXPORT_SYMBOL_GPL(sas_get_local_phy);
>  /* Attempt to send a LUN reset message to a device */
>  int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
>  {
> -	struct domain_device *dev = cmd_to_domain_dev(cmd);
> -	struct sas_internal *i =
> -		to_sas_internal(dev->port->ha->core.shost->transportt);
> -	struct scsi_lun lun;
>  	int res;
> +	struct scsi_lun lun;
> +	struct Scsi_Host *host = cmd->device->host;
> +	struct domain_device *dev = cmd_to_domain_dev(cmd);
> +	struct sas_internal *i = to_sas_internal(host->transportt);
> +
> +	if (current != host->ehandler)
> +		return FAILED;

Doing this will ensure that SG_SCSI_RESET now fails.

I don't mind checking for O_NONBLOCK in the sg handler and failing if it
is, but disallowing everything looks a trifle drastic.

James




* Re: [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-02-29 21:55   ` James Bottomley
@ 2012-02-29 22:40     ` Douglas Gilbert
  2012-02-29 23:27       ` Dan Williams
  2012-02-29 23:22     ` Dan Williams
  1 sibling, 1 reply; 29+ messages in thread
From: Douglas Gilbert @ 2012-02-29 22:40 UTC (permalink / raw)
  To: James Bottomley
  Cc: Dan Williams, linux-scsi, linux-ide, Jack Wang, Luben Tuikov,
	Xiangliang Yu

On 12-02-29 04:55 PM, James Bottomley wrote:
> On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
>> It is possible for a host to get "locked out" from talking to sata
>> devices in the domain if, for example, its sas address changes but the
>> expander topology has existing affiliations with the old address.  If
>> the system is booted userspace can write to
>> /sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation, however
>> if this condition exists for the root device the module parameter can be
>> used to promote all ata resets to hard resets.

A point of order: SAS has link resets and hard resets. The
hard reset is a superset of link reset. A "link reset sequence
serves as a hard reset for SATA devices" and hence is
sufficient to reset a SATA device. To reset a SAS device
(e.g. a SAS disk) you need a SAS hard reset. Therefore a link
reset is the appropriately sized "gun" to reset a SATA device.

I have a SAS-2 expander that annoyingly powers up with the
programmed maximum physical link rate of its phys at 3 Gbps
even though its hardware maximum rate is 6 Gbps. For expander
phys connected to SAS-2 disks I can up the programmed maximum
value to 6 Gbps on the expander phy then do a link reset on
that phy. So without upsetting Linux (or any other OS) I can
switch that path from 3 Gbps to 6 Gbps. Can't do that with a
SATA disk without the OS finding out.

Also to clear a SATA affiliation you should be using a SMP
PHY CONTROL (phy_op=6) function.

Doug Gilbert
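
To make the phy_op=6 remark concrete, here is a rough sketch of filling in an SMP PHY CONTROL request. The phy operation value 6 comes from the message above; the other operation codes, the frame type and function bytes, the byte offsets and the 44-byte buffer length are given as I recall them from the SAS specification and should be checked against it. `build_phy_control()` and the macro and enum names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Phy operation codes for SMP PHY CONTROL (subset). */
enum smp_phy_op {
	SMP_PHY_OP_NOP               = 0,
	SMP_PHY_OP_LINK_RESET        = 1,
	SMP_PHY_OP_HARD_RESET        = 2,
	SMP_PHY_OP_DISABLE           = 3,
	SMP_PHY_OP_CLEAR_AFFILIATION = 6,	/* the phy_op=6 mentioned above */
};

#define SMP_FRAME_REQUEST	0x40	/* SMP frame type: request */
#define SMP_FN_PHY_CONTROL	0x91	/* SMP function: PHY CONTROL */
#define PHY_CONTROL_REQ_LEN	44	/* assumed request buffer length */

/*
 * Fill a zeroed PHY CONTROL request for one expander phy.
 * Offsets 9 (phy identifier) and 10 (phy operation) are assumptions
 * taken from memory of the spec layout; everything else stays zero.
 */
void build_phy_control(uint8_t req[PHY_CONTROL_REQ_LEN], uint8_t phy_id,
		       enum smp_phy_op op)
{
	memset(req, 0, PHY_CONTROL_REQ_LEN);
	req[0] = SMP_FRAME_REQUEST;
	req[1] = SMP_FN_PHY_CONTROL;
	req[9] = phy_id;
	req[10] = (uint8_t)op;
}
```

A caller would pass `SMP_PHY_OP_CLEAR_AFFILIATION` to drop a stale affiliation without resetting the link, or `SMP_PHY_OP_LINK_RESET` for the SATA reset case described above.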


* Re: [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-02-29 21:55   ` James Bottomley
  2012-02-29 22:40     ` Douglas Gilbert
@ 2012-02-29 23:22     ` Dan Williams
  2012-03-01 14:27       ` James Bottomley
  1 sibling, 1 reply; 29+ messages in thread
From: Dan Williams @ 2012-02-29 23:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-scsi, linux-ide, Jack Wang, Luben Tuikov, Xiangliang Yu

On Wed, Feb 29, 2012 at 1:55 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
>> It is possible for a host to get "locked out" from talking to sata
>> devices in the domain if, for example, its sas address changes but the
>> expander topology has existing affiliations with the old address.  If
>> the system is booted userspace can write to
>> /sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation, however
>> if this condition exists for the root device the module parameter can be
>> used to promote all ata resets to hard resets.
>
> I don't quite understand this.  Are you saying we can't (or shouldn't)
> execute
>
> /sys/class/sas_phy/<phy-X>/hard_reset
>
> on the root device for some reason?

The case I ran into was accidentally changing the host sas address
between reboots.  If the sata device had been a root device then I
would not have been able to boot the system.  But now that I think about
it, if Linux could not boot then neither could the pre-os
option-rom/efi driver.

>> After the system is booted this state can be cleared via
>> /sys/module/libsas/parameters/force_hard_reset
>
> I really don't think a module parameter for this is such a good idea ...
> it effectively promotes all soft resets to being hard ones, which can
> have a lot of unintended consequences.

Yes, it was only meant as a temporary "get out of a sticky situation"
option, but given the above pre-os-driver realization it is not even
useful for that case.  So I'm fine killing this patch.

--
Dan


* Re: [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-02-29 22:40     ` Douglas Gilbert
@ 2012-02-29 23:27       ` Dan Williams
  2012-03-01  0:23         ` Douglas Gilbert
  0 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2012-02-29 23:27 UTC (permalink / raw)
  To: dgilbert
  Cc: James Bottomley, linux-scsi, linux-ide, Jack Wang, Luben Tuikov,
	Xiangliang Yu

On Wed, Feb 29, 2012 at 2:40 PM, Douglas Gilbert <dgilbert@interlog.com> wrote:
> On 12-02-29 04:55 PM, James Bottomley wrote:
>>
>> On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
>>>
>>> It is possible for a host to get "locked out" from talking to sata
>>> devices in the domain if, for example, its sas address changes but the
>>> expander topology has existing affiliations with the old address.  If
>>> the system is booted userspace can write to
>>> /sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation, however
>>> if this condition exists for the root device the module parameter can be
>>> used to promote all ata resets to hard resets.
>
>
> A point of order: SAS has link resets and hard resets. The
> hard reset is a superset of link reset. A "link reset sequence
> serves as a hard reset for SATA devices" and hence is
> sufficient to reset a SATA device. To reset a SAS device
> (e.g. a SAS disk) you need a SAS hard reset. Therefore a link
> reset is the appropriately sized "gun" to reset a SATA device.
>
> I have a SAS-2 expander that annoyingly powers up with the
> programmed maximum physical link rate of its phys at 3 Gbps
> even though its hardware maximum rate is 6 Gbps. For expander
> phys connected to SAS-2 disks I can up the programmed maximum
> value to 6 Gbps on the expander phy then do a link reset on
> that phy. So without upsetting Linux (or any other OS) I can
> switch that path from 3 Gbps to 6 Gbps. Can't do that with a
> SATA disk without the OS finding out.

At least now (with these pending patches) if you trigger a link-reset
via the sysfs interface libsas will manage the link recovery like any
other error-recovery initiated reset.

Something like a libsas.force_max_phys_link_rate module parameter
might not be a bad idea for this scenario, since libsas sata discovery
always forces at least one reset of the disk after the phy reports
"attached sata device".

--
Dan

* Re: [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-02-29 23:27       ` Dan Williams
@ 2012-03-01  0:23         ` Douglas Gilbert
  2012-03-01  0:35           ` Dan Williams
  0 siblings, 1 reply; 29+ messages in thread
From: Douglas Gilbert @ 2012-03-01  0:23 UTC (permalink / raw)
  To: Dan Williams
  Cc: James Bottomley, linux-scsi, linux-ide, Jack Wang, Luben Tuikov,
	Xiangliang Yu

On 12-02-29 06:27 PM, Dan Williams wrote:
> On Wed, Feb 29, 2012 at 2:40 PM, Douglas Gilbert <dgilbert@interlog.com> wrote:
>> On 12-02-29 04:55 PM, James Bottomley wrote:
>>>
>>> On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
>>>>
>>>> It is possible for a host to get "locked out" from talking to sata
>>>> devices in the domain if, for example, its sas address changes but the
>>>> expander topology has existing affiliations with the old address.  If
>>>> the system is booted userspace can write to
>>>> /sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation, however
>>>> if this condition exists for the root device the module parameter can be
>>>> used to promote all ata resets to hard resets.
>>
>>
>> A point of order: SAS has link resets and hard resets. The
>> hard reset is a superset of link reset. A "link reset sequence
>> serves as a hard reset for SATA devices" and hence is
>> sufficient to reset a SATA device. To reset a SAS device
>> (e.g. a SAS disk) you need a SAS hard reset. Therefore a link
>> reset is the appropriately sized "gun" to reset a SATA device.
>>
>> I have a SAS-2 expander that annoyingly powers up with the
>> programmed maximum physical link rate of its phys at 3 Gbps
>> even though its hardware maximum rate is 6 Gbps. For expander
>> phys connected to SAS-2 disks I can up the programmed maximum
>> value to 6 Gbps on the expander phy then do a link reset on
>> that phy. So without upsetting Linux (or any other OS) I can
>> switch that path from 3 Gbps to 6 Gbps. Can't do that with a
>> SATA disk without the OS finding out.
>
> At least now (with these pending patches) if you trigger a link-reset
> via the sysfs interface libsas will manage the link recovery like any
> other error-recovery initiated reset.

I can think of 4 cases for link reset. The other end
of the link is:
   a) a SAS target: not error recovery situation
   b) a SAS expander phy: not error recovery situation
   c) a SATA device: error recovery situation
   d) a SAS initiator: not sure, probably not

Doug Gilbert
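
The four cases listed above can be written down as a small decision function. This is only a sketch of the classification in that message, not anything in libsas; the enum and function names are hypothetical, and case (d) is coded as "no" purely because the message says "probably not".

```c
#include <stdbool.h>

/* What sits at the other end of the link being reset (hypothetical names). */
enum attached_dev {
	ATTACHED_SAS_TARGET,		/* case a */
	ATTACHED_SAS_EXPANDER_PHY,	/* case b */
	ATTACHED_SATA_DEVICE,		/* case c */
	ATTACHED_SAS_INITIATOR,		/* case d */
};

/* Returns true when a link reset should be treated as an error recovery event. */
bool link_reset_needs_eh(enum attached_dev dev)
{
	switch (dev) {
	case ATTACHED_SATA_DEVICE:
		return true;
	case ATTACHED_SAS_TARGET:
	case ATTACHED_SAS_EXPANDER_PHY:
	case ATTACHED_SAS_INITIATOR:	/* "not sure, probably not" */
		return false;
	}
	return false;
}
```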


* Re: [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context
  2012-02-29 22:05   ` James Bottomley
@ 2012-03-01  0:28     ` Dan Williams
  2012-03-01 14:29       ` James Bottomley
  0 siblings, 1 reply; 29+ messages in thread
From: Dan Williams @ 2012-03-01  0:28 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, linux-ide

On Wed, Feb 29, 2012 at 2:05 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
>> The strategy handlers may be called in places that are problematic for
>> libsas (i.e. sata resets outside of domain revalidation filtering /
>> libata link recovery), or problematic for userspace (non-blocking ioctl
>> to sleeping reset functions).  However, these routines are also called
>> for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
>> as long as we are running in the host's error handler.
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>>  drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
>>  1 files changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
>> index f0b9b7b..1cabedc 100644
>> --- a/drivers/scsi/libsas/sas_scsi_host.c
>> +++ b/drivers/scsi/libsas/sas_scsi_host.c
>> @@ -463,11 +463,14 @@ EXPORT_SYMBOL_GPL(sas_get_local_phy);
>>  /* Attempt to send a LUN reset message to a device */
>>  int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
>>  {
>> -     struct domain_device *dev = cmd_to_domain_dev(cmd);
>> -     struct sas_internal *i =
>> -             to_sas_internal(dev->port->ha->core.shost->transportt);
>> -     struct scsi_lun lun;
>>       int res;
>> +     struct scsi_lun lun;
>> +     struct Scsi_Host *host = cmd->device->host;
>> +     struct domain_device *dev = cmd_to_domain_dev(cmd);
>> +     struct sas_internal *i = to_sas_internal(host->transportt);
>> +
>> +     if (current != host->ehandler)
>> +             return FAILED;
>
> Doing this will ensure that SG_SCSI_RESET now fails.
>
> I don't mind checking for O_NONBLOCK in the sg handler and failing if it
> is, but disallowing everything looks a trifle drastic.
>

The thought process here was following the lead of libata which does
not specify eh_reset handlers.  We can't permit "unmanaged" (outside
of eh) resets to hit ata devices otherwise we run the risk of a reset
turning into a link bounce / hotplug.

...and I can't take the same route as I did for the scsi_transport_sas
initiated reset since these handlers are called from both ioctl and eh
context, or can I?

Hmm, what about something like:

  if (current != host->ehandler) {
    schedule_reset_to_run_in_eh_context();
    wait_for_eh();
  } else
    do_reset();

--
Dan
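
The pseudocode in the message above can be modeled in plain userspace C to show the control flow. Nothing here is kernel code: `struct host_model`, `request_reset()` and `run_eh()` are hypothetical, thread identity is reduced to an integer id, and the "wait for eh" step is left to the caller (it simply calls `run_eh()` to stand in for the error handler waking up).

```c
/* Outcome of a reset request in this model. */
enum reset_result {
	RESET_DONE_INLINE,	/* caller was the error handler */
	RESET_DEFERRED_TO_EH,	/* queued for the error handler */
};

/* Hypothetical stand-in for the parts of a host this sketch needs. */
struct host_model {
	int ehandler_id;	/* id of the error handler thread */
	int pending_resets;	/* resets queued by non-eh callers */
	int resets_done;	/* resets actually performed */
};

/* Mirrors "if (current != host->ehandler) schedule; else do_reset()". */
enum reset_result request_reset(struct host_model *host, int caller_id)
{
	if (caller_id != host->ehandler_id) {
		host->pending_resets++;
		return RESET_DEFERRED_TO_EH;
	}
	host->resets_done++;
	return RESET_DONE_INLINE;
}

/* The error handler drains whatever was queued from other contexts. */
void run_eh(struct host_model *host)
{
	while (host->pending_resets > 0) {
		host->pending_resets--;
		host->resets_done++;
	}
}
```

In this shape every reset is performed by the error handler, which is the property the patch is after; only the question of whether the requester may block is left open.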

* Re: [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-03-01  0:23         ` Douglas Gilbert
@ 2012-03-01  0:35           ` Dan Williams
  0 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-03-01  0:35 UTC (permalink / raw)
  To: dgilbert
  Cc: James Bottomley, linux-scsi, linux-ide, Jack Wang, Luben Tuikov,
	Xiangliang Yu

On Wed, Feb 29, 2012 at 4:23 PM, Douglas Gilbert <dgilbert@interlog.com> wrote:
> On 12-02-29 06:27 PM, Dan Williams wrote:
>>
>> On Wed, Feb 29, 2012 at 2:40 PM, Douglas Gilbert <dgilbert@interlog.com> wrote:
>>>
>>> On 12-02-29 04:55 PM, James Bottomley wrote:
>>>>
>>>>
>>>> On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
>>>>>
>>>>>
>>>>> It is possible for a host to get "locked out" from talking to sata
>>>>> devices in the domain if, for example, its sas address changes but the
>>>>> expander topology has existing affiliations with the old address.  If
>>>>> the system is booted userspace can write to
>>>>> /sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation, however
>>>>> if this condition exists for the root device the module parameter can be
>>>>> used to promote all ata resets to hard resets.
>>>
>>>
>>>
>>> A point of order: SAS has link resets and hard resets. The
>>> hard reset is a superset of link reset. A "link reset sequence
>>> serves as a hard reset for SATA devices" and hence is
>>> sufficient to reset a SATA device. To reset a SAS device
>>> (e.g. a SAS disk) you need a SAS hard reset. Therefore a link
>>> reset is the appropriately sized "gun" to reset a SATA device.
>>>
>>> I have a SAS-2 expander that annoyingly powers up with the
>>> programmed maximum physical link rate of its phys at 3 Gbps
>>> even though its hardware maximum rate is 6 Gbps. For expander
>>> phys connected to SAS-2 disks I can up the programmed maximum
>>> value to 6 Gbps on the expander phy then do a link reset on
>>> that phy. So without upsetting Linux (or any other OS) I can
>>> switch that path from 3 Gbps to 6 Gbps. Can't do that with a
>>> SATA disk without the OS finding out.
>>
>>
>> At least now (with these pending patches) if you trigger a link-reset
>> via the sysfs interface libsas will manage the link recovery like any
>> other error-recovery initiated reset.
>
>
> I can think of 4 cases for link reset. The other end
> of the link is:
>  a) a SAS target: not error recovery situation
>  b) a SAS expander phy: not error recovery situation
>  c) a SATA device: error recovery situation

sas_try_ata_reset() [1] is what promotes user requested resets into
error recovery managed resets if the other end of the link is sata.

[1]: http://git.kernel.org/?p=linux/kernel/git/djbw/isci.git;a=blob;f=drivers/scsi/libsas/sas_init.c;h=57e7ac97b3e3dba3091f83a64c0c32a6660390cb;hb=refs/heads/all#l222


* Re: [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter
  2012-02-29 23:22     ` Dan Williams
@ 2012-03-01 14:27       ` James Bottomley
  0 siblings, 0 replies; 29+ messages in thread
From: James Bottomley @ 2012-03-01 14:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-scsi, linux-ide, Jack Wang, Luben Tuikov, Xiangliang Yu

On Wed, 2012-02-29 at 15:22 -0800, Dan Williams wrote:
> On Wed, Feb 29, 2012 at 1:55 PM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
> >> It is possible for a host to get "locked out" from talking to sata
> >> devices in the domain if, for example, its sas address changes but the
> >> expander topology has existing affiliations with the old address.  If
> >> the system is booted userspace can write to
> >> /sys/class/sas_phy/<phy-X>/hard_reset to clear the affiliation, however
> >> if this condition exists for the root device the module parameter can be
> >> used to promote all ata resets to hard resets.
> >
> > I don't quite understand this.  Are you saying we can't (or shouldn't)
> > execute
> >
> > /sys/class/sas_phy/<phy-X>/hard_reset
> >
> > on the root device for some reason?
> 
> The case I ran into was accidentally changing the host sas address
> between reboots.  If the sata device had been a root device then I
> would not have been able to boot the system.  But now that I think about
> it, if Linux could not boot then neither could the pre-os
> option-rom/efi driver.
> 
> >> After the system is booted this state can be cleared via
> >> /sys/module/libsas/parameters/force_hard_reset
> >
> > I really don't think a module parameter for this is such a good idea ...
> > it effectively promotes all soft resets to being hard ones, which can
> > have a lot of unintended consequences.
> 
> Yes, it was only meant as a temporary "get out of a sticky situation"
> option, but given the above pre-os-driver realization it is not even
> useful for that case.  So I'm fine killing this patch.

Great, I'll drop it, thanks.

James




* Re: [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context
  2012-03-01  0:28     ` Dan Williams
@ 2012-03-01 14:29       ` James Bottomley
  2012-03-06 19:17         ` Dan Williams
  0 siblings, 1 reply; 29+ messages in thread
From: James Bottomley @ 2012-03-01 14:29 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-scsi, linux-ide

On Wed, 2012-02-29 at 16:28 -0800, Dan Williams wrote:
> On Wed, Feb 29, 2012 at 2:05 PM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > On Fri, 2012-02-10 at 00:45 -0800, Dan Williams wrote:
> >> The strategy handlers may be called in places that are problematic for
> >> libsas (i.e. sata resets outside of domain revalidation filtering /
> >> libata link recovery), or problematic for userspace (non-blocking ioctl
> >> to sleeping reset functions).  However, these routines are also called
> >> for eh escalations and recovery of scsi_eh_prep_cmnd(), so permit them
> >> as long as we are running in the host's error handler.
> >>
> >> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >> ---
> >>  drivers/scsi/libsas/sas_scsi_host.c |   15 +++++++++++----
> >>  1 files changed, 11 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
> >> index f0b9b7b..1cabedc 100644
> >> --- a/drivers/scsi/libsas/sas_scsi_host.c
> >> +++ b/drivers/scsi/libsas/sas_scsi_host.c
> >> @@ -463,11 +463,14 @@ EXPORT_SYMBOL_GPL(sas_get_local_phy);
> >>  /* Attempt to send a LUN reset message to a device */
> >>  int sas_eh_device_reset_handler(struct scsi_cmnd *cmd)
> >>  {
> >> -     struct domain_device *dev = cmd_to_domain_dev(cmd);
> >> -     struct sas_internal *i =
> >> -             to_sas_internal(dev->port->ha->core.shost->transportt);
> >> -     struct scsi_lun lun;
> >>       int res;
> >> +     struct scsi_lun lun;
> >> +     struct Scsi_Host *host = cmd->device->host;
> >> +     struct domain_device *dev = cmd_to_domain_dev(cmd);
> >> +     struct sas_internal *i = to_sas_internal(host->transportt);
> >> +
> >> +     if (current != host->ehandler)
> >> +             return FAILED;
> >
> > Doing this will ensure that SG_SCSI_RESET now fails.
> >
> > I don't mind checking for O_NONBLOCK in the sg handler and failing if it
> > is, but disallowing everything looks a trifle drastic.
> >
> 
> The thought process here was following the lead of libata which does
> not specify eh_reset handlers.  We can't permit "unmanaged" (outside
> of eh) resets to hit ata devices otherwise we run the risk of a reset
> turning into a link bounce / hotplug.
> 
> ...and I can't take the same route as I did for the scsi_transport_sas
> initiated reset since these handlers are called from both ioctl and eh
> context, or can I?
> 
> Hmm, what about something like:
> 
>   if (current != host->ehandler) {
>     schedule_reset_to_run_in_eh_context();
>     wait_for_eh();
>   } else
>     do_reset();

I think that would work for me ... as long as the wait doesn't cause
entangled deadlocks (I can't think of any at the moment, but I'll think
a bit more deeply about it).

James




* Re: [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context
  2012-03-01 14:29       ` James Bottomley
@ 2012-03-06 19:17         ` Dan Williams
  0 siblings, 0 replies; 29+ messages in thread
From: Dan Williams @ 2012-03-06 19:17 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, linux-ide

On Thu, Mar 1, 2012 at 6:29 AM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>> Hmm, what about something like:
>>
>>   if (current != host->ehandler) {
>>     schedule_reset_to_run_in_eh_context();
>>     wait_for_eh();
>>   } else
>>     do_reset();
>
> I think that would work for me ... as long as the wait doesn't cause
> entangled deadlocks (I can't think of any at the moment, but I'll think
> a bit more deeply about it).

So there is a deadlock due to:

commit d7a1bb0a04ca835bffc0a91e64ab827dfba7d8f5
Author: James Smart <James.Smart@Emulex.Com>
Date:   Wed Mar 8 14:50:12 2006 -0500

    [SCSI] Block I/O while SG reset operation in progress - the midlayer patch

    The scsi midlayer portion of the patch

    Signed-off-by: James Smart <James.Smart@emulex.com>
    Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>

...this adds shost->tmf_in_progress to scsi_host_in_recovery(), so I
can't "wait for eh" because the exit condition for that wait is
"!scsi_host_in_recovery()".  But since sg_reset opens the device
O_NONBLOCK by default, and that is likely the only non-eh reason for
calling the eh_{bus|device}_reset_handler routines, I'll just make this
routine asynchronously queue a reset and return.
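
The deadlock described above can be reproduced in a small userspace model. This sketch assumes, as the commit title suggests, that the ioctl reset path holds tmf_in_progress for as long as the handler runs; `struct eh_model` and all function names are hypothetical, and the error handler is simulated by a direct call rather than a thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the host state involved in the deadlock. */
struct eh_model {
	bool eh_busy;		/* error handler has work outstanding */
	bool tmf_in_progress;	/* held by the ioctl reset path */
	int queued_resets;
	int resets_done;
};

/* Models scsi_host_in_recovery() after tmf_in_progress was folded in. */
bool host_in_recovery(const struct eh_model *h)
{
	return h->eh_busy || h->tmf_in_progress;
}

/* The error handler performs every queued reset, then goes idle. */
void eh_drain(struct eh_model *h)
{
	h->resets_done += h->queued_resets;
	h->queued_resets = 0;
	h->eh_busy = false;
}

/*
 * Synchronous variant: queue a reset from ioctl context, let eh finish,
 * then report whether the wait's exit condition holds. It never does,
 * because the waiter itself still holds tmf_in_progress.
 */
bool sync_reset_wait_satisfied(struct eh_model *h)
{
	h->tmf_in_progress = true;
	h->queued_resets++;
	h->eh_busy = true;
	eh_drain(h);
	return !host_in_recovery(h);
}

/* Asynchronous variant: queue the reset and return without waiting. */
void async_reset(struct eh_model *h)
{
	h->queued_resets++;
	h->eh_busy = true;
}
```

In the synchronous case the reset does get performed, but the requester can never observe the end of recovery; queueing and returning sidesteps the wait entirely.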

Thread overview: 29+ messages
2012-02-10  8:44 [PATCH v8 00/13] libsas error handling + discovery v8 Dan Williams
2012-02-10  8:44 ` [PATCH v8 01/13] libsas: close scsi_remove_target() vs libata-eh race Dan Williams
2012-02-10 18:32   ` Jeff Garzik
2012-02-10 19:09   ` Dan Williams
2012-02-10  8:44 ` [PATCH v8 02/13] libsas: improve debug statements Dan Williams
2012-02-10  8:44 ` [PATCH v8 03/13] libsas: async ata scanning Dan Williams
2012-02-10 18:34   ` Jeff Garzik
2012-02-10  8:45 ` [PATCH v8 04/13] libsas: set attached device type and target protocols for local phys Dan Williams
2012-02-10  8:45 ` [PATCH v8 05/13] libsas: fixup target_port_protocols for expanders that don't report sata Dan Williams
2012-02-10  8:45 ` [PATCH v8 06/13] libsas: don't recover end devices attached to disabled phys Dan Williams
2012-02-10  8:45 ` [PATCH v8 07/13] scsi_transport_sas: 'enable' phys on reset Dan Williams
2012-02-10  8:45 ` [PATCH v8 08/13] libsas: libsas.force_hard_reset module parameter Dan Williams
2012-02-29 21:55   ` James Bottomley
2012-02-29 22:40     ` Douglas Gilbert
2012-02-29 23:27       ` Dan Williams
2012-03-01  0:23         ` Douglas Gilbert
2012-03-01  0:35           ` Dan Williams
2012-02-29 23:22     ` Dan Williams
2012-03-01 14:27       ` James Bottomley
2012-02-10  8:45 ` [PATCH v8 09/13] libsas: enforce eh strategy handlers only in eh context Dan Williams
2012-02-29 22:05   ` James Bottomley
2012-03-01  0:28     ` Dan Williams
2012-03-01 14:29       ` James Bottomley
2012-03-06 19:17         ` Dan Williams
2012-02-10  8:45 ` [PATCH v8 10/13] libsas: add sas_eh_abort_handler Dan Williams
2012-02-10  8:45 ` [PATCH v8 11/13] libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler Dan Williams
2012-02-10  8:45 ` [PATCH v8 12/13] isci: use sas eh strategy handlers Dan Williams
2012-02-10  8:45 ` [PATCH v8 13/13] libsas: trim sas_task of slow path infrastructure Dan Williams
2012-02-15  1:09   ` Jack Wang
