linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 06/21] lpfc: Fix host reset escalation killing all IOs.
@ 2015-02-05 19:23 James Smart
  2015-03-07 16:56 ` James Bottomley
  0 siblings, 1 reply; 4+ messages in thread
From: James Smart @ 2015-02-05 19:23 UTC (permalink / raw)
  To: linux-scsi


---
 drivers/scsi/lpfc/lpfc_crtn.h |  1 +
 drivers/scsi/lpfc/lpfc_init.c | 11 ++++++++---
 drivers/scsi/lpfc/lpfc_scsi.c | 25 +++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
index 00665a5..665c88c 100644
--- a/drivers/scsi/lpfc/lpfc_crtn.h
+++ b/drivers/scsi/lpfc/lpfc_crtn.h
@@ -354,6 +354,7 @@ void lpfc_free_sysfs_attr(struct lpfc_vport *);
 extern struct device_attribute *lpfc_hba_attrs[];
 extern struct device_attribute *lpfc_vport_attrs[];
 extern struct scsi_host_template lpfc_template;
+extern struct scsi_host_template lpfc_template_s3;
 extern struct scsi_host_template lpfc_vport_template;
 extern struct fc_function_template lpfc_transport_functions;
 extern struct fc_function_template lpfc_vport_transport_functions;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 4ba91af..74672e0 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -3257,12 +3257,17 @@ lpfc_create_port(struct lpfc_hba *phba, int instance, struct device *dev)
 	struct Scsi_Host  *shost;
 	int error = 0;
 
-	if (dev != &phba->pcidev->dev)
+	if (dev != &phba->pcidev->dev) {
 		shost = scsi_host_alloc(&lpfc_vport_template,
 					sizeof(struct lpfc_vport));
-	else
-		shost = scsi_host_alloc(&lpfc_template,
+	} else {
+		if (phba->sli_rev == LPFC_SLI_REV4)
+			shost = scsi_host_alloc(&lpfc_template,
 					sizeof(struct lpfc_vport));
+		else
+			shost = scsi_host_alloc(&lpfc_template_s3,
+					sizeof(struct lpfc_vport));
+	}
 	if (!shost)
 		goto out;
 
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 4f9222e..f623299 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -5857,6 +5857,31 @@ lpfc_disable_oas_lun(struct lpfc_hba *phba, struct lpfc_name *vport_wwpn,
 	return false;
 }
 
+struct scsi_host_template lpfc_template_s3 = {
+	.module			= THIS_MODULE,
+	.name			= LPFC_DRIVER_NAME,
+	.info			= lpfc_info,
+	.queuecommand		= lpfc_queuecommand,
+	.eh_abort_handler	= lpfc_abort_handler,
+	.eh_device_reset_handler = lpfc_device_reset_handler,
+	.eh_target_reset_handler = lpfc_target_reset_handler,
+	.eh_bus_reset_handler	= lpfc_bus_reset_handler,
+	.slave_alloc		= lpfc_slave_alloc,
+	.slave_configure	= lpfc_slave_configure,
+	.slave_destroy		= lpfc_slave_destroy,
+	.scan_finished		= lpfc_scan_finished,
+	.this_id		= -1,
+	.sg_tablesize		= LPFC_DEFAULT_SG_SEG_CNT,
+	.cmd_per_lun		= LPFC_CMD_PER_LUN,
+	.use_clustering		= ENABLE_CLUSTERING,
+	.shost_attrs		= lpfc_hba_attrs,
+	.max_sectors		= 0xFFFF,
+	.vendor_id		= LPFC_NL_VENDOR_ID,
+	.change_queue_depth	= scsi_change_queue_depth,
+	.use_blk_tags		= 1,
+	.track_queue_depth	= 1,
+};
+
 struct scsi_host_template lpfc_template = {
 	.module			= THIS_MODULE,
 	.name			= LPFC_DRIVER_NAME,
-- 
1.7.11.7

Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: Dick Kennedy <dick.kennedy@emulex.com>




^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 06/21] lpfc: Fix host reset escalation killing all IOs.
  2015-02-05 19:23 James Smart
@ 2015-03-07 16:56 ` James Bottomley
  0 siblings, 0 replies; 4+ messages in thread
From: James Bottomley @ 2015-03-07 16:56 UTC (permalink / raw)
  To: james.smart; +Cc: linux-scsi

On Thu, 2015-02-05 at 14:23 -0500, James Smart wrote:
> ---

This one really requires more than a one-line explanation because the
one liner doesn't really describe what you've done.

It looks like the description should be:

Only the SLI rev 4 card has a host reset capability, so only activate it
in the host template for that card.

Which leads me to two questions:

     1. Is the if condition the right way around?  It seems from the
        previous code that SLI rev < 4 do support a host reset handler.
        At the very least, if the if clause is correct, explain how it
        relates to 27b01b8 [SCSI] lpfc 8.3.31: Fixed system crash due to
        not providing SCSI error-handling host reset handler.
     2. Is this only a problem of SLI rev 4, so == is justified, or
        should it be 4 and above?

James



^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH 06/21] lpfc: Fix host reset escalation killing all IOs.
@ 2015-04-03 21:11 James Smart
  2015-04-10  6:10 ` Hannes Reinecke
  0 siblings, 1 reply; 4+ messages in thread
From: James Smart @ 2015-04-03 21:11 UTC (permalink / raw)
  To: linux-scsi

Fix host reset escalation killing all IOs.

SLI-3 adapters will use a new host template. The template differs
from SLI-4 adapters in that it does not have an eh_host_reset_handler.

Lpfc has traditionally never had a host_reset. The host reset
handler was added when we ran into a stuck hardware condition on a
SLI-4 adapter. The host_reset will reset and reinit the pci function,
clearing the hardware condition.

Unfortunately, the host reset handler uses attach/detach code paths,
which makes scsi_add_host() and scsi_remove_host() calls. Meaning, a
host_reset will completely remove the scsi_host from the system. As a
new call to scsi_add_host() is made, the shost# changes, which results
in completely new scsi_devices and device names. All the older scsi
devices on the old shost# are now orphaned and unrecoverable.

We realize we need to re-implement the host_reset_handler so the scsi_host
stays registered across the host_reset, but that will be a rather
lengthy effort. In the short term, we had an immediate need to restore
the SLI-3 devices to their working behavior, with the easiest path being
to remove their host_reset handler.

Signed-off-by: Dick Kennedy <dick.kennedy@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
---
 drivers/scsi/lpfc/lpfc_crtn.h |  1 +
 drivers/scsi/lpfc/lpfc_init.c | 11 ++++++++---
 drivers/scsi/lpfc/lpfc_scsi.c | 25 +++++++++++++++++++++++++
 3 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h
index 00665a5..665c88c 100644
--- a/drivers/scsi/lpfc/lpfc_crtn.h
+++ b/drivers/scsi/lpfc/lpfc_crtn.h
@@ -354,6 +354,7 @@ void lpfc_free_sysfs_attr(struct lpfc_vport *);
 extern struct device_attribute *lpfc_hba_attrs[];
 extern struct device_attribute *lpfc_vport_attrs[];
 extern struct scsi_host_template lpfc_template;
+extern struct scsi_host_template lpfc_template_s3;
 extern struct scsi_host_template lpfc_vport_template;
 extern struct fc_function_template lpfc_transport_functions;
 extern struct fc_function_template lpfc_vport_transport_functions;
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c
index 4ba91af..74672e0 100644
--- a/drivers/scsi/lpfc/lpfc_init.c
+++ b/drivers/scsi/lpfc/lpfc_init.c
@@ -3257,12 +3257,17 @@ lpfc_create_port(struct lpfc_hba *phba, int instance, struct device *dev)
 	struct Scsi_Host  *shost;
 	int error = 0;
 
-	if (dev != &phba->pcidev->dev)
+	if (dev != &phba->pcidev->dev) {
 		shost = scsi_host_alloc(&lpfc_vport_template,
 					sizeof(struct lpfc_vport));
-	else
-		shost = scsi_host_alloc(&lpfc_template,
+	} else {
+		if (phba->sli_rev == LPFC_SLI_REV4)
+			shost = scsi_host_alloc(&lpfc_template,
 					sizeof(struct lpfc_vport));
+		else
+			shost = scsi_host_alloc(&lpfc_template_s3,
+					sizeof(struct lpfc_vport));
+	}
 	if (!shost)
 		goto out;
 
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index 4f9222e..f623299 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -5857,6 +5857,31 @@ lpfc_disable_oas_lun(struct lpfc_hba *phba, struct lpfc_name *vport_wwpn,
 	return false;
 }
 
+struct scsi_host_template lpfc_template_s3 = {
+	.module			= THIS_MODULE,
+	.name			= LPFC_DRIVER_NAME,
+	.info			= lpfc_info,
+	.queuecommand		= lpfc_queuecommand,
+	.eh_abort_handler	= lpfc_abort_handler,
+	.eh_device_reset_handler = lpfc_device_reset_handler,
+	.eh_target_reset_handler = lpfc_target_reset_handler,
+	.eh_bus_reset_handler	= lpfc_bus_reset_handler,
+	.slave_alloc		= lpfc_slave_alloc,
+	.slave_configure	= lpfc_slave_configure,
+	.slave_destroy		= lpfc_slave_destroy,
+	.scan_finished		= lpfc_scan_finished,
+	.this_id		= -1,
+	.sg_tablesize		= LPFC_DEFAULT_SG_SEG_CNT,
+	.cmd_per_lun		= LPFC_CMD_PER_LUN,
+	.use_clustering		= ENABLE_CLUSTERING,
+	.shost_attrs		= lpfc_hba_attrs,
+	.max_sectors		= 0xFFFF,
+	.vendor_id		= LPFC_NL_VENDOR_ID,
+	.change_queue_depth	= scsi_change_queue_depth,
+	.use_blk_tags		= 1,
+	.track_queue_depth	= 1,
+};
+
 struct scsi_host_template lpfc_template = {
 	.module			= THIS_MODULE,
 	.name			= LPFC_DRIVER_NAME,
-- 
1.7.11.7





^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH 06/21] lpfc: Fix host reset escalation killing all IOs.
  2015-04-03 21:11 [PATCH 06/21] lpfc: Fix host reset escalation killing all IOs James Smart
@ 2015-04-10  6:10 ` Hannes Reinecke
  0 siblings, 0 replies; 4+ messages in thread
From: Hannes Reinecke @ 2015-04-10  6:10 UTC (permalink / raw)
  To: james.smart, linux-scsi

On 04/03/2015 11:11 PM, James Smart wrote:
> Fix host reset escalation killing all IOs.
> 
> SLI-3 adapters will use a new host template. The template differs
> from SLI-4 adapters in that it does not have an eh_host_reset_handler.
> 
> Lpfc has traditionally never had a host_reset. The host reset
> handler was added when we ran into a stuck hardware condition on a
> SLI-4 adapter. The host_reset will reset and reinit the pci function,
> clearing the hardware condition.
> 
> Unfortunately, the host reset handler uses attach/detach code paths,
> which makes scsi_add_host() and scsi_remove_host() calls. Meaning, a
> host_reset will completely remove the scsi_host from the system. As a
> new call to scsi_add_host() is made, the shost# changes, which results
> in completely new scsi_devices and device names. All the older scsi
> devices on the old shost# are now orphaned and unrecoverable.
> 
> We realize we need to re-implement the host_reset_handler so the scsi_host
> stays registered across the host_reset, but that will be a rather
> lengthy effort. In the short term, we had an immediate need to restore
> the SLI-3 devices to their working behavior, with the easiest path being
> to remove their host_reset handler.
> 
> Signed-off-by: Dick Kennedy <dick.kennedy@emulex.com>
> Signed-off-by: James Smart <james.smart@emulex.com>
> ---

Of course, this will disable the eh_deadline mechanism, which relies
on the host_reset to be present.

Of which you are perfectly aware :-)

So, grudgingly:

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-04-10  6:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-03 21:11 [PATCH 06/21] lpfc: Fix host reset escalation killing all IOs James Smart
2015-04-10  6:10 ` Hannes Reinecke
  -- strict thread matches above, loose matches on Subject: below --
2015-02-05 19:23 James Smart
2015-03-07 16:56 ` James Bottomley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).