* [PATCH 1/1] Fusion SAS and Fibre Channel: target missing after resetting external raid
@ 2008-03-04 16:50 Michael Reed
2008-03-04 17:22 ` Moore, Eric
0 siblings, 1 reply; 4+ messages in thread
From: Michael Reed @ 2008-03-04 16:50 UTC (permalink / raw)
To: linux-scsi, Moore, Eric, Prakash, Sathya; +Cc: Michael Reed, Jeremy Higdon
Following a hard reset of a SAS raid, one of the raid targets is occasionally
missing. I tracked this down to a pretty obscure little bug.
The LSI fusion drivers for SAS and Fibre Channel both use their respective
transport layers. Those transport layers increment the target number
assigned to new targets.
The routine __scsi_scan_target uses the "this_id" element of the Scsi_Host
structure to avoid scanning the scsi host adapter. Both fusion drivers set
"this_id" from a value returned in a firmware PortFacts response. For my
particular test case (SAS) the firmware id assigned to the initiator was
173. After enough raid resets to cause the raid targets to go and come a
sufficient number of times, the id assigned by the transport to a raid
target would match the id assigned by the host adapter to the "this_id"
field, resulting in that target not being scanned.
static void __scsi_scan_target(struct device *parent, unsigned int channel,
unsigned int id, unsigned int lun, int rescan)
{
struct Scsi_Host *shost = dev_to_shost(parent);
int bflags = 0;
int res;
struct scsi_target *starget;
if (shost->this_id == id)
/*
* Don't scan the host adapter
*/
return;
....
The fix is simple. Fusion SAS and Fibre Channel (subject to same bug) should
just leave "this_id" initialized to "-1".
Applies to 2.6.25-rc3-git5.
Signed-off-by: Michael Reed <mdr@sgi.com>
--
--- kou/drivers/message/fusion/mptfc.c 2008-01-24 16:58:37.000000000 -0600
+++ ko/drivers/message/fusion/mptfc.c 2008-03-04 09:01:18.428176326 -0600
@@ -1238,8 +1238,6 @@ mptfc_probe(struct pci_dev *pdev, const
sh->max_id = ioc->pfacts->MaxDevices;
sh->max_lun = max_lun;
- sh->this_id = ioc->pfacts[0].PortSCSIID;
-
/* Required entry.
*/
sh->unique_id = ioc->id;
--- kou/drivers/message/fusion/mptsas.c 2008-03-04 08:38:58.000000000 -0600
+++ ko/drivers/message/fusion/mptsas.c 2008-03-04 09:01:04.284807301 -0600
@@ -3176,8 +3176,6 @@ mptsas_probe(struct pci_dev *pdev, const
sh->transportt = mptsas_transport_template;
- sh->this_id = ioc->pfacts[0].PortSCSIID;
-
/* Required entry.
*/
sh->unique_id = ioc->id;
^ permalink raw reply [flat|nested] 4+ messages in thread* RE: [PATCH 1/1] Fusion SAS and Fibre Channel: target missing after resetting external raid 2008-03-04 16:50 [PATCH 1/1] Fusion SAS and Fibre Channel: target missing after resetting external raid Michael Reed @ 2008-03-04 17:22 ` Moore, Eric 2008-03-13 19:53 ` Michael Reed 0 siblings, 1 reply; 4+ messages in thread From: Moore, Eric @ 2008-03-04 17:22 UTC (permalink / raw) To: Michael Reed, linux-scsi, Prakash, Sathya; +Cc: Jeremy Higdon On Tuesday, March 04, 2008 9:51 AM, Michael Reed wrote: > Subject: [PATCH 1/1] Fusion SAS and Fibre Channel: target > missing after resetting external raid > > Following a hard reset of a SAS raid, one of the raid targets > is occasionally > missing. I tracked this down to a pretty obscure little bug. > > The LSI fusion drivers for SAS and Fibre Channel both use > their respective > transport layers. Those transport layers increment the target number > assigned to new targets. > > The routine __scsi_scan_target uses the "this_id" element of > the Scsi_Host > structure to avoid scanning the scsi host adapter. Both > fusion drivers set > "this_id" from a value returned in a firmware PortFacts > response. For my > particular test case (SAS) the firmware id assigned to the > initiator was > 173. After enough raid resets to cause the raid targets to > go and come a > sufficient number of times, the id assigned by the transport to a raid > target would match the id assigned by the host adapter to the > "this_id" > field, resulting in that target not being scanned. > > static void __scsi_scan_target(struct device *parent, > unsigned int channel, > unsigned int id, unsigned int lun, int rescan) > { > struct Scsi_Host *shost = dev_to_shost(parent); > int bflags = 0; > int res; > struct scsi_target *starget; > > if (shost->this_id == id) > /* > * Don't scan the host adapter > */ > return; > > .... > > The fix is simple. Fusion SAS and Fibre Channel (subject to > same bug) should > just leave "this_id" initialized to "-1". > > Applies to 2.6.25-rc3-git5. > > Signed-off-by: Michael Reed <mdr@sgi.com> > > -- > > --- kou/drivers/message/fusion/mptfc.c 2008-01-24 > 16:58:37.000000000 -0600 > +++ ko/drivers/message/fusion/mptfc.c 2008-03-04 > 09:01:18.428176326 -0600 > @@ -1238,8 +1238,6 @@ mptfc_probe(struct pci_dev *pdev, const > sh->max_id = ioc->pfacts->MaxDevices; > sh->max_lun = max_lun; > > - sh->this_id = ioc->pfacts[0].PortSCSIID; > - > /* Required entry. > */ > sh->unique_id = ioc->id; > --- kou/drivers/message/fusion/mptsas.c 2008-03-04 > 08:38:58.000000000 -0600 > +++ ko/drivers/message/fusion/mptsas.c 2008-03-04 > 09:01:04.284807301 -0600 > @@ -3176,8 +3176,6 @@ mptsas_probe(struct pci_dev *pdev, const > > sh->transportt = mptsas_transport_template; > > - sh->this_id = ioc->pfacts[0].PortSCSIID; > - > /* Required entry. > */ > sh->unique_id = ioc->id; > > This looks good. I had deleted setting this_id in mptsas internal sources long ago. In addition to this change, we need to fix mptscsih_slave_configure so it doesn't set the queue depth to 1 for SAS protocal when sdev->id is greater than sh->max_id. The sas transport layer assigns the target ids, incrementing with each hotplug add, with large topologies, it doesn't take long to hit this threshold. Eric ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH 1/1] Fusion SAS and Fibre Channel: target missing after resetting external raid 2008-03-04 17:22 ` Moore, Eric @ 2008-03-13 19:53 ` Michael Reed 2008-03-20 22:46 ` Moore, Eric 0 siblings, 1 reply; 4+ messages in thread From: Michael Reed @ 2008-03-13 19:53 UTC (permalink / raw) To: Moore, Eric; +Cc: linux-scsi, Prakash, Sathya, Jeremy Higdon, Hannes Reinecke Moore, Eric wrote: >>On Tuesday, March 04, 2008 9:51 AM, Michael Reed wrote: <snip> >> >> The fix is simple. Fusion SAS and Fibre Channel (subject to >> same bug) should just leave "this_id" initialized to "-1". >> >> Applies to 2.6.25-rc3-git5. >> >> Signed-off-by: Michael Reed <mdr@sgi.com> >> >> -- >> <snip> > > This looks good. I had deleted setting this_id in mptsas internal > sources long ago. In addition to this change, we need to fix > mptscsih_slave_configure so it doesn't set the queue depth to 1 for SAS > protocal when sdev->id is greater than sh->max_id. The sas transport > layer assigns the target ids, incrementing with each hotplug add, with > large topologies, it doesn't take long to hit this threshold. > > Eric Adjusting the patch to reflect Eric's comments. The change WRT max_id is from Hannes. FC should also be subject to the same issue. Signed-off-by: Michael Reed <mdr@sgi.com> --- gitu/drivers/message/fusion/mptfc.c 2008-01-24 14:58:37.000000000 -0800 +++ git/drivers/message/fusion/mptfc.c 2008-03-13 12:37:52.176015612 -0700 @@ -1238,8 +1238,6 @@ mptfc_probe(struct pci_dev *pdev, const sh->max_id = ioc->pfacts->MaxDevices; sh->max_lun = max_lun; - sh->this_id = ioc->pfacts[0].PortSCSIID; - /* Required entry. */ sh->unique_id = ioc->id; --- gitu/drivers/message/fusion/mptsas.c 2008-03-10 13:23:46.000000000 -0700 +++ git/drivers/message/fusion/mptsas.c 2008-03-13 12:38:01.912440308 -0700 @@ -3181,8 +3181,6 @@ mptsas_probe(struct pci_dev *pdev, const sh->transportt = mptsas_transport_template; - sh->this_id = ioc->pfacts[0].PortSCSIID; - /* Required entry. */ sh->unique_id = ioc->id; --- gitu/drivers/message/fusion/mptscsih.c 2008-03-10 13:20:30.000000000 -0700 +++ git/drivers/message/fusion/mptscsih.c 2008-03-13 12:38:38.070017332 -0700 @@ -2442,12 +2442,6 @@ mptscsih_slave_configure(struct scsi_dev ioc->name, sdev->sdtr, sdev->wdtr, sdev->ppr, sdev->inquiry_len)); - if (sdev->id > sh->max_id) { - /* error case, should never happen */ - scsi_adjust_queue_depth(sdev, 0, 1); - goto slave_configure_exit; - } - vdevice->configured_lun = 1; mptscsih_change_queue_depth(sdev, MPT_SCSI_CMD_PER_DEV_HIGH); @@ -2461,8 +2455,6 @@ mptscsih_slave_configure(struct scsi_dev ioc->name, vtarget->negoFlags, vtarget->maxOffset, vtarget->minSyncFactor)); -slave_configure_exit: - dsprintk(ioc, printk(MYIOC_s_DEBUG_FMT "tagged %d, simple %d, ordered %d\n", ioc->name,sdev->tagged_supported, sdev->simple_tags, ^ permalink raw reply [flat|nested] 4+ messages in thread
* RE: [PATCH 1/1] Fusion SAS and Fibre Channel: target missing after resetting external raid 2008-03-13 19:53 ` Michael Reed @ 2008-03-20 22:46 ` Moore, Eric 0 siblings, 0 replies; 4+ messages in thread From: Moore, Eric @ 2008-03-20 22:46 UTC (permalink / raw) To: Michael Reed; +Cc: linux-scsi, Prakash, Sathya, Jeremy Higdon, Hannes Reinecke On Thursday, March 13, 2008 1:54 PM, Michael Reed wrote: > Adjusting the patch to reflect Eric's comments. The change WRT > max_id is from Hannes. FC should also be subject to the same issue. > > Signed-off-by: Michael Reed <mdr@sgi.com> > > --- gitu/drivers/message/fusion/mptfc.c 2008-01-24 > 14:58:37.000000000 -0800 > +++ git/drivers/message/fusion/mptfc.c 2008-03-13 > 12:37:52.176015612 -0700 > @@ -1238,8 +1238,6 @@ mptfc_probe(struct pci_dev *pdev, const > sh->max_id = ioc->pfacts->MaxDevices; > sh->max_lun = max_lun; > > - sh->this_id = ioc->pfacts[0].PortSCSIID; > - > /* Required entry. > */ > sh->unique_id = ioc->id; > --- gitu/drivers/message/fusion/mptsas.c 2008-03-10 > 13:23:46.000000000 -0700 > +++ git/drivers/message/fusion/mptsas.c 2008-03-13 > 12:38:01.912440308 -0700 > @@ -3181,8 +3181,6 @@ mptsas_probe(struct pci_dev *pdev, const > > sh->transportt = mptsas_transport_template; > > - sh->this_id = ioc->pfacts[0].PortSCSIID; > - > /* Required entry. > */ > sh->unique_id = ioc->id; > --- gitu/drivers/message/fusion/mptscsih.c 2008-03-10 > 13:20:30.000000000 -0700 > +++ git/drivers/message/fusion/mptscsih.c 2008-03-13 > 12:38:38.070017332 -0700 > @@ -2442,12 +2442,6 @@ mptscsih_slave_configure(struct scsi_dev > ioc->name, sdev->sdtr, sdev->wdtr, > sdev->ppr, sdev->inquiry_len)); > > - if (sdev->id > sh->max_id) { > - /* error case, should never happen */ > - scsi_adjust_queue_depth(sdev, 0, 1); > - goto slave_configure_exit; > - } > - > vdevice->configured_lun = 1; > mptscsih_change_queue_depth(sdev, MPT_SCSI_CMD_PER_DEV_HIGH); > > @@ -2461,8 +2455,6 @@ mptscsih_slave_configure(struct scsi_dev > ioc->name, vtarget->negoFlags, vtarget->maxOffset, > vtarget->minSyncFactor)); > > -slave_configure_exit: > - > dsprintk(ioc, printk(MYIOC_s_DEBUG_FMT > "tagged %d, simple %d, ordered %d\n", > ioc->name,sdev->tagged_supported, sdev->simple_tags, > ACK ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-03-20 22:47 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-03-04 16:50 [PATCH 1/1] Fusion SAS and Fibre Channel: target missing after resetting external raid Michael Reed 2008-03-04 17:22 ` Moore, Eric 2008-03-13 19:53 ` Michael Reed 2008-03-20 22:46 ` Moore, Eric
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.