From mboxrd@z Thu Jan 1 00:00:00 1970 From: Mike Christie Subject: Re: [PATCH] fc_user_scan correction Date: Tue, 22 Apr 2008 13:11:54 -0500 Message-ID: <480E2A6A.4040309@cs.wisc.edu> References: <1208885319.12659.7.camel@localhost.localdomain> <480E29FC.3090502@cs.wisc.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: Received: from sabe.cs.wisc.edu ([128.105.6.20]:54337 "EHLO sabe.cs.wisc.edu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755497AbYDVSMF (ORCPT ); Tue, 22 Apr 2008 14:12:05 -0400 In-Reply-To: <480E29FC.3090502@cs.wisc.edu> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: James.Smart@Emulex.Com Cc: linux-scsi@vger.kernel.org Mike Christie wrote: > James Smart wrote: >> Way back when, when the fc_user_scan routine was created, it kept some >> of its original logic that walked the rport list and kicked off a scan. >> Unfortunately, it didn't keep any of the locking around the rport list, >> nor did it consider the synchronous nature of the scan invoked. The >> result, >> there are some scan requests where the rport list changes, thus a >> subsequent >> scan is called on a bogus rport structure and the system NMI's. >> >> The fix is to defer to the midlayer scan routine and not deal with >> locking >> in the user scan function. >> >> Patch was cut against scsi-misc-2.6 >> >> -- james s >> >> >> Signed-off-by: James Smart >> >> --- >> >> scsi_scan.c | 1 + >> scsi_transport_fc.c | 21 +-------------------- >> 2 files changed, 2 insertions(+), 20 deletions(-) >> >> >> diff -upNr a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c >> --- a/drivers/scsi/scsi_scan.c 2008-04-22 13:09:23.000000000 -0400 >> +++ b/drivers/scsi/scsi_scan.c 2008-04-22 13:10:02.000000000 -0400 >> @@ -1669,6 +1669,7 @@ int scsi_scan_host_selected(struct Scsi_ >> >> return 0; >> } >> +EXPORT_SYMBOL(scsi_scan_host_selected); >> >> static void scsi_sysfs_add_devices(struct Scsi_Host *shost) >> { >> diff -upNr a/drivers/scsi/scsi_transport_fc.c >> b/drivers/scsi/scsi_transport_fc.c >> --- a/drivers/scsi/scsi_transport_fc.c 2008-04-22 >> 12:51:22.000000000 -0400 >> +++ b/drivers/scsi/scsi_transport_fc.c 2008-04-22 >> 12:53:30.000000000 -0400 >> @@ -1929,29 +1929,10 @@ fc_timed_out(struct scsi_cmnd *scmd) >> return EH_NOT_HANDLED; >> } >> >> -/* >> - * Must be called with shost->host_lock held >> - */ >> static int fc_user_scan(struct Scsi_Host *shost, uint channel, >> uint id, uint lun) >> { >> - struct fc_rport *rport; >> - >> - list_for_each_entry(rport, &fc_host_rports(shost), peers) { >> - if (rport->scsi_target_id == -1) >> - continue; >> - >> - if (rport->port_state != FC_PORTSTATE_ONLINE) >> - continue; >> - >> - if ((channel == SCAN_WILD_CARD || channel == rport->channel) && >> - (id == SCAN_WILD_CARD || id == rport->scsi_target_id)) { >> - scsi_scan_target(&rport->dev, rport->channel, >> - rport->scsi_target_id, lun, 1); >> - } >> - } >> - >> - return 0; >> + return scsi_scan_host_selected(shost, channel, id, lun, 1); >> } > > Is this is the same as if you did not implement the user_scan callout? > scsi_sysfs.c will call > > scsi_scan_host_selected(shost, channel, id, lun, 1); > > I thought we added the user_scan callback because the transport classes > had to pass in the device struct between the host and target so we got > > .../host/rport/target/scsi_device > > instead of > > .../host/target/scsi_device > > qla4xxx has the same problem. Do not look at it for help :( It added a > mutex and does not deadlock because like the FC class it stats the > removal of the rport/session then device so the cache sync always fails > (the check ready function always returns DID_NO_CONNECT so the cache > sync fails). iscsi tcp/iser/bnx2i works because it has userspace helping > out with the removal and shutdown and does it in two stages. > I think we need some loop + locking + refcounting similar to how the shost_for_each_device loops over devices.