public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: David Milburn <dmilburn@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: james.bottomley@suse.de, Haipao Fan <haipao.fan@intel.com>,
	dave.jiang@intel.com, linux-scsi@vger.kernel.org,
	jacek.danecki@intel.com, Maciej Trela <maciej.trela@intel.com>,
	jeffrey.d.skirvin@intel.com, edmund.nadolski@intel.com,
	"Darrick J. Wong" <djwong@us.ibm.com>
Subject: Re: [PATCH] libsas: fix/amend device gone notification in	sas_deform_port()
Date: Thu, 17 Feb 2011 17:36:16 -0600	[thread overview]
Message-ID: <4D5DB0F0.4060006@redhat.com> (raw)
In-Reply-To: <20110217030659.4303.12762.stgit@localhost6.localdomain6>

Dan Williams wrote:
> Commit 56dd2c06 "libsas: Don't issue commands to devices that have been
> hot-removed" edited Darrick's original patch to remove setting 'gone' in
> the sas_deform_port() path because that prevented scsi sync cache
> commands from being issued when the driver was unloaded.  However, this
> allows true device gone notifications (as signaled port phy events) to
> trigger sync cache commands to devices that are known to be unreachable.
> 
> Teach libsas which sas_deform_port() invocations are likely device gone
> events.
> 
> This patch also introduces sas_device_gone() which hopefully allows
> subtle/tricky locking to be dropped from lldd drivers, like the
> following in mvsas which is broken if the ata path is ever converted to
> call lldd_execute_task() with irqs enabled:
> 
> 	flags_libsas = 0;
> 	[...]
> 	spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags_libsas);
> 	spin_unlock_irqrestore(&mvi->lock, flags);
> 	t->task_done(t);
> 	spin_lock_irqsave(&mvi->lock, flags);
> 	spin_lock_irqsave(dev->sata_dev.ap->lock, flags_libsas);
> 
> Cc: Darrick J. Wong <djwong@us.ibm.com>
> Cc: Haipao Fan <haipao.fan@intel.com>
> Cc: Maciej Trela <maciej.trela@intel.com>
> Cc: David Milburn <dmilburn@redhat.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> 
> The sas_device_gone() change could be split into its own patch for bisection
> purposes, let me know if you want me to resubmit.  This is being driven by a
> lockdep error in isci as it calls task->done() from lldd_execute_task()
> when it finds lldd_dev == NULL for the ata domain_device.

Dan,

So, you are seeing sas_ata_task_done() trying to get the ata_port lock,
but it has already been taken earlier in the code path?

ata_scsi_queuecmd (spin_lock_irqsave(ap->lock, irq_flags)
   __ata_scsi_queuecmd
     ata_scsi_translate
       ata_qc_issue
         sas_ata_qc_issue
           isci_task_execute_task
             isci_task_complete_for_upper_layer
               sas_ata_task_done 
(spin_lock_irqsave(dev->sata_dev.ap->lock, flags)

Thanks,
David


> 
> --
> Dan
> 
>  drivers/scsi/libsas/sas_expander.c |   20 +++++++++++++++++---
>  drivers/scsi/libsas/sas_internal.h |    3 ++-
>  drivers/scsi/libsas/sas_phy.c      |    4 ++--
>  drivers/scsi/libsas/sas_port.c     |   21 ++++++++++++---------
>  4 files changed, 33 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
> index 505ffe3..7a785df 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -1721,13 +1721,27 @@ out:
>  	return res;
>  }
>  
> +/* FIXME: delete this routine when/if sas_ata stops submitting tasks
> + * with host_lock held
> + */
> +void sas_device_gone(struct domain_device *dev)
> +{
> +	struct sas_rphy *rphy = dev->rphy;
> +	struct Scsi_Host *shost = dev_to_shost(rphy->dev.parent);
> +
> +	/* take the lock to synchronize against incoming sata i/o */
> +	spin_lock_irq(shost->host_lock);
> +	dev->gone = 1;
> +	spin_unlock_irq(shost->host_lock);
> +}
> +
>  static void sas_unregister_ex_tree(struct domain_device *dev)
>  {
>  	struct expander_device *ex = &dev->ex_dev;
>  	struct domain_device *child, *n;
>  
>  	list_for_each_entry_safe(child, n, &ex->children, siblings) {
> -		child->gone = 1;
> +		sas_device_gone(child);
>  		if (child->dev_type == EDGE_DEV ||
>  		    child->dev_type == FANOUT_DEV)
>  			sas_unregister_ex_tree(child);
> @@ -1748,7 +1762,7 @@ static void sas_unregister_devs_sas_addr(struct domain_device *parent,
>  			&ex_dev->children, siblings) {
>  			if (SAS_ADDR(child->sas_addr) ==
>  			    SAS_ADDR(phy->attached_sas_addr)) {
> -				child->gone = 1;
> +				sas_device_gone(child);
>  				if (child->dev_type == EDGE_DEV ||
>  				    child->dev_type == FANOUT_DEV)
>  					sas_unregister_ex_tree(child);
> @@ -1757,7 +1771,7 @@ static void sas_unregister_devs_sas_addr(struct domain_device *parent,
>  				break;
>  			}
>  		}
> -		parent->gone = 1;
> +		sas_device_gone(parent);
>  		sas_disable_routing(parent, phy->attached_sas_addr);
>  	}
>  	memset(phy->attached_sas_addr, 0, SAS_ADDR_SIZE);
> diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
> index 2d5c460..4db2752 100644
> --- a/drivers/scsi/libsas/sas_internal.h
> +++ b/drivers/scsi/libsas/sas_internal.h
> @@ -61,7 +61,7 @@ int  sas_init_queue(struct sas_ha_struct *sas_ha);
>  int  sas_init_events(struct sas_ha_struct *sas_ha);
>  void sas_shutdown_queue(struct sas_ha_struct *sas_ha);
>  
> -void sas_deform_port(struct asd_sas_phy *phy);
> +void sas_deform_port(struct asd_sas_phy *phy, int gone);
>  
>  void sas_porte_bytes_dmaed(struct work_struct *work);
>  void sas_porte_broadcast_rcvd(struct work_struct *work);
> @@ -71,6 +71,7 @@ void sas_porte_hard_reset(struct work_struct *work);
>  
>  int sas_notify_lldd_dev_found(struct domain_device *);
>  void sas_notify_lldd_dev_gone(struct domain_device *);
> +void sas_device_gone(struct domain_device *dev);
>  
>  int sas_smp_phy_control(struct domain_device *dev, int phy_id,
>  			enum phy_func phy_func, struct sas_phy_linkrates *);
> diff --git a/drivers/scsi/libsas/sas_phy.c b/drivers/scsi/libsas/sas_phy.c
> index b459c4b..e0f5018 100644
> --- a/drivers/scsi/libsas/sas_phy.c
> +++ b/drivers/scsi/libsas/sas_phy.c
> @@ -39,7 +39,7 @@ static void sas_phye_loss_of_signal(struct work_struct *work)
>  	sas_begin_event(PHYE_LOSS_OF_SIGNAL, &phy->ha->event_lock,
>  			&phy->phy_events_pending);
>  	phy->error = 0;
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  static void sas_phye_oob_done(struct work_struct *work)
> @@ -66,7 +66,7 @@ static void sas_phye_oob_error(struct work_struct *work)
>  	sas_begin_event(PHYE_OOB_ERROR, &phy->ha->event_lock,
>  			&phy->phy_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  
>  	if (!port && phy->enabled && i->dft->lldd_control_phy) {
>  		phy->error++;
> diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
> index 5257fdf..be490e4 100644
> --- a/drivers/scsi/libsas/sas_port.c
> +++ b/drivers/scsi/libsas/sas_port.c
> @@ -57,7 +57,7 @@ static void sas_form_port(struct asd_sas_phy *phy)
>  
>  	if (port) {
>  		if (!phy_is_wideport_member(port, phy))
> -			sas_deform_port(phy);
> +			sas_deform_port(phy, 0);
>  		else {
>  			SAS_DPRINTK("%s: phy%d belongs to port%d already(%d)!\n",
>  				    __func__, phy->id, phy->port->id,
> @@ -153,28 +153,31 @@ static void sas_form_port(struct asd_sas_phy *phy)
>   * This is called when the physical link to the other phy has been
>   * lost (on this phy), in Event thread context. We cannot delay here.
>   */
> -void sas_deform_port(struct asd_sas_phy *phy)
> +void sas_deform_port(struct asd_sas_phy *phy, int gone)
>  {
>  	struct sas_ha_struct *sas_ha = phy->ha;
>  	struct asd_sas_port *port = phy->port;
>  	struct sas_internal *si =
>  		to_sas_internal(sas_ha->core.shost->transportt);
> +	struct domain_device *dev;
>  	unsigned long flags;
>  
>  	if (!port)
>  		return;		  /* done by a phy event */
>  
> -	if (port->port_dev)
> -		port->port_dev->pathways--;
> +	dev = port->port_dev;
> +	if (dev)
> +		dev->pathways--;
>  
>  	if (port->num_phys == 1) {
> +		if (dev && gone)
> +			sas_device_gone(dev);
>  		sas_unregister_domain_devices(port);
>  		sas_port_delete(port->port);
>  		port->port = NULL;
>  	} else
>  		sas_port_delete_phy(port->port, phy->phy);
>  
> -
>  	if (si->dft->lldd_port_deformed)
>  		si->dft->lldd_port_deformed(phy);
>  
> @@ -244,7 +247,7 @@ void sas_porte_link_reset_err(struct work_struct *work)
>  	sas_begin_event(PORTE_LINK_RESET_ERR, &phy->ha->event_lock,
>  			&phy->port_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  void sas_porte_timer_event(struct work_struct *work)
> @@ -256,7 +259,7 @@ void sas_porte_timer_event(struct work_struct *work)
>  	sas_begin_event(PORTE_TIMER_EVENT, &phy->ha->event_lock,
>  			&phy->port_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  void sas_porte_hard_reset(struct work_struct *work)
> @@ -268,7 +271,7 @@ void sas_porte_hard_reset(struct work_struct *work)
>  	sas_begin_event(PORTE_HARD_RESET, &phy->ha->event_lock,
>  			&phy->port_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  /* ---------- SAS port registration ---------- */
> @@ -306,6 +309,6 @@ void sas_unregister_ports(struct sas_ha_struct *sas_ha)
>  
>  	for (i = 0; i < sas_ha->num_phys; i++)
>  		if (sas_ha->sas_phy[i]->port)
> -			sas_deform_port(sas_ha->sas_phy[i]);
> +			sas_deform_port(sas_ha->sas_phy[i], 0);
>  
>  }
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


  reply	other threads:[~2011-02-17 23:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-17  3:11 [PATCH] libsas: fix/amend device gone notification in sas_deform_port() Dan Williams
2011-02-17 23:36 ` David Milburn [this message]
2011-02-18  0:19   ` Dan Williams
2011-03-04 22:44     ` David Milburn
2011-03-04 22:53       ` Dan Williams
2011-03-04 23:08         ` David Milburn
2011-03-26 22:27 ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D5DB0F0.4060006@redhat.com \
    --to=dmilburn@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=djwong@us.ibm.com \
    --cc=edmund.nadolski@intel.com \
    --cc=haipao.fan@intel.com \
    --cc=jacek.danecki@intel.com \
    --cc=james.bottomley@suse.de \
    --cc=jeffrey.d.skirvin@intel.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=maciej.trela@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox