All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Milburn <dmilburn@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: james.bottomley@suse.de, Haipao Fan <haipao.fan@intel.com>,
	dave.jiang@intel.com, linux-scsi@vger.kernel.org,
	jacek.danecki@intel.com, Maciej Trela <maciej.trela@intel.com>,
	jeffrey.d.skirvin@intel.com, edmund.nadolski@intel.com,
	"Darrick J. Wong" <djwong@us.ibm.com>
Subject: Re: [PATCH] libsas: fix/amend device gone notification in	sas_deform_port()
Date: Thu, 17 Feb 2011 17:36:16 -0600	[thread overview]
Message-ID: <4D5DB0F0.4060006@redhat.com> (raw)
In-Reply-To: <20110217030659.4303.12762.stgit@localhost6.localdomain6>

Dan Williams wrote:
> Commit 56dd2c06 "libsas: Don't issue commands to devices that have been
> hot-removed" edited Darrick's original patch to remove setting 'gone' in
> the sas_deform_port() path because that prevented scsi sync cache
> commands from being issued when the driver was unloaded.  However, this
> allows true device gone notifications (as signaled port phy events) to
> trigger sync cache commands to devices that are known to be unreachable.
> 
> Teach libsas which sas_deform_port() invocations are likely device gone
> events.
> 
> This patch also introduces sas_device_gone() which hopefully allows
> subtle/tricky locking to be dropped from lldd drivers, like the
> following in mvsas which is broken if the ata path is ever converted to
> call lldd_execute_task() with irqs enabled:
> 
> 	flags_libsas = 0;
> 	[...]
> 	spin_unlock_irqrestore(dev->sata_dev.ap->lock, flags_libsas);
> 	spin_unlock_irqrestore(&mvi->lock, flags);
> 	t->task_done(t);
> 	spin_lock_irqsave(&mvi->lock, flags);
> 	spin_lock_irqsave(dev->sata_dev.ap->lock, flags_libsas);
> 
> Cc: Darrick J. Wong <djwong@us.ibm.com>
> Cc: Haipao Fan <haipao.fan@intel.com>
> Cc: Maciej Trela <maciej.trela@intel.com>
> Cc: David Milburn <dmilburn@redhat.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> 
> The sas_device_gone() change could be split into its own patch for bisection
> purposes, let me know if you want me to resubmit.  This is being driven by a
> lockdep error in isci as it calls task->done() from lldd_execute_task()
> when it finds lldd_dev == NULL for the ata domain_device.

Dan,

So, you are seeing sas_ata_task_done() trying to get the ata_port lock,
but it has already been taken earlier in the code path?

ata_scsi_queuecmd (spin_lock_irqsave(ap->lock, irq_flags)
   __ata_scsi_queuecmd
     ata_scsi_translate
       ata_qc_issue
         sas_ata_qc_issue
           isci_task_execute_task
             isci_task_complete_for_upper_layer
               sas_ata_task_done 
(spin_lock_irqsave(dev->sata_dev.ap->lock, flags)

Thanks,
David


> 
> --
> Dan
> 
>  drivers/scsi/libsas/sas_expander.c |   20 +++++++++++++++++---
>  drivers/scsi/libsas/sas_internal.h |    3 ++-
>  drivers/scsi/libsas/sas_phy.c      |    4 ++--
>  drivers/scsi/libsas/sas_port.c     |   21 ++++++++++++---------
>  4 files changed, 33 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/scsi/libsas/sas_expander.c b/drivers/scsi/libsas/sas_expander.c
> index 505ffe3..7a785df 100644
> --- a/drivers/scsi/libsas/sas_expander.c
> +++ b/drivers/scsi/libsas/sas_expander.c
> @@ -1721,13 +1721,27 @@ out:
>  	return res;
>  }
>  
> +/* FIXME: delete this routine when/if sas_ata stops submitting tasks
> + * with host_lock held
> + */
> +void sas_device_gone(struct domain_device *dev)
> +{
> +	struct sas_rphy *rphy = dev->rphy;
> +	struct Scsi_Host *shost = dev_to_shost(rphy->dev.parent);
> +
> +	/* take the lock to synchronize against incoming sata i/o */
> +	spin_lock_irq(shost->host_lock);
> +	dev->gone = 1;
> +	spin_unlock_irq(shost->host_lock);
> +}
> +
>  static void sas_unregister_ex_tree(struct domain_device *dev)
>  {
>  	struct expander_device *ex = &dev->ex_dev;
>  	struct domain_device *child, *n;
>  
>  	list_for_each_entry_safe(child, n, &ex->children, siblings) {
> -		child->gone = 1;
> +		sas_device_gone(child);
>  		if (child->dev_type == EDGE_DEV ||
>  		    child->dev_type == FANOUT_DEV)
>  			sas_unregister_ex_tree(child);
> @@ -1748,7 +1762,7 @@ static void sas_unregister_devs_sas_addr(struct domain_device *parent,
>  			&ex_dev->children, siblings) {
>  			if (SAS_ADDR(child->sas_addr) ==
>  			    SAS_ADDR(phy->attached_sas_addr)) {
> -				child->gone = 1;
> +				sas_device_gone(child);
>  				if (child->dev_type == EDGE_DEV ||
>  				    child->dev_type == FANOUT_DEV)
>  					sas_unregister_ex_tree(child);
> @@ -1757,7 +1771,7 @@ static void sas_unregister_devs_sas_addr(struct domain_device *parent,
>  				break;
>  			}
>  		}
> -		parent->gone = 1;
> +		sas_device_gone(parent);
>  		sas_disable_routing(parent, phy->attached_sas_addr);
>  	}
>  	memset(phy->attached_sas_addr, 0, SAS_ADDR_SIZE);
> diff --git a/drivers/scsi/libsas/sas_internal.h b/drivers/scsi/libsas/sas_internal.h
> index 2d5c460..4db2752 100644
> --- a/drivers/scsi/libsas/sas_internal.h
> +++ b/drivers/scsi/libsas/sas_internal.h
> @@ -61,7 +61,7 @@ int  sas_init_queue(struct sas_ha_struct *sas_ha);
>  int  sas_init_events(struct sas_ha_struct *sas_ha);
>  void sas_shutdown_queue(struct sas_ha_struct *sas_ha);
>  
> -void sas_deform_port(struct asd_sas_phy *phy);
> +void sas_deform_port(struct asd_sas_phy *phy, int gone);
>  
>  void sas_porte_bytes_dmaed(struct work_struct *work);
>  void sas_porte_broadcast_rcvd(struct work_struct *work);
> @@ -71,6 +71,7 @@ void sas_porte_hard_reset(struct work_struct *work);
>  
>  int sas_notify_lldd_dev_found(struct domain_device *);
>  void sas_notify_lldd_dev_gone(struct domain_device *);
> +void sas_device_gone(struct domain_device *dev);
>  
>  int sas_smp_phy_control(struct domain_device *dev, int phy_id,
>  			enum phy_func phy_func, struct sas_phy_linkrates *);
> diff --git a/drivers/scsi/libsas/sas_phy.c b/drivers/scsi/libsas/sas_phy.c
> index b459c4b..e0f5018 100644
> --- a/drivers/scsi/libsas/sas_phy.c
> +++ b/drivers/scsi/libsas/sas_phy.c
> @@ -39,7 +39,7 @@ static void sas_phye_loss_of_signal(struct work_struct *work)
>  	sas_begin_event(PHYE_LOSS_OF_SIGNAL, &phy->ha->event_lock,
>  			&phy->phy_events_pending);
>  	phy->error = 0;
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  static void sas_phye_oob_done(struct work_struct *work)
> @@ -66,7 +66,7 @@ static void sas_phye_oob_error(struct work_struct *work)
>  	sas_begin_event(PHYE_OOB_ERROR, &phy->ha->event_lock,
>  			&phy->phy_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  
>  	if (!port && phy->enabled && i->dft->lldd_control_phy) {
>  		phy->error++;
> diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
> index 5257fdf..be490e4 100644
> --- a/drivers/scsi/libsas/sas_port.c
> +++ b/drivers/scsi/libsas/sas_port.c
> @@ -57,7 +57,7 @@ static void sas_form_port(struct asd_sas_phy *phy)
>  
>  	if (port) {
>  		if (!phy_is_wideport_member(port, phy))
> -			sas_deform_port(phy);
> +			sas_deform_port(phy, 0);
>  		else {
>  			SAS_DPRINTK("%s: phy%d belongs to port%d already(%d)!\n",
>  				    __func__, phy->id, phy->port->id,
> @@ -153,28 +153,31 @@ static void sas_form_port(struct asd_sas_phy *phy)
>   * This is called when the physical link to the other phy has been
>   * lost (on this phy), in Event thread context. We cannot delay here.
>   */
> -void sas_deform_port(struct asd_sas_phy *phy)
> +void sas_deform_port(struct asd_sas_phy *phy, int gone)
>  {
>  	struct sas_ha_struct *sas_ha = phy->ha;
>  	struct asd_sas_port *port = phy->port;
>  	struct sas_internal *si =
>  		to_sas_internal(sas_ha->core.shost->transportt);
> +	struct domain_device *dev;
>  	unsigned long flags;
>  
>  	if (!port)
>  		return;		  /* done by a phy event */
>  
> -	if (port->port_dev)
> -		port->port_dev->pathways--;
> +	dev = port->port_dev;
> +	if (dev)
> +		dev->pathways--;
>  
>  	if (port->num_phys == 1) {
> +		if (dev && gone)
> +			sas_device_gone(dev);
>  		sas_unregister_domain_devices(port);
>  		sas_port_delete(port->port);
>  		port->port = NULL;
>  	} else
>  		sas_port_delete_phy(port->port, phy->phy);
>  
> -
>  	if (si->dft->lldd_port_deformed)
>  		si->dft->lldd_port_deformed(phy);
>  
> @@ -244,7 +247,7 @@ void sas_porte_link_reset_err(struct work_struct *work)
>  	sas_begin_event(PORTE_LINK_RESET_ERR, &phy->ha->event_lock,
>  			&phy->port_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  void sas_porte_timer_event(struct work_struct *work)
> @@ -256,7 +259,7 @@ void sas_porte_timer_event(struct work_struct *work)
>  	sas_begin_event(PORTE_TIMER_EVENT, &phy->ha->event_lock,
>  			&phy->port_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  void sas_porte_hard_reset(struct work_struct *work)
> @@ -268,7 +271,7 @@ void sas_porte_hard_reset(struct work_struct *work)
>  	sas_begin_event(PORTE_HARD_RESET, &phy->ha->event_lock,
>  			&phy->port_events_pending);
>  
> -	sas_deform_port(phy);
> +	sas_deform_port(phy, 1);
>  }
>  
>  /* ---------- SAS port registration ---------- */
> @@ -306,6 +309,6 @@ void sas_unregister_ports(struct sas_ha_struct *sas_ha)
>  
>  	for (i = 0; i < sas_ha->num_phys; i++)
>  		if (sas_ha->sas_phy[i]->port)
> -			sas_deform_port(sas_ha->sas_phy[i]);
> +			sas_deform_port(sas_ha->sas_phy[i], 0);
>  
>  }
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


  reply	other threads:[~2011-02-17 23:36 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-02-17  3:11 [PATCH] libsas: fix/amend device gone notification in sas_deform_port() Dan Williams
2011-02-17 23:36 ` David Milburn [this message]
2011-02-18  0:19   ` Dan Williams
2011-03-04 22:44     ` David Milburn
2011-03-04 22:53       ` Dan Williams
2011-03-04 23:08         ` David Milburn
2011-03-26 22:27 ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4D5DB0F0.4060006@redhat.com \
    --to=dmilburn@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=djwong@us.ibm.com \
    --cc=edmund.nadolski@intel.com \
    --cc=haipao.fan@intel.com \
    --cc=jacek.danecki@intel.com \
    --cc=james.bottomley@suse.de \
    --cc=jeffrey.d.skirvin@intel.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=maciej.trela@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.