* mpt2sas,mpt3sas watchdog device removal @ 2013-05-15 17:24 Joe Lawrence 2013-05-15 17:26 ` [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe Joe Lawrence 2013-05-15 17:29 ` mpt2sas,mpt3sas watchdog device removal Joe Lawrence 0 siblings, 2 replies; 12+ messages in thread From: Joe Lawrence @ 2013-05-15 17:24 UTC (permalink / raw) To: linux-scsi Cc: James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, Bjorn Helgaas This is a followup to the earlier discussion of MPT watchdog device removal calling directly into PCI core API: [1] http://thread.gmane.org/gmane.linux.scsi/80629 I've tested two safer, alternative methods of removing MPT hosts from the SCSI topology. Both involve wrapping the existing MPT .remove routine, ensuring mutual exclusion between regularly scheduled PCI device removal and the drivers' periodic watchdog thread. Both changes tested well against surprise PCI removal of LSI SAS 9211-8i HBAs while driving direct IO out to attached disks. The first version is straightforward, essentially just adding a common mutex and checking that the driver still cares about a given PCI device before removing. The second version is an attempt to detach only from the SCSI topology as soon as possible. Later PCI removal cleans up the rest of the resources. Neither patch maps cleanly to the MPT fusion driver as its watchdog thread resides in the mptbase module. The code currently uses PCI core API pci_remove_bus_device to route around the driver's module dependencies to call from mptbase to mptsas: Module Size Used by mptsas 62366 8 << PCI .remove mptscsih 38803 1 mptsas mptbase 99878 2 mptsas,mptscsih << watchdog The fusion driver has devised other means of calling from mptbase to mptscsih, for example, via the schedule_dead_ioc_flush_running_cmds function pointer. The two changes I explored were made with a relatively light hand, so I didn't know how best to proceed with a MPT fusion patch. Comments on either removal patch strategy welcome. It would also be great if we had documentation guiding the SCSI LLDs in how to safely and completely remove attached hosts in hotplug and defective HW scenarios. Regards, -- Joe ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe 2013-05-15 17:24 mpt2sas,mpt3sas watchdog device removal Joe Lawrence @ 2013-05-15 17:26 ` Joe Lawrence 2013-05-17 15:29 ` Bjorn Helgaas 2013-05-15 17:29 ` mpt2sas,mpt3sas watchdog device removal Joe Lawrence 1 sibling, 1 reply; 12+ messages in thread From: Joe Lawrence @ 2013-05-15 17:26 UTC (permalink / raw) To: linux-scsi Cc: James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, Bjorn Helgaas >From 9fc1a958ad48718216fbdc19405297dd11d11539 Mon Sep 17 00:00:00 2001 From: Joe Lawrence <joe.lawrence@stratus.com> Date: Tue, 14 May 2013 15:41:17 -0400 Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce device removal races with PCI callback functions such as device removal. Modify the MPT PCI callbacks to verify Scsi_Host attachment under a common mutex to prevent racing the driver's watchdog device removal. (PCI callbacks can trust pci_dev existence.) To protect the watchdog device removal thread from racing PCI removal, first wrapper the existing PCI device .remove function. The watchdog should directly call this function, not PCI hotplug API. Under the same common mutex, this routine should verify that pci_dev is on driver's ioc_list before proceeding with device removal. (Asynchronous driver watchdog kthread cannot trust pci_dev existence.) Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> --- drivers/scsi/mpt2sas/mpt2sas_base.c | 24 +++------ drivers/scsi/mpt2sas/mpt2sas_base.h | 2 + drivers/scsi/mpt2sas/mpt2sas_scsih.c | 96 ++++++++++++++++++++++++++++-------- drivers/scsi/mpt3sas/mpt3sas_base.c | 22 +++------ drivers/scsi/mpt3sas/mpt3sas_base.h | 2 + drivers/scsi/mpt3sas/mpt3sas_scsih.c | 95 ++++++++++++++++++++++++++++------- 6 files changed, 171 insertions(+), 70 deletions(-) diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c index bcb23d2..eb24ddd 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.c +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c @@ -116,24 +116,16 @@ module_param_call(mpt2sas_fwfault_debug, _scsih_set_fwfault_debug, /** * mpt2sas_remove_dead_ioc_func - kthread context to remove dead ioc - * @arg: input argument, used to derive ioc + * @pdev: PCI device struct * - * Return 0 if controller is removed from pci subsystem. - * Return -1 for other case. + * Returns 0. */ -static int mpt2sas_remove_dead_ioc_func(void *arg) +static int +mpt2sas_remove_dead_ioc_func(void *arg) { - struct MPT2SAS_ADAPTER *ioc = (struct MPT2SAS_ADAPTER *)arg; - struct pci_dev *pdev; - - if ((ioc == NULL)) - return -1; - - pdev = ioc->pdev; - if ((pdev == NULL)) - return -1; - pci_stop_and_remove_bus_device(pdev); - return 0; + /* mpt2sas_scsih_detach_pci will validate pci_dev */ + mpt2sas_scsih_detach_pci((struct pci_dev *)arg); + return 0; } @@ -192,7 +184,7 @@ _base_fault_reset_work(struct work_struct *work) */ ioc->remove_host = 1; /*Remove the Dead Host */ - p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc, + p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc->pdev, "mpt2sas_dead_ioc_%d", ioc->id); if (IS_ERR(p)) { printk(MPT2SAS_ERR_FMT diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h index 4caaac1..d88515d 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.h +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h @@ -1079,6 +1079,8 @@ void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc); void mpt2sas_scsih_reset_handler(struct MPT2SAS_ADAPTER *ioc, int reset_phase); +void mpt2sas_scsih_detach_pci(struct pci_dev *pdev); + /* config shared API */ u8 mpt2sas_config_done(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply); diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c index c6bdc92..8bff162 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c @@ -78,6 +78,8 @@ static int _scsih_scan_finished(struct Scsi_Host *shost, unsigned long time); LIST_HEAD(mpt2sas_ioc_list); /* local parameters */ +DEFINE_MUTEX(_mpt2sas_pci_mutex); + static u8 scsi_io_cb_idx = -1; static u8 tm_cb_idx = -1; static u8 ctl_cb_idx = -1; @@ -7660,11 +7662,17 @@ _scsih_ir_shutdown(struct MPT2SAS_ADAPTER *ioc) static void _scsih_shutdown(struct pci_dev *pdev) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); + struct Scsi_Host *shost; + struct MPT2SAS_ADAPTER *ioc; struct workqueue_struct *wq; unsigned long flags; + mutex_lock(&_mpt2sas_pci_mutex); + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc = shost_priv(shost); + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7677,26 +7685,41 @@ _scsih_shutdown(struct pci_dev *pdev) _scsih_ir_shutdown(ioc); mpt2sas_base_detach(ioc); +out: + mutex_unlock(&_mpt2sas_pci_mutex); } /** - * _scsih_remove - detach and remove add host + * mpt2sas_scsih_detach_pci - detach shost from PCI device * @pdev: PCI device struct * - * Routine called when unloading the driver. + * Routine called by pci driver .remove callback and watchdog-created + * mpt2sas_remove_dead_ioc_func kthread. * Return nothing. */ -static void -_scsih_remove(struct pci_dev *pdev) +void +mpt2sas_scsih_detach_pci(struct pci_dev *pdev) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); + struct Scsi_Host *shost; + struct MPT2SAS_ADAPTER *ioc; struct _sas_port *mpt2sas_port, *next_port; struct _raid_device *raid_device, *next; struct MPT2SAS_TARGET *sas_target_priv_data; struct workqueue_struct *wq; unsigned long flags; + mutex_lock(&_mpt2sas_pci_mutex); + /* verify driver still knows about this pdev */ + list_for_each_entry(ioc, &mpt2sas_ioc_list, list) { + if (ioc->pdev == pdev) + goto remove; + } + goto out; +remove: + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7750,6 +7773,21 @@ _scsih_remove(struct pci_dev *pdev) list_del(&ioc->list); scsi_remove_host(shost); scsi_host_put(shost); +out: + mutex_unlock(&_mpt2sas_pci_mutex); +} + +/** + * _scsih_remove - detach and remove host + * @pdev: PCI device struct + * + * Routine called when unloading the driver and hotplug remove. + * Return nothing. + */ +static void +_scsih_remove(struct pci_dev *pdev) +{ + mpt2sas_scsih_detach_pci(pdev); } /** @@ -8159,10 +8197,16 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) static int _scsih_suspend(struct pci_dev *pdev, pm_message_t state) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); + struct Scsi_Host *shost; + struct MPT2SAS_ADAPTER *ioc; pci_power_t device_state; + mutex_lock(&_mpt2sas_pci_mutex); + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc = shost_priv(shost); + mpt2sas_base_stop_watchdog(ioc); scsi_block_requests(shost); device_state = pci_choose_state(pdev, state); @@ -8174,6 +8218,9 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) pci_save_state(pdev); pci_disable_device(pdev); pci_set_power_state(pdev, device_state); +out: + mutex_unlock(&_mpt2sas_pci_mutex); + return 0; } @@ -8186,10 +8233,17 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) static int _scsih_resume(struct pci_dev *pdev) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); - pci_power_t device_state = pdev->current_state; - int r; + struct Scsi_Host *shost; + struct MPT2SAS_ADAPTER *ioc; + pci_power_t device_state; + int r = 0; + + mutex_lock(&_mpt2sas_pci_mutex); + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc = shost_priv(shost); + device_state = pdev->current_state; printk(MPT2SAS_INFO_FMT "pdev=0x%p, slot=%s, previous " "operating state [D%d]\n", ioc->name, pdev, @@ -8200,13 +8254,15 @@ _scsih_resume(struct pci_dev *pdev) pci_restore_state(pdev); ioc->pdev = pdev; r = mpt2sas_base_map_resources(ioc); - if (r) - return r; + if (!r) { + mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); + scsi_unblock_requests(shost); + mpt2sas_base_start_watchdog(ioc); + } +out: + mutex_unlock(&_mpt2sas_pci_mutex); - mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); - scsi_unblock_requests(shost); - mpt2sas_base_start_watchdog(ioc); - return 0; + return r; } #endif /* CONFIG_PM */ diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 1836003..9d4f314 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -111,23 +111,15 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug, /** * mpt3sas_remove_dead_ioc_func - kthread context to remove dead ioc - * @arg: input argument, used to derive ioc + * @pdev: PCI device struct * - * Return 0 if controller is removed from pci subsystem. - * Return -1 for other case. + * Returns 0. */ -static int mpt3sas_remove_dead_ioc_func(void *arg) +static int +mpt3sas_remove_dead_ioc_func(void *arg) { - struct MPT3SAS_ADAPTER *ioc = (struct MPT3SAS_ADAPTER *)arg; - struct pci_dev *pdev; - - if ((ioc == NULL)) - return -1; - - pdev = ioc->pdev; - if ((pdev == NULL)) - return -1; - pci_stop_and_remove_bus_device(pdev); + /* mpt3sas_scsih_detach_pci will validate pci_dev */ + mpt3sas_scsih_detach_pci((struct pci_dev *)arg); return 0; } @@ -173,7 +165,7 @@ _base_fault_reset_work(struct work_struct *work) */ ioc->remove_host = 1; /*Remove the Dead Host */ - p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc, + p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc->pdev, "mpt3sas_dead_ioc_%d", ioc->id); if (IS_ERR(p)) pr_err(MPT3SAS_FMT diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index 994656c..225c84f 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -1009,6 +1009,8 @@ struct _sas_device *mpt3sas_scsih_sas_device_find_by_sas_address( void mpt3sas_port_enable_complete(struct MPT3SAS_ADAPTER *ioc); +void mpt3sas_scsih_detach_pci(struct pci_dev *pdev); + /* config shared API */ u8 mpt3sas_config_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, u32 reply); diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index dcbf7c8..5b8c365 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -81,6 +81,8 @@ static int _scsih_scan_finished(struct Scsi_Host *shost, unsigned long time); LIST_HEAD(mpt3sas_ioc_list); /* local parameters */ +DEFINE_MUTEX(_mpt3sas_pci_mutex); + static u8 scsi_io_cb_idx = -1; static u8 tm_cb_idx = -1; static u8 ctl_cb_idx = -1; @@ -7365,22 +7367,36 @@ _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc) } /** - * _scsih_remove - detach and remove add host + * mpt3sas_scsih_detach_pci - detach shost from PCI device * @pdev: PCI device struct * - * Routine called when unloading the driver. + * Routine called by pci driver .remove callback and watchdog-created + * mpt3sas_remove_dead_ioc_func kthread. * Return nothing. */ -static void _scsih_remove(struct pci_dev *pdev) +void +mpt3sas_scsih_detach_pci(struct pci_dev *pdev) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); + struct Scsi_Host *shost; + struct MPT3SAS_ADAPTER *ioc; struct _sas_port *mpt3sas_port, *next_port; struct _raid_device *raid_device, *next; struct MPT3SAS_TARGET *sas_target_priv_data; struct workqueue_struct *wq; unsigned long flags; + mutex_lock(&_mpt3sas_pci_mutex); + /* verify driver still knows about this pdev */ + list_for_each_entry(ioc, &mpt3sas_ioc_list, list) { + if (ioc->pdev == pdev) + goto remove; + } + goto out; +remove: + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7434,6 +7450,21 @@ static void _scsih_remove(struct pci_dev *pdev) list_del(&ioc->list); scsi_remove_host(shost); scsi_host_put(shost); +out: + mutex_unlock(&_mpt3sas_pci_mutex); +} + +/** + * _scsih_remove - detach and remove host + * @pdev: PCI device struct + * + * Routine called when unloading the driver and hotplug remove. + * Return nothing. + */ +static void +_scsih_remove(struct pci_dev *pdev) +{ + mpt3sas_scsih_detach_pci(pdev); } /** @@ -7445,11 +7476,17 @@ static void _scsih_remove(struct pci_dev *pdev) static void _scsih_shutdown(struct pci_dev *pdev) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); + struct Scsi_Host *shost; + struct MPT3SAS_ADAPTER *ioc; struct workqueue_struct *wq; unsigned long flags; + mutex_lock(&_mpt3sas_pci_mutex); + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc = shost_priv(shost); + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7462,6 +7499,8 @@ _scsih_shutdown(struct pci_dev *pdev) _scsih_ir_shutdown(ioc); mpt3sas_base_detach(ioc); +out: + mutex_unlock(&_mpt3sas_pci_mutex); } @@ -7854,10 +7893,16 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) static int _scsih_suspend(struct pci_dev *pdev, pm_message_t state) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); + struct Scsi_Host *shost; + struct MPT3SAS_ADAPTER *ioc; pci_power_t device_state; + mutex_lock(&_mpt3sas_pci_mutex); + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc = shost_priv(shost); + mpt3sas_base_stop_watchdog(ioc); flush_scheduled_work(); scsi_block_requests(shost); @@ -7869,6 +7914,9 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) pci_save_state(pdev); mpt3sas_base_free_resources(ioc); pci_set_power_state(pdev, device_state); +out: + mutex_unlock(&_mpt3sas_pci_mutex); + return 0; } @@ -7881,10 +7929,17 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) static int _scsih_resume(struct pci_dev *pdev) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); - pci_power_t device_state = pdev->current_state; - int r; + struct Scsi_Host *shost; + struct MPT3SAS_ADAPTER *ioc; + pci_power_t device_state; + int r = 0; + + mutex_lock(&_mpt3sas_pci_mutex); + shost = pci_get_drvdata(pdev); + if (!shost) + goto out; + ioc = shost_priv(shost); + device_state = pdev->current_state; pr_info(MPT3SAS_FMT "pdev=0x%p, slot=%s, previous operating state [D%d]\n", @@ -7895,13 +7950,15 @@ _scsih_resume(struct pci_dev *pdev) pci_restore_state(pdev); ioc->pdev = pdev; r = mpt3sas_base_map_resources(ioc); - if (r) - return r; + if (!r) { + mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); + scsi_unblock_requests(shost); + mpt3sas_base_start_watchdog(ioc); + } +out: + mutex_unlock(&_mpt3sas_pci_mutex); - mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); - scsi_unblock_requests(shost); - mpt3sas_base_start_watchdog(ioc); - return 0; + return r; } #endif /* CONFIG_PM */ -- 1.8.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe 2013-05-15 17:26 ` [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe Joe Lawrence @ 2013-05-17 15:29 ` Bjorn Helgaas 0 siblings, 0 replies; 12+ messages in thread From: Bjorn Helgaas @ 2013-05-17 15:29 UTC (permalink / raw) To: Joe Lawrence Cc: linux-scsi@vger.kernel.org, James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, linux-pci@vger.kernel.org [+cc linux-pci] On Wed, May 15, 2013 at 11:26 AM, Joe Lawrence <joe.lawrence@stratus.com> wrote: > From 9fc1a958ad48718216fbdc19405297dd11d11539 Mon Sep 17 00:00:00 2001 > From: Joe Lawrence <joe.lawrence@stratus.com> > Date: Tue, 14 May 2013 15:41:17 -0400 > Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal > safe > > Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce > device removal races with PCI callback functions such as device removal. > > Modify the MPT PCI callbacks to verify Scsi_Host attachment under a > common mutex to prevent racing the driver's watchdog device removal. > (PCI callbacks can trust pci_dev existence.) > > To protect the watchdog device removal thread from racing PCI removal, > first wrapper the existing PCI device .remove function. The watchdog > should directly call this function, not PCI hotplug API. Under the same > common mutex, this routine should verify that pci_dev is on driver's > ioc_list before proceeding with device removal. (Asynchronous driver > watchdog kthread cannot trust pci_dev existence.) I think it's a mistake to use the strategy of verifying that a pci_dev still exists. The driver should be written so that if you have a pci_dev pointer, you are guaranteed that the pci_dev is still valid. The driver should clean up any asynchronous threads or pending work items that keep a pci_dev pointer in its .remove() method, which is before the pci_dev is deallocated. If a driver keeps a pointer to a struct pci_dev after .remove() returns, that's a bug. Bjorn > Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> > --- > drivers/scsi/mpt2sas/mpt2sas_base.c | 24 +++------ > drivers/scsi/mpt2sas/mpt2sas_base.h | 2 + > drivers/scsi/mpt2sas/mpt2sas_scsih.c | 96 ++++++++++++++++++++++++++++-------- > drivers/scsi/mpt3sas/mpt3sas_base.c | 22 +++------ > drivers/scsi/mpt3sas/mpt3sas_base.h | 2 + > drivers/scsi/mpt3sas/mpt3sas_scsih.c | 95 ++++++++++++++++++++++++++++------- > 6 files changed, 171 insertions(+), 70 deletions(-) > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c > index bcb23d2..eb24ddd 100644 > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c > @@ -116,24 +116,16 @@ module_param_call(mpt2sas_fwfault_debug, _scsih_set_fwfault_debug, > > /** > * mpt2sas_remove_dead_ioc_func - kthread context to remove dead ioc > - * @arg: input argument, used to derive ioc > + * @pdev: PCI device struct > * > - * Return 0 if controller is removed from pci subsystem. > - * Return -1 for other case. > + * Returns 0. > */ > -static int mpt2sas_remove_dead_ioc_func(void *arg) > +static int > +mpt2sas_remove_dead_ioc_func(void *arg) > { > - struct MPT2SAS_ADAPTER *ioc = (struct MPT2SAS_ADAPTER *)arg; > - struct pci_dev *pdev; > - > - if ((ioc == NULL)) > - return -1; > - > - pdev = ioc->pdev; > - if ((pdev == NULL)) > - return -1; > - pci_stop_and_remove_bus_device(pdev); > - return 0; > + /* mpt2sas_scsih_detach_pci will validate pci_dev */ > + mpt2sas_scsih_detach_pci((struct pci_dev *)arg); > + return 0; > } > > > @@ -192,7 +184,7 @@ _base_fault_reset_work(struct work_struct *work) > */ > ioc->remove_host = 1; > /*Remove the Dead Host */ > - p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc, > + p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc->pdev, > "mpt2sas_dead_ioc_%d", ioc->id); > if (IS_ERR(p)) { > printk(MPT2SAS_ERR_FMT > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h > index 4caaac1..d88515d 100644 > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h > @@ -1079,6 +1079,8 @@ void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc); > > void mpt2sas_scsih_reset_handler(struct MPT2SAS_ADAPTER *ioc, int reset_phase); > > +void mpt2sas_scsih_detach_pci(struct pci_dev *pdev); > + > /* config shared API */ > u8 mpt2sas_config_done(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, > u32 reply); > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c > index c6bdc92..8bff162 100644 > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c > @@ -78,6 +78,8 @@ static int _scsih_scan_finished(struct Scsi_Host *shost, unsigned long time); > LIST_HEAD(mpt2sas_ioc_list); > > /* local parameters */ > +DEFINE_MUTEX(_mpt2sas_pci_mutex); > + > static u8 scsi_io_cb_idx = -1; > static u8 tm_cb_idx = -1; > static u8 ctl_cb_idx = -1; > @@ -7660,11 +7662,17 @@ _scsih_ir_shutdown(struct MPT2SAS_ADAPTER *ioc) > static void > _scsih_shutdown(struct pci_dev *pdev) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > + struct Scsi_Host *shost; > + struct MPT2SAS_ADAPTER *ioc; > struct workqueue_struct *wq; > unsigned long flags; > > + mutex_lock(&_mpt2sas_pci_mutex); > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + ioc = shost_priv(shost); > + > ioc->remove_host = 1; > _scsih_fw_event_cleanup_queue(ioc); > > @@ -7677,26 +7685,41 @@ _scsih_shutdown(struct pci_dev *pdev) > > _scsih_ir_shutdown(ioc); > mpt2sas_base_detach(ioc); > +out: > + mutex_unlock(&_mpt2sas_pci_mutex); > } > > /** > - * _scsih_remove - detach and remove add host > + * mpt2sas_scsih_detach_pci - detach shost from PCI device > * @pdev: PCI device struct > * > - * Routine called when unloading the driver. > + * Routine called by pci driver .remove callback and watchdog-created > + * mpt2sas_remove_dead_ioc_func kthread. > * Return nothing. > */ > -static void > -_scsih_remove(struct pci_dev *pdev) > +void > +mpt2sas_scsih_detach_pci(struct pci_dev *pdev) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > + struct Scsi_Host *shost; > + struct MPT2SAS_ADAPTER *ioc; > struct _sas_port *mpt2sas_port, *next_port; > struct _raid_device *raid_device, *next; > struct MPT2SAS_TARGET *sas_target_priv_data; > struct workqueue_struct *wq; > unsigned long flags; > > + mutex_lock(&_mpt2sas_pci_mutex); > + /* verify driver still knows about this pdev */ > + list_for_each_entry(ioc, &mpt2sas_ioc_list, list) { > + if (ioc->pdev == pdev) > + goto remove; > + } > + goto out; > +remove: > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + > ioc->remove_host = 1; > _scsih_fw_event_cleanup_queue(ioc); > > @@ -7750,6 +7773,21 @@ _scsih_remove(struct pci_dev *pdev) > list_del(&ioc->list); > scsi_remove_host(shost); > scsi_host_put(shost); > +out: > + mutex_unlock(&_mpt2sas_pci_mutex); > +} > + > +/** > + * _scsih_remove - detach and remove host > + * @pdev: PCI device struct > + * > + * Routine called when unloading the driver and hotplug remove. > + * Return nothing. > + */ > +static void > +_scsih_remove(struct pci_dev *pdev) > +{ > + mpt2sas_scsih_detach_pci(pdev); > } > > /** > @@ -8159,10 +8197,16 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) > static int > _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > + struct Scsi_Host *shost; > + struct MPT2SAS_ADAPTER *ioc; > pci_power_t device_state; > > + mutex_lock(&_mpt2sas_pci_mutex); > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + ioc = shost_priv(shost); > + > mpt2sas_base_stop_watchdog(ioc); > scsi_block_requests(shost); > device_state = pci_choose_state(pdev, state); > @@ -8174,6 +8218,9 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > pci_save_state(pdev); > pci_disable_device(pdev); > pci_set_power_state(pdev, device_state); > +out: > + mutex_unlock(&_mpt2sas_pci_mutex); > + > return 0; > } > > @@ -8186,10 +8233,17 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > static int > _scsih_resume(struct pci_dev *pdev) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > - pci_power_t device_state = pdev->current_state; > - int r; > + struct Scsi_Host *shost; > + struct MPT2SAS_ADAPTER *ioc; > + pci_power_t device_state; > + int r = 0; > + > + mutex_lock(&_mpt2sas_pci_mutex); > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + ioc = shost_priv(shost); > + device_state = pdev->current_state; > > printk(MPT2SAS_INFO_FMT "pdev=0x%p, slot=%s, previous " > "operating state [D%d]\n", ioc->name, pdev, > @@ -8200,13 +8254,15 @@ _scsih_resume(struct pci_dev *pdev) > pci_restore_state(pdev); > ioc->pdev = pdev; > r = mpt2sas_base_map_resources(ioc); > - if (r) > - return r; > + if (!r) { > + mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); > + scsi_unblock_requests(shost); > + mpt2sas_base_start_watchdog(ioc); > + } > +out: > + mutex_unlock(&_mpt2sas_pci_mutex); > > - mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); > - scsi_unblock_requests(shost); > - mpt2sas_base_start_watchdog(ioc); > - return 0; > + return r; > } > #endif /* CONFIG_PM */ > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c > index 1836003..9d4f314 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_base.c > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c > @@ -111,23 +111,15 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug, > > /** > * mpt3sas_remove_dead_ioc_func - kthread context to remove dead ioc > - * @arg: input argument, used to derive ioc > + * @pdev: PCI device struct > * > - * Return 0 if controller is removed from pci subsystem. > - * Return -1 for other case. > + * Returns 0. > */ > -static int mpt3sas_remove_dead_ioc_func(void *arg) > +static int > +mpt3sas_remove_dead_ioc_func(void *arg) > { > - struct MPT3SAS_ADAPTER *ioc = (struct MPT3SAS_ADAPTER *)arg; > - struct pci_dev *pdev; > - > - if ((ioc == NULL)) > - return -1; > - > - pdev = ioc->pdev; > - if ((pdev == NULL)) > - return -1; > - pci_stop_and_remove_bus_device(pdev); > + /* mpt3sas_scsih_detach_pci will validate pci_dev */ > + mpt3sas_scsih_detach_pci((struct pci_dev *)arg); > return 0; > } > > @@ -173,7 +165,7 @@ _base_fault_reset_work(struct work_struct *work) > */ > ioc->remove_host = 1; > /*Remove the Dead Host */ > - p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc, > + p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc->pdev, > "mpt3sas_dead_ioc_%d", ioc->id); > if (IS_ERR(p)) > pr_err(MPT3SAS_FMT > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h > index 994656c..225c84f 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_base.h > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h > @@ -1009,6 +1009,8 @@ struct _sas_device *mpt3sas_scsih_sas_device_find_by_sas_address( > > void mpt3sas_port_enable_complete(struct MPT3SAS_ADAPTER *ioc); > > +void mpt3sas_scsih_detach_pci(struct pci_dev *pdev); > + > /* config shared API */ > u8 mpt3sas_config_done(struct MPT3SAS_ADAPTER *ioc, u16 smid, u8 msix_index, > u32 reply); > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c > index dcbf7c8..5b8c365 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c > @@ -81,6 +81,8 @@ static int _scsih_scan_finished(struct Scsi_Host *shost, unsigned long time); > LIST_HEAD(mpt3sas_ioc_list); > > /* local parameters */ > +DEFINE_MUTEX(_mpt3sas_pci_mutex); > + > static u8 scsi_io_cb_idx = -1; > static u8 tm_cb_idx = -1; > static u8 ctl_cb_idx = -1; > @@ -7365,22 +7367,36 @@ _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc) > } > > /** > - * _scsih_remove - detach and remove add host > + * mpt3sas_scsih_detach_pci - detach shost from PCI device > * @pdev: PCI device struct > * > - * Routine called when unloading the driver. > + * Routine called by pci driver .remove callback and watchdog-created > + * mpt3sas_remove_dead_ioc_func kthread. > * Return nothing. > */ > -static void _scsih_remove(struct pci_dev *pdev) > +void > +mpt3sas_scsih_detach_pci(struct pci_dev *pdev) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); > + struct Scsi_Host *shost; > + struct MPT3SAS_ADAPTER *ioc; > struct _sas_port *mpt3sas_port, *next_port; > struct _raid_device *raid_device, *next; > struct MPT3SAS_TARGET *sas_target_priv_data; > struct workqueue_struct *wq; > unsigned long flags; > > + mutex_lock(&_mpt3sas_pci_mutex); > + /* verify driver still knows about this pdev */ > + list_for_each_entry(ioc, &mpt3sas_ioc_list, list) { > + if (ioc->pdev == pdev) > + goto remove; > + } > + goto out; > +remove: > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + > ioc->remove_host = 1; > _scsih_fw_event_cleanup_queue(ioc); > > @@ -7434,6 +7450,21 @@ static void _scsih_remove(struct pci_dev *pdev) > list_del(&ioc->list); > scsi_remove_host(shost); > scsi_host_put(shost); > +out: > + mutex_unlock(&_mpt3sas_pci_mutex); > +} > + > +/** > + * _scsih_remove - detach and remove host > + * @pdev: PCI device struct > + * > + * Routine called when unloading the driver and hotplug remove. > + * Return nothing. > + */ > +static void > +_scsih_remove(struct pci_dev *pdev) > +{ > + mpt3sas_scsih_detach_pci(pdev); > } > > /** > @@ -7445,11 +7476,17 @@ static void _scsih_remove(struct pci_dev *pdev) > static void > _scsih_shutdown(struct pci_dev *pdev) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); > + struct Scsi_Host *shost; > + struct MPT3SAS_ADAPTER *ioc; > struct workqueue_struct *wq; > unsigned long flags; > > + mutex_lock(&_mpt3sas_pci_mutex); > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + ioc = shost_priv(shost); > + > ioc->remove_host = 1; > _scsih_fw_event_cleanup_queue(ioc); > > @@ -7462,6 +7499,8 @@ _scsih_shutdown(struct pci_dev *pdev) > > _scsih_ir_shutdown(ioc); > mpt3sas_base_detach(ioc); > +out: > + mutex_unlock(&_mpt3sas_pci_mutex); > } > > > @@ -7854,10 +7893,16 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) > static int > _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); > + struct Scsi_Host *shost; > + struct MPT3SAS_ADAPTER *ioc; > pci_power_t device_state; > > + mutex_lock(&_mpt3sas_pci_mutex); > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + ioc = shost_priv(shost); > + > mpt3sas_base_stop_watchdog(ioc); > flush_scheduled_work(); > scsi_block_requests(shost); > @@ -7869,6 +7914,9 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > pci_save_state(pdev); > mpt3sas_base_free_resources(ioc); > pci_set_power_state(pdev, device_state); > +out: > + mutex_unlock(&_mpt3sas_pci_mutex); > + > return 0; > } > > @@ -7881,10 +7929,17 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > static int > _scsih_resume(struct pci_dev *pdev) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); > - pci_power_t device_state = pdev->current_state; > - int r; > + struct Scsi_Host *shost; > + struct MPT3SAS_ADAPTER *ioc; > + pci_power_t device_state; > + int r = 0; > + > + mutex_lock(&_mpt3sas_pci_mutex); > + shost = pci_get_drvdata(pdev); > + if (!shost) > + goto out; > + ioc = shost_priv(shost); > + device_state = pdev->current_state; > > pr_info(MPT3SAS_FMT > "pdev=0x%p, slot=%s, previous operating state [D%d]\n", > @@ -7895,13 +7950,15 @@ _scsih_resume(struct pci_dev *pdev) > pci_restore_state(pdev); > ioc->pdev = pdev; > r = mpt3sas_base_map_resources(ioc); > - if (r) > - return r; > + if (!r) { > + mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); > + scsi_unblock_requests(shost); > + mpt3sas_base_start_watchdog(ioc); > + } > +out: > + mutex_unlock(&_mpt3sas_pci_mutex); > > - mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); > - scsi_unblock_requests(shost); > - mpt3sas_base_start_watchdog(ioc); > - return 0; > + return r; > } > #endif /* CONFIG_PM */ > > -- > 1.8.1.4 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: mpt2sas,mpt3sas watchdog device removal 2013-05-15 17:24 mpt2sas,mpt3sas watchdog device removal Joe Lawrence 2013-05-15 17:26 ` [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe Joe Lawrence @ 2013-05-15 17:29 ` Joe Lawrence 2013-05-17 15:29 ` Bjorn Helgaas 2013-07-16 12:00 ` Reddy, Sreekanth 1 sibling, 2 replies; 12+ messages in thread From: Joe Lawrence @ 2013-05-15 17:29 UTC (permalink / raw) To: linux-scsi Cc: James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, Bjorn Helgaas >From 84ac7a35ebd61e84d4254eae78bb967de17254c2 Mon Sep 17 00:00:00 2001 From: Joe Lawrence <joe.lawrence@stratus.com> Date: Wed, 15 May 2013 12:52:31 -0400 Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce device removal races with PCI callback functions. Simplify the mpt2sas watchdog mechanism by separating PCI device removal from SCSI midlayer detachment. When the watchdog wishes to remove a SCSI device from the topology, detach from the SCSI midlayer, leaving the PCI device alone. Adjust various pci_driver callbacks to account for a potentially SCSI detached PCI device. Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> --- drivers/scsi/mpt2sas/mpt2sas_base.c | 43 +--------------- drivers/scsi/mpt2sas/mpt2sas_base.h | 2 + drivers/scsi/mpt2sas/mpt2sas_scsih.c | 97 +++++++++++++++++++++--------------- drivers/scsi/mpt3sas/mpt3sas_base.c | 43 +--------------- drivers/scsi/mpt3sas/mpt3sas_base.h | 2 + drivers/scsi/mpt3sas/mpt3sas_scsih.c | 65 +++++++++++++++--------- 6 files changed, 105 insertions(+), 147 deletions(-) diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c index bcb23d2..cc1bf28 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.c +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c @@ -57,7 +57,6 @@ #include <linux/sort.h> #include <linux/io.h> #include <linux/time.h> -#include <linux/kthread.h> #include <linux/aer.h> #include "mpt2sas_base.h" @@ -115,29 +114,6 @@ module_param_call(mpt2sas_fwfault_debug, _scsih_set_fwfault_debug, param_get_int, &mpt2sas_fwfault_debug, 0644); /** - * mpt2sas_remove_dead_ioc_func - kthread context to remove dead ioc - * @arg: input argument, used to derive ioc - * - * Return 0 if controller is removed from pci subsystem. - * Return -1 for other case. - */ -static int mpt2sas_remove_dead_ioc_func(void *arg) -{ - struct MPT2SAS_ADAPTER *ioc = (struct MPT2SAS_ADAPTER *)arg; - struct pci_dev *pdev; - - if ((ioc == NULL)) - return -1; - - pdev = ioc->pdev; - if ((pdev == NULL)) - return -1; - pci_stop_and_remove_bus_device(pdev); - return 0; -} - - -/** * _base_fault_reset_work - workq handling ioc fault conditions * @work: input argument, used to derive ioc * Context: sleep. @@ -152,7 +128,6 @@ _base_fault_reset_work(struct work_struct *work) unsigned long flags; u32 doorbell; int rc; - struct task_struct *p; spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); if (ioc->shost_recovery || ioc->pci_error_recovery) @@ -186,23 +161,7 @@ _base_fault_reset_work(struct work_struct *work) * command back from HW. */ ioc->schedule_dead_ioc_flush_running_cmds(ioc); - /* - * Set remove_host flag early since kernel thread will - * take some time to execute. - */ - ioc->remove_host = 1; - /*Remove the Dead Host */ - p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc, - "mpt2sas_dead_ioc_%d", ioc->id); - if (IS_ERR(p)) { - printk(MPT2SAS_ERR_FMT - "%s: Running mpt2sas_dead_ioc thread failed !!!!\n", - ioc->name, __func__); - } else { - printk(MPT2SAS_ERR_FMT - "%s: Running mpt2sas_dead_ioc thread success !!!!\n", - ioc->name, __func__); - } + mpt2sas_scsih_detach(ioc); return; /* don't rearm timer */ } diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h index 4caaac1..94d0e98 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.h +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h @@ -817,6 +817,7 @@ struct MPT2SAS_ADAPTER { u8 broadcast_aen_busy; u16 broadcast_aen_pending; u8 shost_recovery; + u8 shost_attached; struct mutex reset_in_progress_mutex; spinlock_t ioc_reset_in_progress_lock; @@ -1078,6 +1079,7 @@ struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address( void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc); void mpt2sas_scsih_reset_handler(struct MPT2SAS_ADAPTER *ioc, int reset_phase); +void mpt2sas_scsih_detach(struct MPT2SAS_ADAPTER *ioc); /* config shared API */ u8 mpt2sas_config_done(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c index c6bdc92..a06662c 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c @@ -7652,51 +7652,23 @@ _scsih_ir_shutdown(struct MPT2SAS_ADAPTER *ioc) } /** - * _scsih_shutdown - routine call during system shutdown - * @pdev: PCI device struct - * - * Return nothing. - */ -static void -_scsih_shutdown(struct pci_dev *pdev) -{ - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); - struct workqueue_struct *wq; - unsigned long flags; - - ioc->remove_host = 1; - _scsih_fw_event_cleanup_queue(ioc); - - spin_lock_irqsave(&ioc->fw_event_lock, flags); - wq = ioc->firmware_event_thread; - ioc->firmware_event_thread = NULL; - spin_unlock_irqrestore(&ioc->fw_event_lock, flags); - if (wq) - destroy_workqueue(wq); - - _scsih_ir_shutdown(ioc); - mpt2sas_base_detach(ioc); -} - -/** - * _scsih_remove - detach and remove add host - * @pdev: PCI device struct + * mpt2sas_scsih_detach - detach from SCSI midlayer and free resources + * @ioc: per adapter object * - * Routine called when unloading the driver. * Return nothing. */ -static void -_scsih_remove(struct pci_dev *pdev) +void +mpt2sas_scsih_detach(struct MPT2SAS_ADAPTER *ioc) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); struct _sas_port *mpt2sas_port, *next_port; struct _raid_device *raid_device, *next; struct MPT2SAS_TARGET *sas_target_priv_data; struct workqueue_struct *wq; unsigned long flags; + if (!ioc->shost_attached) + return; + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7745,11 +7717,52 @@ _scsih_remove(struct pci_dev *pdev) ioc->sas_hba.num_phys = 0; } - sas_remove_host(shost); + sas_remove_host(ioc->shost); + scsi_remove_host(ioc->shost); + ioc->shost_attached = 0; +} + +/** + * _scsih_shutdown - routine call during system shutdown + * @pdev: PCI device struct + * + * Return nothing. + */ +static void +_scsih_shutdown(struct pci_dev *pdev) +{ + struct Scsi_Host *shost = pci_get_drvdata(pdev); + struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); + + ioc->remove_host = 1; + + mpt2sas_base_stop_watchdog(ioc); + mpt2sas_scsih_detach(ioc); + mpt2sas_base_detach(ioc); +} + +/** + * _scsih_remove - detach and remove add host + * @pdev: PCI device struct + * + * Routine called when unloading the driver. + * Return nothing. + */ +static void +_scsih_remove(struct pci_dev *pdev) +{ + struct Scsi_Host *shost = pci_get_drvdata(pdev); + struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); + + ioc->remove_host = 1; + + mpt2sas_base_stop_watchdog(ioc); + mpt2sas_scsih_detach(ioc); + mpt2sas_base_detach(ioc); list_del(&ioc->list); - scsi_remove_host(shost); - scsi_host_put(shost); + + scsi_host_put(ioc->shost); } /** @@ -8022,6 +8035,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) INIT_LIST_HEAD(&ioc->list); list_add_tail(&ioc->list, &mpt2sas_ioc_list); ioc->shost = shost; + ioc->shost_attached = 1; ioc->id = mpt_ids++; sprintf(ioc->name, "%s%d", MPT2SAS_DRIVER_NAME, ioc->id); ioc->pdev = pdev; @@ -8144,6 +8158,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) list_del(&ioc->list); scsi_remove_host(shost); out_add_shost_fail: + ioc->shost_attached = 0; scsi_host_put(shost); return -ENODEV; } @@ -8164,7 +8179,8 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) pci_power_t device_state; mpt2sas_base_stop_watchdog(ioc); - scsi_block_requests(shost); + if (ioc->shost_attached) + scsi_block_requests(shost); device_state = pci_choose_state(pdev, state); printk(MPT2SAS_INFO_FMT "pdev=0x%p, slot=%s, entering " "operating state [D%d]\n", ioc->name, pdev, @@ -8204,7 +8220,8 @@ _scsih_resume(struct pci_dev *pdev) return r; mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); - scsi_unblock_requests(shost); + if (ioc->shost_attached) + scsi_unblock_requests(shost); mpt2sas_base_start_watchdog(ioc); return 0; } diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 1836003..f8c25a1 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -56,7 +56,6 @@ #include <linux/dma-mapping.h> #include <linux/io.h> #include <linux/time.h> -#include <linux/kthread.h> #include <linux/aer.h> @@ -110,28 +109,6 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug, param_get_int, &mpt3sas_fwfault_debug, 0644); /** - * mpt3sas_remove_dead_ioc_func - kthread context to remove dead ioc - * @arg: input argument, used to derive ioc - * - * Return 0 if controller is removed from pci subsystem. - * Return -1 for other case. - */ -static int mpt3sas_remove_dead_ioc_func(void *arg) -{ - struct MPT3SAS_ADAPTER *ioc = (struct MPT3SAS_ADAPTER *)arg; - struct pci_dev *pdev; - - if ((ioc == NULL)) - return -1; - - pdev = ioc->pdev; - if ((pdev == NULL)) - return -1; - pci_stop_and_remove_bus_device(pdev); - return 0; -} - -/** * _base_fault_reset_work - workq handling ioc fault conditions * @work: input argument, used to derive ioc * Context: sleep. @@ -146,8 +123,6 @@ _base_fault_reset_work(struct work_struct *work) unsigned long flags; u32 doorbell; int rc; - struct task_struct *p; - spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); if (ioc->shost_recovery) @@ -167,22 +142,8 @@ _base_fault_reset_work(struct work_struct *work) * command back from HW. */ ioc->schedule_dead_ioc_flush_running_cmds(ioc); - /* - * Set remove_host flag early since kernel thread will - * take some time to execute. - */ - ioc->remove_host = 1; - /*Remove the Dead Host */ - p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc, - "mpt3sas_dead_ioc_%d", ioc->id); - if (IS_ERR(p)) - pr_err(MPT3SAS_FMT - "%s: Running mpt3sas_dead_ioc thread failed !!!!\n", - ioc->name, __func__); - else - pr_err(MPT3SAS_FMT - "%s: Running mpt3sas_dead_ioc thread success !!!!\n", - ioc->name, __func__); + mpt3sas_scsih_detach(ioc); + return; /* don't rearm timer */ } diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index 994656c..9fbf56c 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -731,6 +731,7 @@ struct MPT3SAS_ADAPTER { u8 broadcast_aen_busy; u16 broadcast_aen_pending; u8 shost_recovery; + u8 shost_attached; struct mutex reset_in_progress_mutex; spinlock_t ioc_reset_in_progress_lock; @@ -990,6 +991,7 @@ int mpt3sas_port_enable(struct MPT3SAS_ADAPTER *ioc); u8 mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, u8 msix_index, u32 reply); void mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase); +void mpt3sas_scsih_detach(struct MPT3SAS_ADAPTER *ioc); int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, uint id, uint lun, u8 type, u16 smid_task, diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index dcbf7c8..9ae0914 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -7365,22 +7365,23 @@ _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc) } /** - * _scsih_remove - detach and remove add host - * @pdev: PCI device struct + * mpt3sas_scsih_detach - detach from SCSI midlayer and free resources + * @ioc: per adapter object * - * Routine called when unloading the driver. * Return nothing. */ -static void _scsih_remove(struct pci_dev *pdev) +void +mpt3sas_scsih_detach(struct MPT3SAS_ADAPTER *ioc) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); struct _sas_port *mpt3sas_port, *next_port; struct _raid_device *raid_device, *next; struct MPT3SAS_TARGET *sas_target_priv_data; struct workqueue_struct *wq; unsigned long flags; + if (!ioc->shost_attached) + return; + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7429,11 +7430,9 @@ static void _scsih_remove(struct pci_dev *pdev) ioc->sas_hba.num_phys = 0; } - sas_remove_host(shost); - mpt3sas_base_detach(ioc); - list_del(&ioc->list); - scsi_remove_host(shost); - scsi_host_put(shost); + sas_remove_host(ioc->shost); + scsi_remove_host(ioc->shost); + ioc->shost_attached = 0; } /** @@ -7447,23 +7446,37 @@ _scsih_shutdown(struct pci_dev *pdev) { struct Scsi_Host *shost = pci_get_drvdata(pdev); struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); - struct workqueue_struct *wq; - unsigned long flags; ioc->remove_host = 1; - _scsih_fw_event_cleanup_queue(ioc); - - spin_lock_irqsave(&ioc->fw_event_lock, flags); - wq = ioc->firmware_event_thread; - ioc->firmware_event_thread = NULL; - spin_unlock_irqrestore(&ioc->fw_event_lock, flags); - if (wq) - destroy_workqueue(wq); - _scsih_ir_shutdown(ioc); + mpt3sas_base_stop_watchdog(ioc); + mpt3sas_scsih_detach(ioc); mpt3sas_base_detach(ioc); } +/** + * _scsih_remove - detach and remove add host + * @pdev: PCI device struct + * + * Routine called when unloading the driver. + * Return nothing. + */ +static void +_scsih_remove(struct pci_dev *pdev) +{ + struct Scsi_Host *shost = pci_get_drvdata(pdev); + struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); + + ioc->remove_host = 1; + + mpt3sas_base_stop_watchdog(ioc); + mpt3sas_scsih_detach(ioc); + + mpt3sas_base_detach(ioc); + list_del(&ioc->list); + + scsi_host_put(ioc->shost); +} /** * _scsih_probe_boot_devices - reports 1st device @@ -7735,6 +7748,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) INIT_LIST_HEAD(&ioc->list); list_add_tail(&ioc->list, &mpt3sas_ioc_list); ioc->shost = shost; + ioc->shost_attached = 1; ioc->id = mpt_ids++; sprintf(ioc->name, "%s%d", MPT3SAS_DRIVER_NAME, ioc->id); ioc->pdev = pdev; @@ -7839,6 +7853,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) list_del(&ioc->list); scsi_remove_host(shost); out_add_shost_fail: + ioc->shost_attached = 0; scsi_host_put(shost); return -ENODEV; } @@ -7860,7 +7875,8 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) mpt3sas_base_stop_watchdog(ioc); flush_scheduled_work(); - scsi_block_requests(shost); + if (ioc->shost_attached) + scsi_block_requests(shost); device_state = pci_choose_state(pdev, state); pr_info(MPT3SAS_FMT "pdev=0x%p, slot=%s, entering operating state [D%d]\n", @@ -7899,7 +7915,8 @@ _scsih_resume(struct pci_dev *pdev) return r; mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); - scsi_unblock_requests(shost); + if (ioc->shost_attached) + scsi_unblock_requests(shost); mpt3sas_base_start_watchdog(ioc); return 0; } -- 1.8.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: mpt2sas,mpt3sas watchdog device removal 2013-05-15 17:29 ` mpt2sas,mpt3sas watchdog device removal Joe Lawrence @ 2013-05-17 15:29 ` Bjorn Helgaas 2013-05-17 21:42 ` Joe Lawrence 2013-07-16 12:00 ` Reddy, Sreekanth 1 sibling, 1 reply; 12+ messages in thread From: Bjorn Helgaas @ 2013-05-17 15:29 UTC (permalink / raw) To: Joe Lawrence Cc: linux-scsi@vger.kernel.org, James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, linux-pci@vger.kernel.org [+cc linux-pci] On Wed, May 15, 2013 at 11:29 AM, Joe Lawrence <joe.lawrence@stratus.com> wrote: > From 84ac7a35ebd61e84d4254eae78bb967de17254c2 Mon Sep 17 00:00:00 2001 > From: Joe Lawrence <joe.lawrence@stratus.com> > Date: Wed, 15 May 2013 12:52:31 -0400 > Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal > safe > > Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce > device removal races with PCI callback functions. > > Simplify the mpt2sas watchdog mechanism by separating PCI device removal > from SCSI midlayer detachment. When the watchdog wishes to remove a SCSI > device from the topology, detach from the SCSI midlayer, leaving the PCI > device alone. Adjust various pci_driver callbacks to account for a > potentially SCSI detached PCI device. I don't know the details of the SCSI detachment, but this approach looks much cleaner to me. Thanks for doing all this work, Joe. I know this isn't finished, but it looks like a great step in making these drivers simpler and more reliable. Bjorn > Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> > --- > drivers/scsi/mpt2sas/mpt2sas_base.c | 43 +--------------- > drivers/scsi/mpt2sas/mpt2sas_base.h | 2 + > drivers/scsi/mpt2sas/mpt2sas_scsih.c | 97 +++++++++++++++++++++--------------- > drivers/scsi/mpt3sas/mpt3sas_base.c | 43 +--------------- > drivers/scsi/mpt3sas/mpt3sas_base.h | 2 + > drivers/scsi/mpt3sas/mpt3sas_scsih.c | 65 +++++++++++++++--------- > 6 files changed, 105 insertions(+), 147 deletions(-) > > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c > index bcb23d2..cc1bf28 100644 > --- a/drivers/scsi/mpt2sas/mpt2sas_base.c > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c > @@ -57,7 +57,6 @@ > #include <linux/sort.h> > #include <linux/io.h> > #include <linux/time.h> > -#include <linux/kthread.h> > #include <linux/aer.h> > > #include "mpt2sas_base.h" > @@ -115,29 +114,6 @@ module_param_call(mpt2sas_fwfault_debug, _scsih_set_fwfault_debug, > param_get_int, &mpt2sas_fwfault_debug, 0644); > > /** > - * mpt2sas_remove_dead_ioc_func - kthread context to remove dead ioc > - * @arg: input argument, used to derive ioc > - * > - * Return 0 if controller is removed from pci subsystem. > - * Return -1 for other case. > - */ > -static int mpt2sas_remove_dead_ioc_func(void *arg) > -{ > - struct MPT2SAS_ADAPTER *ioc = (struct MPT2SAS_ADAPTER *)arg; > - struct pci_dev *pdev; > - > - if ((ioc == NULL)) > - return -1; > - > - pdev = ioc->pdev; > - if ((pdev == NULL)) > - return -1; > - pci_stop_and_remove_bus_device(pdev); > - return 0; > -} > - > - > -/** > * _base_fault_reset_work - workq handling ioc fault conditions > * @work: input argument, used to derive ioc > * Context: sleep. > @@ -152,7 +128,6 @@ _base_fault_reset_work(struct work_struct *work) > unsigned long flags; > u32 doorbell; > int rc; > - struct task_struct *p; > > spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); > if (ioc->shost_recovery || ioc->pci_error_recovery) > @@ -186,23 +161,7 @@ _base_fault_reset_work(struct work_struct *work) > * command back from HW. > */ > ioc->schedule_dead_ioc_flush_running_cmds(ioc); > - /* > - * Set remove_host flag early since kernel thread will > - * take some time to execute. > - */ > - ioc->remove_host = 1; > - /*Remove the Dead Host */ > - p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc, > - "mpt2sas_dead_ioc_%d", ioc->id); > - if (IS_ERR(p)) { > - printk(MPT2SAS_ERR_FMT > - "%s: Running mpt2sas_dead_ioc thread failed !!!!\n", > - ioc->name, __func__); > - } else { > - printk(MPT2SAS_ERR_FMT > - "%s: Running mpt2sas_dead_ioc thread success !!!!\n", > - ioc->name, __func__); > - } > + mpt2sas_scsih_detach(ioc); > > return; /* don't rearm timer */ > } > diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h > index 4caaac1..94d0e98 100644 > --- a/drivers/scsi/mpt2sas/mpt2sas_base.h > +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h > @@ -817,6 +817,7 @@ struct MPT2SAS_ADAPTER { > u8 broadcast_aen_busy; > u16 broadcast_aen_pending; > u8 shost_recovery; > + u8 shost_attached; > > struct mutex reset_in_progress_mutex; > spinlock_t ioc_reset_in_progress_lock; > @@ -1078,6 +1079,7 @@ struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address( > void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc); > > void mpt2sas_scsih_reset_handler(struct MPT2SAS_ADAPTER *ioc, int reset_phase); > +void mpt2sas_scsih_detach(struct MPT2SAS_ADAPTER *ioc); > > /* config shared API */ > u8 mpt2sas_config_done(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, > diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c > index c6bdc92..a06662c 100644 > --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c > +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c > @@ -7652,51 +7652,23 @@ _scsih_ir_shutdown(struct MPT2SAS_ADAPTER *ioc) > } > > /** > - * _scsih_shutdown - routine call during system shutdown > - * @pdev: PCI device struct > - * > - * Return nothing. > - */ > -static void > -_scsih_shutdown(struct pci_dev *pdev) > -{ > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > - struct workqueue_struct *wq; > - unsigned long flags; > - > - ioc->remove_host = 1; > - _scsih_fw_event_cleanup_queue(ioc); > - > - spin_lock_irqsave(&ioc->fw_event_lock, flags); > - wq = ioc->firmware_event_thread; > - ioc->firmware_event_thread = NULL; > - spin_unlock_irqrestore(&ioc->fw_event_lock, flags); > - if (wq) > - destroy_workqueue(wq); > - > - _scsih_ir_shutdown(ioc); > - mpt2sas_base_detach(ioc); > -} > - > -/** > - * _scsih_remove - detach and remove add host > - * @pdev: PCI device struct > + * mpt2sas_scsih_detach - detach from SCSI midlayer and free resources > + * @ioc: per adapter object > * > - * Routine called when unloading the driver. > * Return nothing. > */ > -static void > -_scsih_remove(struct pci_dev *pdev) > +void > +mpt2sas_scsih_detach(struct MPT2SAS_ADAPTER *ioc) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > struct _sas_port *mpt2sas_port, *next_port; > struct _raid_device *raid_device, *next; > struct MPT2SAS_TARGET *sas_target_priv_data; > struct workqueue_struct *wq; > unsigned long flags; > > + if (!ioc->shost_attached) > + return; > + > ioc->remove_host = 1; > _scsih_fw_event_cleanup_queue(ioc); > > @@ -7745,11 +7717,52 @@ _scsih_remove(struct pci_dev *pdev) > ioc->sas_hba.num_phys = 0; > } > > - sas_remove_host(shost); > + sas_remove_host(ioc->shost); > + scsi_remove_host(ioc->shost); > + ioc->shost_attached = 0; > +} > + > +/** > + * _scsih_shutdown - routine call during system shutdown > + * @pdev: PCI device struct > + * > + * Return nothing. > + */ > +static void > +_scsih_shutdown(struct pci_dev *pdev) > +{ > + struct Scsi_Host *shost = pci_get_drvdata(pdev); > + struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > + > + ioc->remove_host = 1; > + > + mpt2sas_base_stop_watchdog(ioc); > + mpt2sas_scsih_detach(ioc); > + mpt2sas_base_detach(ioc); > +} > + > +/** > + * _scsih_remove - detach and remove add host > + * @pdev: PCI device struct > + * > + * Routine called when unloading the driver. > + * Return nothing. > + */ > +static void > +_scsih_remove(struct pci_dev *pdev) > +{ > + struct Scsi_Host *shost = pci_get_drvdata(pdev); > + struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); > + > + ioc->remove_host = 1; > + > + mpt2sas_base_stop_watchdog(ioc); > + mpt2sas_scsih_detach(ioc); > + > mpt2sas_base_detach(ioc); > list_del(&ioc->list); > - scsi_remove_host(shost); > - scsi_host_put(shost); > + > + scsi_host_put(ioc->shost); > } > > /** > @@ -8022,6 +8035,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) > INIT_LIST_HEAD(&ioc->list); > list_add_tail(&ioc->list, &mpt2sas_ioc_list); > ioc->shost = shost; > + ioc->shost_attached = 1; > ioc->id = mpt_ids++; > sprintf(ioc->name, "%s%d", MPT2SAS_DRIVER_NAME, ioc->id); > ioc->pdev = pdev; > @@ -8144,6 +8158,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) > list_del(&ioc->list); > scsi_remove_host(shost); > out_add_shost_fail: > + ioc->shost_attached = 0; > scsi_host_put(shost); > return -ENODEV; > } > @@ -8164,7 +8179,8 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > pci_power_t device_state; > > mpt2sas_base_stop_watchdog(ioc); > - scsi_block_requests(shost); > + if (ioc->shost_attached) > + scsi_block_requests(shost); > device_state = pci_choose_state(pdev, state); > printk(MPT2SAS_INFO_FMT "pdev=0x%p, slot=%s, entering " > "operating state [D%d]\n", ioc->name, pdev, > @@ -8204,7 +8220,8 @@ _scsih_resume(struct pci_dev *pdev) > return r; > > mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); > - scsi_unblock_requests(shost); > + if (ioc->shost_attached) > + scsi_unblock_requests(shost); > mpt2sas_base_start_watchdog(ioc); > return 0; > } > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c > index 1836003..f8c25a1 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_base.c > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c > @@ -56,7 +56,6 @@ > #include <linux/dma-mapping.h> > #include <linux/io.h> > #include <linux/time.h> > -#include <linux/kthread.h> > #include <linux/aer.h> > > > @@ -110,28 +109,6 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug, > param_get_int, &mpt3sas_fwfault_debug, 0644); > > /** > - * mpt3sas_remove_dead_ioc_func - kthread context to remove dead ioc > - * @arg: input argument, used to derive ioc > - * > - * Return 0 if controller is removed from pci subsystem. > - * Return -1 for other case. > - */ > -static int mpt3sas_remove_dead_ioc_func(void *arg) > -{ > - struct MPT3SAS_ADAPTER *ioc = (struct MPT3SAS_ADAPTER *)arg; > - struct pci_dev *pdev; > - > - if ((ioc == NULL)) > - return -1; > - > - pdev = ioc->pdev; > - if ((pdev == NULL)) > - return -1; > - pci_stop_and_remove_bus_device(pdev); > - return 0; > -} > - > -/** > * _base_fault_reset_work - workq handling ioc fault conditions > * @work: input argument, used to derive ioc > * Context: sleep. > @@ -146,8 +123,6 @@ _base_fault_reset_work(struct work_struct *work) > unsigned long flags; > u32 doorbell; > int rc; > - struct task_struct *p; > - > > spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); > if (ioc->shost_recovery) > @@ -167,22 +142,8 @@ _base_fault_reset_work(struct work_struct *work) > * command back from HW. > */ > ioc->schedule_dead_ioc_flush_running_cmds(ioc); > - /* > - * Set remove_host flag early since kernel thread will > - * take some time to execute. > - */ > - ioc->remove_host = 1; > - /*Remove the Dead Host */ > - p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc, > - "mpt3sas_dead_ioc_%d", ioc->id); > - if (IS_ERR(p)) > - pr_err(MPT3SAS_FMT > - "%s: Running mpt3sas_dead_ioc thread failed !!!!\n", > - ioc->name, __func__); > - else > - pr_err(MPT3SAS_FMT > - "%s: Running mpt3sas_dead_ioc thread success !!!!\n", > - ioc->name, __func__); > + mpt3sas_scsih_detach(ioc); > + > return; /* don't rearm timer */ > } > > diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h > index 994656c..9fbf56c 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_base.h > +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h > @@ -731,6 +731,7 @@ struct MPT3SAS_ADAPTER { > u8 broadcast_aen_busy; > u16 broadcast_aen_pending; > u8 shost_recovery; > + u8 shost_attached; > > struct mutex reset_in_progress_mutex; > spinlock_t ioc_reset_in_progress_lock; > @@ -990,6 +991,7 @@ int mpt3sas_port_enable(struct MPT3SAS_ADAPTER *ioc); > u8 mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, u8 msix_index, > u32 reply); > void mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase); > +void mpt3sas_scsih_detach(struct MPT3SAS_ADAPTER *ioc); > > int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, > uint channel, uint id, uint lun, u8 type, u16 smid_task, > diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c > index dcbf7c8..9ae0914 100644 > --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c > +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c > @@ -7365,22 +7365,23 @@ _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc) > } > > /** > - * _scsih_remove - detach and remove add host > - * @pdev: PCI device struct > + * mpt3sas_scsih_detach - detach from SCSI midlayer and free resources > + * @ioc: per adapter object > * > - * Routine called when unloading the driver. > * Return nothing. > */ > -static void _scsih_remove(struct pci_dev *pdev) > +void > +mpt3sas_scsih_detach(struct MPT3SAS_ADAPTER *ioc) > { > - struct Scsi_Host *shost = pci_get_drvdata(pdev); > - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); > struct _sas_port *mpt3sas_port, *next_port; > struct _raid_device *raid_device, *next; > struct MPT3SAS_TARGET *sas_target_priv_data; > struct workqueue_struct *wq; > unsigned long flags; > > + if (!ioc->shost_attached) > + return; > + > ioc->remove_host = 1; > _scsih_fw_event_cleanup_queue(ioc); > > @@ -7429,11 +7430,9 @@ static void _scsih_remove(struct pci_dev *pdev) > ioc->sas_hba.num_phys = 0; > } > > - sas_remove_host(shost); > - mpt3sas_base_detach(ioc); > - list_del(&ioc->list); > - scsi_remove_host(shost); > - scsi_host_put(shost); > + sas_remove_host(ioc->shost); > + scsi_remove_host(ioc->shost); > + ioc->shost_attached = 0; > } > > /** > @@ -7447,23 +7446,37 @@ _scsih_shutdown(struct pci_dev *pdev) > { > struct Scsi_Host *shost = pci_get_drvdata(pdev); > struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); > - struct workqueue_struct *wq; > - unsigned long flags; > > ioc->remove_host = 1; > - _scsih_fw_event_cleanup_queue(ioc); > - > - spin_lock_irqsave(&ioc->fw_event_lock, flags); > - wq = ioc->firmware_event_thread; > - ioc->firmware_event_thread = NULL; > - spin_unlock_irqrestore(&ioc->fw_event_lock, flags); > - if (wq) > - destroy_workqueue(wq); > > - _scsih_ir_shutdown(ioc); > + mpt3sas_base_stop_watchdog(ioc); > + mpt3sas_scsih_detach(ioc); > mpt3sas_base_detach(ioc); > } > > +/** > + * _scsih_remove - detach and remove add host > + * @pdev: PCI device struct > + * > + * Routine called when unloading the driver. > + * Return nothing. > + */ > +static void > +_scsih_remove(struct pci_dev *pdev) > +{ > + struct Scsi_Host *shost = pci_get_drvdata(pdev); > + struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); > + > + ioc->remove_host = 1; > + > + mpt3sas_base_stop_watchdog(ioc); > + mpt3sas_scsih_detach(ioc); > + > + mpt3sas_base_detach(ioc); > + list_del(&ioc->list); > + > + scsi_host_put(ioc->shost); > +} > > /** > * _scsih_probe_boot_devices - reports 1st device > @@ -7735,6 +7748,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) > INIT_LIST_HEAD(&ioc->list); > list_add_tail(&ioc->list, &mpt3sas_ioc_list); > ioc->shost = shost; > + ioc->shost_attached = 1; > ioc->id = mpt_ids++; > sprintf(ioc->name, "%s%d", MPT3SAS_DRIVER_NAME, ioc->id); > ioc->pdev = pdev; > @@ -7839,6 +7853,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) > list_del(&ioc->list); > scsi_remove_host(shost); > out_add_shost_fail: > + ioc->shost_attached = 0; > scsi_host_put(shost); > return -ENODEV; > } > @@ -7860,7 +7875,8 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) > > mpt3sas_base_stop_watchdog(ioc); > flush_scheduled_work(); > - scsi_block_requests(shost); > + if (ioc->shost_attached) > + scsi_block_requests(shost); > device_state = pci_choose_state(pdev, state); > pr_info(MPT3SAS_FMT > "pdev=0x%p, slot=%s, entering operating state [D%d]\n", > @@ -7899,7 +7915,8 @@ _scsih_resume(struct pci_dev *pdev) > return r; > > mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); > - scsi_unblock_requests(shost); > + if (ioc->shost_attached) > + scsi_unblock_requests(shost); > mpt3sas_base_start_watchdog(ioc); > return 0; > } > -- > 1.8.1.4 > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: mpt2sas,mpt3sas watchdog device removal 2013-05-17 15:29 ` Bjorn Helgaas @ 2013-05-17 21:42 ` Joe Lawrence 2013-10-10 21:59 ` Bjorn Helgaas 0 siblings, 1 reply; 12+ messages in thread From: Joe Lawrence @ 2013-05-17 21:42 UTC (permalink / raw) To: Bjorn Helgaas Cc: linux-scsi@vger.kernel.org, James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, linux-pci@vger.kernel.org On Fri, 17 May 2013 09:29:06 -0600 Bjorn Helgaas <bhelgaas@google.com> wrote: > [+cc linux-pci] > > On Wed, May 15, 2013 at 11:29 AM, Joe Lawrence > <joe.lawrence@stratus.com> wrote: > > From 84ac7a35ebd61e84d4254eae78bb967de17254c2 Mon Sep 17 00:00:00 > > 2001 From: Joe Lawrence <joe.lawrence@stratus.com> > > Date: Wed, 15 May 2013 12:52:31 -0400 > > Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device > > removal safe > > > > Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce > > device removal races with PCI callback functions. > > > > Simplify the mpt2sas watchdog mechanism by separating PCI device > > removal from SCSI midlayer detachment. When the watchdog wishes to > > remove a SCSI device from the topology, detach from the SCSI > > midlayer, leaving the PCI device alone. Adjust various pci_driver > > callbacks to account for a potentially SCSI detached PCI device. > > I don't know the details of the SCSI detachment, but this approach > looks much cleaner to me. Thanks for commenting, Bjorn. I think this approach more closely represents what this watchdog is trying to accomplish. Off list, Sreekanth from LSI tested and noticed a few issues with this patch: - mpt2sas_base_stop_watchdog is called twice: The call from mpt2sas_base_detach is safe, but now unnecessary (as a call was added earlier up in the PCI driver callbacks to ensure that the watchdog was out of the way.) This second invocation can be removed. - If the watchdog detects a bad IOC, the watchdog remains running: The watchdog workqueue isn't cleaned up until mpt2sas_base_stop_watchdog is called, so in the case that the watchdog removes the device from SCSI topo, the workqueue will remain unused until PCI .remove/.shutdown cleans it up. Perhaps a single watchdog that iterates over all adapters would be simpler? Finally, if SCSI topo detachment is all that is interesting here, would it make more sense to move the watchdog into the MPT "scsi" code? I haven't looked at the code yet, but this might make an MPT fusion patch easier (due to dependencies between its "scsi" and "base" modules). Regards, -- Joe ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: mpt2sas,mpt3sas watchdog device removal 2013-05-17 21:42 ` Joe Lawrence @ 2013-10-10 21:59 ` Bjorn Helgaas 2013-10-21 14:24 ` Joe Lawrence 0 siblings, 1 reply; 12+ messages in thread From: Bjorn Helgaas @ 2013-10-10 21:59 UTC (permalink / raw) To: Joe Lawrence Cc: linux-scsi@vger.kernel.org, James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, linux-pci@vger.kernel.org On Fri, May 17, 2013 at 3:42 PM, Joe Lawrence <joe.lawrence@stratus.com> wrote: > On Fri, 17 May 2013 09:29:06 -0600 > Bjorn Helgaas <bhelgaas@google.com> wrote: > >> [+cc linux-pci] >> >> On Wed, May 15, 2013 at 11:29 AM, Joe Lawrence >> <joe.lawrence@stratus.com> wrote: >> > From 84ac7a35ebd61e84d4254eae78bb967de17254c2 Mon Sep 17 00:00:00 >> > 2001 From: Joe Lawrence <joe.lawrence@stratus.com> >> > Date: Wed, 15 May 2013 12:52:31 -0400 >> > Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device >> > removal safe >> > >> > Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce >> > device removal races with PCI callback functions. >> > >> > Simplify the mpt2sas watchdog mechanism by separating PCI device >> > removal from SCSI midlayer detachment. When the watchdog wishes to >> > remove a SCSI device from the topology, detach from the SCSI >> > midlayer, leaving the PCI device alone. Adjust various pci_driver >> > callbacks to account for a potentially SCSI detached PCI device. >> >> I don't know the details of the SCSI detachment, but this approach >> looks much cleaner to me. > > Thanks for commenting, Bjorn. I think this approach more closely > represents what this watchdog is trying to accomplish. > > Off list, Sreekanth from LSI tested and noticed a few issues with this > patch: > > - mpt2sas_base_stop_watchdog is called twice: The call from > mpt2sas_base_detach is safe, but now unnecessary (as a call was > added earlier up in the PCI driver callbacks to ensure that the > watchdog was out of the way.) This second invocation can be removed. > > - If the watchdog detects a bad IOC, the watchdog remains running: > The watchdog workqueue isn't cleaned up until > mpt2sas_base_stop_watchdog is called, so in the case that the > watchdog removes the device from SCSI topo, the workqueue will > remain unused until PCI .remove/.shutdown cleans it up. Perhaps a > single watchdog that iterates over all adapters would be simpler? > > Finally, if SCSI topo detachment is all that is interesting here, would > it make more sense to move the watchdog into the MPT "scsi" code? I > haven't looked at the code yet, but this might make an MPT fusion patch > easier (due to dependencies between its "scsi" and "base" modules). Hi Joe, I noticed this when looking through old email. I can't remember if anything happened with this. It seems like nice work; did it go anywhere? Bjorn ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: mpt2sas,mpt3sas watchdog device removal 2013-10-10 21:59 ` Bjorn Helgaas @ 2013-10-21 14:24 ` Joe Lawrence 0 siblings, 0 replies; 12+ messages in thread From: Joe Lawrence @ 2013-10-21 14:24 UTC (permalink / raw) To: Bjorn Helgaas Cc: linux-scsi@vger.kernel.org, James E.J. Bottomley, Sreekanth Reddy, Desai, Kashyap, Nagalakshmi Nandigama, linux-pci@vger.kernel.org On Thu, 10 Oct 2013 15:59:28 -0600 Bjorn Helgaas <bhelgaas@google.com> wrote: > On Fri, May 17, 2013 at 3:42 PM, Joe Lawrence <joe.lawrence@stratus.com> wrote: > > On Fri, 17 May 2013 09:29:06 -0600 > > Bjorn Helgaas <bhelgaas@google.com> wrote: > > > >> [+cc linux-pci] > >> > >> On Wed, May 15, 2013 at 11:29 AM, Joe Lawrence > >> <joe.lawrence@stratus.com> wrote: > >> > From 84ac7a35ebd61e84d4254eae78bb967de17254c2 Mon Sep 17 00:00:00 > >> > 2001 From: Joe Lawrence <joe.lawrence@stratus.com> > >> > Date: Wed, 15 May 2013 12:52:31 -0400 > >> > Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device > >> > removal safe > >> > > >> > Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce > >> > device removal races with PCI callback functions. > >> > > >> > Simplify the mpt2sas watchdog mechanism by separating PCI device > >> > removal from SCSI midlayer detachment. When the watchdog wishes to > >> > remove a SCSI device from the topology, detach from the SCSI > >> > midlayer, leaving the PCI device alone. Adjust various pci_driver > >> > callbacks to account for a potentially SCSI detached PCI device. > >> > >> I don't know the details of the SCSI detachment, but this approach > >> looks much cleaner to me. > > > > Thanks for commenting, Bjorn. I think this approach more closely > > represents what this watchdog is trying to accomplish. > > > > Off list, Sreekanth from LSI tested and noticed a few issues with this > > patch: > > > > - mpt2sas_base_stop_watchdog is called twice: The call from > > mpt2sas_base_detach is safe, but now unnecessary (as a call was > > added earlier up in the PCI driver callbacks to ensure that the > > watchdog was out of the way.) This second invocation can be removed. > > > > - If the watchdog detects a bad IOC, the watchdog remains running: > > The watchdog workqueue isn't cleaned up until > > mpt2sas_base_stop_watchdog is called, so in the case that the > > watchdog removes the device from SCSI topo, the workqueue will > > remain unused until PCI .remove/.shutdown cleans it up. Perhaps a > > single watchdog that iterates over all adapters would be simpler? > > > > Finally, if SCSI topo detachment is all that is interesting here, would > > it make more sense to move the watchdog into the MPT "scsi" code? I > > haven't looked at the code yet, but this might make an MPT fusion patch > > easier (due to dependencies between its "scsi" and "base" modules). > > Hi Joe, > > I noticed this when looking through old email. I can't remember if > anything happened with this. It seems like nice work; did it go > anywhere? Hello Bjorn, Sorry for the late reply, I was on vacation last week. This patchset started as an attempt to push upstream some of the changes that Stratus makes to the fusion/mpt* drivers. I had hoped to come up with a patchset that fixed hotplug vs. watchdog removal in each of the three LSI drivers. Unfortunately this was more of a spare time project and other work took priority. There is a chance that Stratus maybe revisiting the fusion driver in the near future. I will keep this patchset in mind if that happens. Regards, -- Joe ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: mpt2sas,mpt3sas watchdog device removal 2013-05-15 17:29 ` mpt2sas,mpt3sas watchdog device removal Joe Lawrence 2013-05-17 15:29 ` Bjorn Helgaas @ 2013-07-16 12:00 ` Reddy, Sreekanth 2013-07-16 12:03 ` James Bottomley 1 sibling, 1 reply; 12+ messages in thread From: Reddy, Sreekanth @ 2013-07-16 12:00 UTC (permalink / raw) To: Joe Lawrence, linux-scsi@vger.kernel.org; +Cc: James E.J. Bottomley James, This patch seem to be fine. Please consider this patch. Regards, Sreekanth. -----Original Message----- From: Joe Lawrence [mailto:joe.lawrence@stratus.com] Sent: Wednesday, May 15, 2013 11:00 PM To: linux-scsi@vger.kernel.org Cc: James E.J. Bottomley; Reddy, Sreekanth; Desai, Kashyap; Nandigama, Nagalakshmi; Bjorn Helgaas Subject: Re: mpt2sas,mpt3sas watchdog device removal >From 84ac7a35ebd61e84d4254eae78bb967de17254c2 Mon Sep 17 00:00:00 2001 From: Joe Lawrence <joe.lawrence@stratus.com> Date: Wed, 15 May 2013 12:52:31 -0400 Subject: [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe Calling pci_stop_and_remove_bus_device from a SCSI LLD may introduce device removal races with PCI callback functions. Simplify the mpt2sas watchdog mechanism by separating PCI device removal from SCSI midlayer detachment. When the watchdog wishes to remove a SCSI device from the topology, detach from the SCSI midlayer, leaving the PCI device alone. Adjust various pci_driver callbacks to account for a potentially SCSI detached PCI device. Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com> --- drivers/scsi/mpt2sas/mpt2sas_base.c | 43 +--------------- drivers/scsi/mpt2sas/mpt2sas_base.h | 2 + drivers/scsi/mpt2sas/mpt2sas_scsih.c | 97 +++++++++++++++++++++--------------- drivers/scsi/mpt3sas/mpt3sas_base.c | 43 +--------------- drivers/scsi/mpt3sas/mpt3sas_base.h | 2 + drivers/scsi/mpt3sas/mpt3sas_scsih.c | 65 +++++++++++++++--------- 6 files changed, 105 insertions(+), 147 deletions(-) diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c index bcb23d2..cc1bf28 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.c +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c @@ -57,7 +57,6 @@ #include <linux/sort.h> #include <linux/io.h> #include <linux/time.h> -#include <linux/kthread.h> #include <linux/aer.h> #include "mpt2sas_base.h" @@ -115,29 +114,6 @@ module_param_call(mpt2sas_fwfault_debug, _scsih_set_fwfault_debug, param_get_int, &mpt2sas_fwfault_debug, 0644); /** - * mpt2sas_remove_dead_ioc_func - kthread context to remove dead ioc - * @arg: input argument, used to derive ioc - * - * Return 0 if controller is removed from pci subsystem. - * Return -1 for other case. - */ -static int mpt2sas_remove_dead_ioc_func(void *arg) -{ - struct MPT2SAS_ADAPTER *ioc = (struct MPT2SAS_ADAPTER *)arg; - struct pci_dev *pdev; - - if ((ioc == NULL)) - return -1; - - pdev = ioc->pdev; - if ((pdev == NULL)) - return -1; - pci_stop_and_remove_bus_device(pdev); - return 0; -} - - -/** * _base_fault_reset_work - workq handling ioc fault conditions * @work: input argument, used to derive ioc * Context: sleep. @@ -152,7 +128,6 @@ _base_fault_reset_work(struct work_struct *work) unsigned long flags; u32 doorbell; int rc; - struct task_struct *p; spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); if (ioc->shost_recovery || ioc->pci_error_recovery) @@ -186,23 +161,7 @@ _base_fault_reset_work(struct work_struct *work) * command back from HW. */ ioc->schedule_dead_ioc_flush_running_cmds(ioc); - /* - * Set remove_host flag early since kernel thread will - * take some time to execute. - */ - ioc->remove_host = 1; - /*Remove the Dead Host */ - p = kthread_run(mpt2sas_remove_dead_ioc_func, ioc, - "mpt2sas_dead_ioc_%d", ioc->id); - if (IS_ERR(p)) { - printk(MPT2SAS_ERR_FMT - "%s: Running mpt2sas_dead_ioc thread failed !!!!\n", - ioc->name, __func__); - } else { - printk(MPT2SAS_ERR_FMT - "%s: Running mpt2sas_dead_ioc thread success !!!!\n", - ioc->name, __func__); - } + mpt2sas_scsih_detach(ioc); return; /* don't rearm timer */ } diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.h b/drivers/scsi/mpt2sas/mpt2sas_base.h index 4caaac1..94d0e98 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_base.h +++ b/drivers/scsi/mpt2sas/mpt2sas_base.h @@ -817,6 +817,7 @@ struct MPT2SAS_ADAPTER { u8 broadcast_aen_busy; u16 broadcast_aen_pending; u8 shost_recovery; + u8 shost_attached; struct mutex reset_in_progress_mutex; spinlock_t ioc_reset_in_progress_lock; @@ -1078,6 +1079,7 @@ struct _sas_device *mpt2sas_scsih_sas_device_find_by_sas_address( void mpt2sas_port_enable_complete(struct MPT2SAS_ADAPTER *ioc); void mpt2sas_scsih_reset_handler(struct MPT2SAS_ADAPTER *ioc, int reset_phase); +void mpt2sas_scsih_detach(struct MPT2SAS_ADAPTER *ioc); /* config shared API */ u8 mpt2sas_config_done(struct MPT2SAS_ADAPTER *ioc, u16 smid, u8 msix_index, diff --git a/drivers/scsi/mpt2sas/mpt2sas_scsih.c b/drivers/scsi/mpt2sas/mpt2sas_scsih.c index c6bdc92..a06662c 100644 --- a/drivers/scsi/mpt2sas/mpt2sas_scsih.c +++ b/drivers/scsi/mpt2sas/mpt2sas_scsih.c @@ -7652,51 +7652,23 @@ _scsih_ir_shutdown(struct MPT2SAS_ADAPTER *ioc) } /** - * _scsih_shutdown - routine call during system shutdown - * @pdev: PCI device struct - * - * Return nothing. - */ -static void -_scsih_shutdown(struct pci_dev *pdev) -{ - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); - struct workqueue_struct *wq; - unsigned long flags; - - ioc->remove_host = 1; - _scsih_fw_event_cleanup_queue(ioc); - - spin_lock_irqsave(&ioc->fw_event_lock, flags); - wq = ioc->firmware_event_thread; - ioc->firmware_event_thread = NULL; - spin_unlock_irqrestore(&ioc->fw_event_lock, flags); - if (wq) - destroy_workqueue(wq); - - _scsih_ir_shutdown(ioc); - mpt2sas_base_detach(ioc); -} - -/** - * _scsih_remove - detach and remove add host - * @pdev: PCI device struct + * mpt2sas_scsih_detach - detach from SCSI midlayer and free resources + * @ioc: per adapter object * - * Routine called when unloading the driver. * Return nothing. */ -static void -_scsih_remove(struct pci_dev *pdev) +void +mpt2sas_scsih_detach(struct MPT2SAS_ADAPTER *ioc) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); struct _sas_port *mpt2sas_port, *next_port; struct _raid_device *raid_device, *next; struct MPT2SAS_TARGET *sas_target_priv_data; struct workqueue_struct *wq; unsigned long flags; + if (!ioc->shost_attached) + return; + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7745,11 +7717,52 @@ _scsih_remove(struct pci_dev *pdev) ioc->sas_hba.num_phys = 0; } - sas_remove_host(shost); + sas_remove_host(ioc->shost); + scsi_remove_host(ioc->shost); + ioc->shost_attached = 0; +} + +/** + * _scsih_shutdown - routine call during system shutdown + * @pdev: PCI device struct + * + * Return nothing. + */ +static void +_scsih_shutdown(struct pci_dev *pdev) +{ + struct Scsi_Host *shost = pci_get_drvdata(pdev); + struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); + + ioc->remove_host = 1; + + mpt2sas_base_stop_watchdog(ioc); + mpt2sas_scsih_detach(ioc); + mpt2sas_base_detach(ioc); +} + +/** + * _scsih_remove - detach and remove add host + * @pdev: PCI device struct + * + * Routine called when unloading the driver. + * Return nothing. + */ +static void +_scsih_remove(struct pci_dev *pdev) +{ + struct Scsi_Host *shost = pci_get_drvdata(pdev); + struct MPT2SAS_ADAPTER *ioc = shost_priv(shost); + + ioc->remove_host = 1; + + mpt2sas_base_stop_watchdog(ioc); + mpt2sas_scsih_detach(ioc); + mpt2sas_base_detach(ioc); list_del(&ioc->list); - scsi_remove_host(shost); - scsi_host_put(shost); + + scsi_host_put(ioc->shost); } /** @@ -8022,6 +8035,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) INIT_LIST_HEAD(&ioc->list); list_add_tail(&ioc->list, &mpt2sas_ioc_list); ioc->shost = shost; + ioc->shost_attached = 1; ioc->id = mpt_ids++; sprintf(ioc->name, "%s%d", MPT2SAS_DRIVER_NAME, ioc->id); ioc->pdev = pdev; @@ -8144,6 +8158,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) list_del(&ioc->list); scsi_remove_host(shost); out_add_shost_fail: + ioc->shost_attached = 0; scsi_host_put(shost); return -ENODEV; } @@ -8164,7 +8179,8 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) pci_power_t device_state; mpt2sas_base_stop_watchdog(ioc); - scsi_block_requests(shost); + if (ioc->shost_attached) + scsi_block_requests(shost); device_state = pci_choose_state(pdev, state); printk(MPT2SAS_INFO_FMT "pdev=0x%p, slot=%s, entering " "operating state [D%d]\n", ioc->name, pdev, @@ -8204,7 +8220,8 @@ _scsih_resume(struct pci_dev *pdev) return r; mpt2sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); - scsi_unblock_requests(shost); + if (ioc->shost_attached) + scsi_unblock_requests(shost); mpt2sas_base_start_watchdog(ioc); return 0; } diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 1836003..f8c25a1 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -56,7 +56,6 @@ #include <linux/dma-mapping.h> #include <linux/io.h> #include <linux/time.h> -#include <linux/kthread.h> #include <linux/aer.h> @@ -110,28 +109,6 @@ module_param_call(mpt3sas_fwfault_debug, _scsih_set_fwfault_debug, param_get_int, &mpt3sas_fwfault_debug, 0644); /** - * mpt3sas_remove_dead_ioc_func - kthread context to remove dead ioc - * @arg: input argument, used to derive ioc - * - * Return 0 if controller is removed from pci subsystem. - * Return -1 for other case. - */ -static int mpt3sas_remove_dead_ioc_func(void *arg) -{ - struct MPT3SAS_ADAPTER *ioc = (struct MPT3SAS_ADAPTER *)arg; - struct pci_dev *pdev; - - if ((ioc == NULL)) - return -1; - - pdev = ioc->pdev; - if ((pdev == NULL)) - return -1; - pci_stop_and_remove_bus_device(pdev); - return 0; -} - -/** * _base_fault_reset_work - workq handling ioc fault conditions * @work: input argument, used to derive ioc * Context: sleep. @@ -146,8 +123,6 @@ _base_fault_reset_work(struct work_struct *work) unsigned long flags; u32 doorbell; int rc; - struct task_struct *p; - spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags); if (ioc->shost_recovery) @@ -167,22 +142,8 @@ _base_fault_reset_work(struct work_struct *work) * command back from HW. */ ioc->schedule_dead_ioc_flush_running_cmds(ioc); - /* - * Set remove_host flag early since kernel thread will - * take some time to execute. - */ - ioc->remove_host = 1; - /*Remove the Dead Host */ - p = kthread_run(mpt3sas_remove_dead_ioc_func, ioc, - "mpt3sas_dead_ioc_%d", ioc->id); - if (IS_ERR(p)) - pr_err(MPT3SAS_FMT - "%s: Running mpt3sas_dead_ioc thread failed !!!!\n", - ioc->name, __func__); - else - pr_err(MPT3SAS_FMT - "%s: Running mpt3sas_dead_ioc thread success !!!!\n", - ioc->name, __func__); + mpt3sas_scsih_detach(ioc); + return; /* don't rearm timer */ } diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h index 994656c..9fbf56c 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.h +++ b/drivers/scsi/mpt3sas/mpt3sas_base.h @@ -731,6 +731,7 @@ struct MPT3SAS_ADAPTER { u8 broadcast_aen_busy; u16 broadcast_aen_pending; u8 shost_recovery; + u8 shost_attached; struct mutex reset_in_progress_mutex; spinlock_t ioc_reset_in_progress_lock; @@ -990,6 +991,7 @@ int mpt3sas_port_enable(struct MPT3SAS_ADAPTER *ioc); u8 mpt3sas_scsih_event_callback(struct MPT3SAS_ADAPTER *ioc, u8 msix_index, u32 reply); void mpt3sas_scsih_reset_handler(struct MPT3SAS_ADAPTER *ioc, int reset_phase); +void mpt3sas_scsih_detach(struct MPT3SAS_ADAPTER *ioc); int mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, uint channel, uint id, uint lun, u8 type, u16 smid_task, diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c index dcbf7c8..9ae0914 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c +++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c @@ -7365,22 +7365,23 @@ _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc) } /** - * _scsih_remove - detach and remove add host - * @pdev: PCI device struct + * mpt3sas_scsih_detach - detach from SCSI midlayer and free resources + * @ioc: per adapter object * - * Routine called when unloading the driver. * Return nothing. */ -static void _scsih_remove(struct pci_dev *pdev) +void +mpt3sas_scsih_detach(struct MPT3SAS_ADAPTER *ioc) { - struct Scsi_Host *shost = pci_get_drvdata(pdev); - struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); struct _sas_port *mpt3sas_port, *next_port; struct _raid_device *raid_device, *next; struct MPT3SAS_TARGET *sas_target_priv_data; struct workqueue_struct *wq; unsigned long flags; + if (!ioc->shost_attached) + return; + ioc->remove_host = 1; _scsih_fw_event_cleanup_queue(ioc); @@ -7429,11 +7430,9 @@ static void _scsih_remove(struct pci_dev *pdev) ioc->sas_hba.num_phys = 0; } - sas_remove_host(shost); - mpt3sas_base_detach(ioc); - list_del(&ioc->list); - scsi_remove_host(shost); - scsi_host_put(shost); + sas_remove_host(ioc->shost); + scsi_remove_host(ioc->shost); + ioc->shost_attached = 0; } /** @@ -7447,23 +7446,37 @@ _scsih_shutdown(struct pci_dev *pdev) { struct Scsi_Host *shost = pci_get_drvdata(pdev); struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); - struct workqueue_struct *wq; - unsigned long flags; ioc->remove_host = 1; - _scsih_fw_event_cleanup_queue(ioc); - - spin_lock_irqsave(&ioc->fw_event_lock, flags); - wq = ioc->firmware_event_thread; - ioc->firmware_event_thread = NULL; - spin_unlock_irqrestore(&ioc->fw_event_lock, flags); - if (wq) - destroy_workqueue(wq); - _scsih_ir_shutdown(ioc); + mpt3sas_base_stop_watchdog(ioc); + mpt3sas_scsih_detach(ioc); mpt3sas_base_detach(ioc); } +/** + * _scsih_remove - detach and remove add host + * @pdev: PCI device struct + * + * Routine called when unloading the driver. + * Return nothing. + */ +static void +_scsih_remove(struct pci_dev *pdev) +{ + struct Scsi_Host *shost = pci_get_drvdata(pdev); + struct MPT3SAS_ADAPTER *ioc = shost_priv(shost); + + ioc->remove_host = 1; + + mpt3sas_base_stop_watchdog(ioc); + mpt3sas_scsih_detach(ioc); + + mpt3sas_base_detach(ioc); + list_del(&ioc->list); + + scsi_host_put(ioc->shost); +} /** * _scsih_probe_boot_devices - reports 1st device @@ -7735,6 +7748,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) INIT_LIST_HEAD(&ioc->list); list_add_tail(&ioc->list, &mpt3sas_ioc_list); ioc->shost = shost; + ioc->shost_attached = 1; ioc->id = mpt_ids++; sprintf(ioc->name, "%s%d", MPT3SAS_DRIVER_NAME, ioc->id); ioc->pdev = pdev; @@ -7839,6 +7853,7 @@ _scsih_probe(struct pci_dev *pdev, const struct pci_device_id *id) list_del(&ioc->list); scsi_remove_host(shost); out_add_shost_fail: + ioc->shost_attached = 0; scsi_host_put(shost); return -ENODEV; } @@ -7860,7 +7875,8 @@ _scsih_suspend(struct pci_dev *pdev, pm_message_t state) mpt3sas_base_stop_watchdog(ioc); flush_scheduled_work(); - scsi_block_requests(shost); + if (ioc->shost_attached) + scsi_block_requests(shost); device_state = pci_choose_state(pdev, state); pr_info(MPT3SAS_FMT "pdev=0x%p, slot=%s, entering operating state [D%d]\n", @@ -7899,7 +7915,8 @@ _scsih_resume(struct pci_dev *pdev) return r; mpt3sas_base_hard_reset_handler(ioc, CAN_SLEEP, SOFT_RESET); - scsi_unblock_requests(shost); + if (ioc->shost_attached) + scsi_unblock_requests(shost); mpt3sas_base_start_watchdog(ioc); return 0; } -- 1.8.1.4 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: mpt2sas,mpt3sas watchdog device removal 2013-07-16 12:00 ` Reddy, Sreekanth @ 2013-07-16 12:03 ` James Bottomley 2013-07-16 15:21 ` Joe Lawrence 0 siblings, 1 reply; 12+ messages in thread From: James Bottomley @ 2013-07-16 12:03 UTC (permalink / raw) To: Reddy, Sreekanth; +Cc: Joe Lawrence, linux-scsi@vger.kernel.org On Tue, 2013-07-16 at 17:30 +0530, Reddy, Sreekanth wrote: > James, > > This patch seem to be fine. Please consider this patch. Where's the new version? The one that has all of this fixed: > Off list, Sreekanth from LSI tested and noticed a few issues with this > patch: > > - mpt2sas_base_stop_watchdog is called twice: The call from > mpt2sas_base_detach is safe, but now unnecessary (as a call was > added earlier up in the PCI driver callbacks to ensure that the > watchdog was out of the way.) This second invocation can be > removed. > > - If the watchdog detects a bad IOC, the watchdog remains running: > The watchdog workqueue isn't cleaned up until > mpt2sas_base_stop_watchdog is called, so in the case that the > watchdog removes the device from SCSI topo, the workqueue will > remain unused until PCI .remove/.shutdown cleans it up. Perhaps a > single watchdog that iterates over all adapters would be simpler? > > Finally, if SCSI topo detachment is all that is interesting here, > would > it make more sense to move the watchdog into the MPT "scsi" code? I > haven't looked at the code yet, but this might make an MPT fusion > patch > easier (due to dependencies between its "scsi" and "base" modules). ? James ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: mpt2sas,mpt3sas watchdog device removal 2013-07-16 12:03 ` James Bottomley @ 2013-07-16 15:21 ` Joe Lawrence 2013-07-17 12:03 ` Reddy, Sreekanth 0 siblings, 1 reply; 12+ messages in thread From: Joe Lawrence @ 2013-07-16 15:21 UTC (permalink / raw) To: James Bottomley; +Cc: Reddy, Sreekanth, linux-scsi@vger.kernel.org On Tue, 16 Jul 2013 16:03:38 +0400 James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > On Tue, 2013-07-16 at 17:30 +0530, Reddy, Sreekanth wrote: > > James, > > > > This patch seem to be fine. Please consider this patch. > > Where's the new version? The one that has all of this fixed: > > > Off list, Sreekanth from LSI tested and noticed a few issues with this > > patch: > > > > - mpt2sas_base_stop_watchdog is called twice: The call from > > mpt2sas_base_detach is safe, but now unnecessary (as a call was > > added earlier up in the PCI driver callbacks to ensure that the > > watchdog was out of the way.) This second invocation can be > > removed. > > > > - If the watchdog detects a bad IOC, the watchdog remains running: > > The watchdog workqueue isn't cleaned up until > > mpt2sas_base_stop_watchdog is called, so in the case that the > > watchdog removes the device from SCSI topo, the workqueue will > > remain unused until PCI .remove/.shutdown cleans it up. Perhaps a > > single watchdog that iterates over all adapters would be simpler? > > > > Finally, if SCSI topo detachment is all that is interesting here, > > would > > it make more sense to move the watchdog into the MPT "scsi" code? I > > haven't looked at the code yet, but this might make an MPT fusion > > patch > > easier (due to dependencies between its "scsi" and "base" modules). This patch fizzled out in May as other work took priority. If LSI is still interested in these changes, I can dust off my notes and test/rebase for the 3.11 series. A few of the issues quoted above are easily fixed, however I remember having an outstanding question of how to best clean up the driver's per device watchdog workqueue: The way the MPT drivers are working right now is that the watchdog workqueue function _base_fault_reset_work() initiates a PCI device removal via kthread. The PCI callback kthread context then tears down the device and cancel/flush/destroys the watchdog workqueue. This patch eliminated the kthread and its call into PCI API, simply detaching from the SCSI midlayer. In my opinion, the kthread complicated device removal and introduced potential races if the watchdog tried removing the device at the same time an ordinary device removal request occurred. At the time, the best solution I had was to leave the unused workqueue around until its PCI device was removed. Regards, -- Joe ^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: mpt2sas,mpt3sas watchdog device removal 2013-07-16 15:21 ` Joe Lawrence @ 2013-07-17 12:03 ` Reddy, Sreekanth 0 siblings, 0 replies; 12+ messages in thread From: Reddy, Sreekanth @ 2013-07-17 12:03 UTC (permalink / raw) To: Joe Lawrence, James Bottomley; +Cc: linux-scsi@vger.kernel.org Hi Joe, >-----Original Message----- >From: Joe Lawrence [mailto:joe.lawrence@stratus.com] >Sent: Tuesday, July 16, 2013 8:52 PM >To: James Bottomley >Cc: Reddy, Sreekanth; linux-scsi@vger.kernel.org >Subject: Re: mpt2sas,mpt3sas watchdog device removal > >On Tue, 16 Jul 2013 16:03:38 +0400 >James Bottomley <James.Bottomley@hansenpartnership.com> wrote: > >> On Tue, 2013-07-16 at 17:30 +0530, Reddy, Sreekanth wrote: >> > James, >> > >> > This patch seem to be fine. Please consider this patch. >> >> Where's the new version? The one that has all of this fixed: >> >> > Off list, Sreekanth from LSI tested and noticed a few issues with >> > this >> > patch: >> > >> > - mpt2sas_base_stop_watchdog is called twice: The call from >> > mpt2sas_base_detach is safe, but now unnecessary (as a call was >> > added earlier up in the PCI driver callbacks to ensure that the >> > watchdog was out of the way.) This second invocation can be >> > removed. >> > >> > - If the watchdog detects a bad IOC, the watchdog remains running: >> > The watchdog workqueue isn't cleaned up until >> > mpt2sas_base_stop_watchdog is called, so in the case that the >> > watchdog removes the device from SCSI topo, the workqueue will >> > remain unused until PCI .remove/.shutdown cleans it up. Perhaps a >> > single watchdog that iterates over all adapters would be simpler? >> > >> > Finally, if SCSI topo detachment is all that is interesting here, >> > would it make more sense to move the watchdog into the MPT "scsi" >> > code? I haven't looked at the code yet, but this might make an MPT >> > fusion patch easier (due to dependencies between its "scsi" and >> > "base" modules). > >This patch fizzled out in May as other work took priority. If LSI is still >interested in these changes, I can dust off my notes and test/rebase for the >3.11 series. It would be grate help if you post the updated patch which will accommodate the fixes for above comments. > >A few of the issues quoted above are easily fixed, however I remember >having an outstanding question of how to best clean up the driver's per device >watchdog workqueue: > >The way the MPT drivers are working right now is that the watchdog >workqueue function _base_fault_reset_work() initiates a PCI device removal >via kthread. The PCI callback kthread context then tears down the device and >cancel/flush/destroys the watchdog workqueue. > >This patch eliminated the kthread and its call into PCI API, simply detaching >from the SCSI midlayer. In my opinion, the kthread complicated device >removal and introduced potential races if the watchdog tried removing the >device at the same time an ordinary device removal request occurred. > >At the time, the best solution I had was to leave the unused workqueue >around until its PCI device was removed. > >Regards, > >-- Joe Regards, Sreekanth ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2013-10-21 14:24 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2013-05-15 17:24 mpt2sas,mpt3sas watchdog device removal Joe Lawrence 2013-05-15 17:26 ` [PATCH] mpt2sas,mpt3sas: make watchdog instantiated device removal safe Joe Lawrence 2013-05-17 15:29 ` Bjorn Helgaas 2013-05-15 17:29 ` mpt2sas,mpt3sas watchdog device removal Joe Lawrence 2013-05-17 15:29 ` Bjorn Helgaas 2013-05-17 21:42 ` Joe Lawrence 2013-10-10 21:59 ` Bjorn Helgaas 2013-10-21 14:24 ` Joe Lawrence 2013-07-16 12:00 ` Reddy, Sreekanth 2013-07-16 12:03 ` James Bottomley 2013-07-16 15:21 ` Joe Lawrence 2013-07-17 12:03 ` Reddy, Sreekanth
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox