From mboxrd@z Thu Jan 1 00:00:00 1970 From: Heiko Carstens Subject: Re: [patch 1/2] zfcp: Fix race during ERP thread shutdown Date: Tue, 11 Mar 2008 17:43:41 +0100 Message-ID: <20080311164341.GA6380@osiris.boeblingen.de.ibm.com> References: <20080310151852.808020000@de.ibm.com> <20080310152328.463410000@de.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Return-path: Received: from mtagate2.uk.ibm.com ([195.212.29.135]:21056 "EHLO mtagate2.uk.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751576AbYCKQno (ORCPT ); Tue, 11 Mar 2008 12:43:44 -0400 Content-Disposition: inline In-Reply-To: <20080310152328.463410000@de.ibm.com> Sender: linux-scsi-owner@vger.kernel.org List-Id: linux-scsi@vger.kernel.org To: Christof Schmitt Cc: James Bottomley , linux-scsi@vger.kernel.org, linux-s390@vger.kernel.org, Martin Peschke > --- a/drivers/s390/scsi/zfcp_erp.c 2008-03-10 12:34:48.000000000 +0100 > +++ b/drivers/s390/scsi/zfcp_erp.c 2008-03-10 16:05:26.000000000 +0100 > @@ -1002,7 +1002,7 @@ zfcp_erp_thread_setup(struct zfcp_adapte > &adapter->status)); > debug_text_event(adapter->erp_dbf, 5, "a_thset_ok"); > } > - > + atomic_set_mask(ZFCP_STATUS_ADAPTER_ERP_ACCEPT, &adapter->status); > return (retval < 0); This way the flag will be set even if the creation of the erp thread failed. It should be done somewhere within zfcp_erp_thread() instead. Besides that setting the flag must be done while holding the erp_lock. Otherwise zfcp_erp_action_enqueue might check the flag, other cpu clears the flag right after that, and then enqueueing continues while the flag is not set. > } > > @@ -1025,6 +1025,8 @@ zfcp_erp_thread_kill(struct zfcp_adapter > { > int retval = 0; > > + atomic_clear_mask(ZFCP_STATUS_ADAPTER_ERP_ACCEPT, &adapter->status); > + zfcp_erp_wait(adapter); > atomic_set_mask(ZFCP_STATUS_ADAPTER_ERP_THREAD_KILL, &adapter->status); > up(&adapter->erp_ready_sem); > > @@ -2940,8 +2942,7 @@ zfcp_erp_action_enqueue(int action, > * efficient. > */ > > - if (!atomic_test_mask(ZFCP_STATUS_ADAPTER_ERP_THREAD_UP, > - &adapter->status)) > + if (!atomic_test_mask(ZFCP_STATUS_ADAPTER_ERP_ACCEPT, &adapter->status)) > return -EIO; > > debug_event(adapter->erp_dbf, 4, &action, sizeof (int)); > --- a/drivers/s390/scsi/zfcp_ccw.c 2008-03-10 12:34:48.000000000 +0100 > +++ b/drivers/s390/scsi/zfcp_ccw.c 2008-03-10 16:05:26.000000000 +0100 > @@ -198,7 +198,6 @@ zfcp_ccw_set_offline(struct ccw_device * > down(&zfcp_data.config_sema); > adapter = dev_get_drvdata(&ccw_device->dev); > zfcp_erp_adapter_shutdown(adapter, 0); > - zfcp_erp_wait(adapter); > zfcp_erp_thread_kill(adapter); And as a sidenote, if the scenario this patch is supposed to fix can really happen, how can you make sure the adapter is still down before the erp thread gets killed? Some action could have been enqueued and finished which reopened the adapter after zfcp_erp_adapter_shutdown and before zfcp_erp_thread_kill. Just wondering... :)