From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from e19.ny.us.ibm.com (e19.ny.us.ibm.com [129.33.205.209]) (using TLSv1 with cipher CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id E277D1A043C for ; Fri, 25 Sep 2015 05:42:37 +1000 (AEST) Received: from localhost by e19.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 24 Sep 2015 15:42:35 -0400 Received: from b01cxnp22035.gho.pok.ibm.com (b01cxnp22035.gho.pok.ibm.com [9.57.198.25]) by d01dlp03.pok.ibm.com (Postfix) with ESMTP id 73912C90042 for ; Thu, 24 Sep 2015 15:33:36 -0400 (EDT) Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by b01cxnp22035.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id t8OJgXPu56557784 for ; Thu, 24 Sep 2015 19:42:33 GMT Received: from d01av03.pok.ibm.com (localhost [127.0.0.1]) by d01av03.pok.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id t8OJgV6L031970 for ; Thu, 24 Sep 2015 15:42:32 -0400 From: "Matthew R. Ochs" To: linux-scsi@vger.kernel.org, James Bottomley , "Nicholas A. Bellinger" , Brian King , Ian Munsie , Daniel Axtens , Andrew Donnellan , Tomas Henzl , David Laight Cc: Michael Neuling , linuxppc-dev@lists.ozlabs.org, "Manoj N. Kumar" Subject: [PATCH v3 25/32] cxlflash: Fix to prevent EEH recovery failure Date: Thu, 24 Sep 2015 14:41:53 -0500 Message-Id: <1443123713-17622-1-git-send-email-mrochs@linux.vnet.ibm.com> In-Reply-To: <1443123193-16498-1-git-send-email-mrochs@linux.vnet.ibm.com> References: <1443123193-16498-1-git-send-email-mrochs@linux.vnet.ibm.com> List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , The process_sense() routine can perform a read capacity which can take some time to complete. If an EEH occurs while waiting on the read capacity, the EEH handler is unable to obtain the context's mutex in order to put the context in an error state. The EEH handler will sit and wait until the context is free, but this wait can last longer than the EEH handler tolerates, leading to a failed recovery. To address this issue, make the context unavailable to new, non-system owned threads and release the context while calling into process_sense(). After returning from process_sense() the context mutex is reacquired and the context is made available again. The context can be safely moved to the error state if needed during the unavailable window as no other threads will hold its reference. Signed-off-by: Matthew R. Ochs Signed-off-by: Manoj N. Kumar Reviewed-by: Brian King --- drivers/scsi/cxlflash/superpipe.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/scsi/cxlflash/superpipe.c b/drivers/scsi/cxlflash/superpipe.c index 329b27e..e2b28ea 100644 --- a/drivers/scsi/cxlflash/superpipe.c +++ b/drivers/scsi/cxlflash/superpipe.c @@ -1789,12 +1789,21 @@ static int cxlflash_disk_verify(struct scsi_device *sdev, * inquiry (i.e. the Unit attention is due to the WWN changing). */ if (verify->hint & DK_CXLFLASH_VERIFY_HINT_SENSE) { + /* Can't hold mutex across process_sense/read_cap16, + * since we could have an intervening EEH event. + */ + ctxi->unavail = true; + mutex_unlock(&ctxi->mutex); rc = process_sense(sdev, verify); if (unlikely(rc)) { dev_err(dev, "%s: Failed to validate sense data (%d)\n", __func__, rc); + mutex_lock(&ctxi->mutex); + ctxi->unavail = false; goto out; } + mutex_lock(&ctxi->mutex); + ctxi->unavail = false; } switch (gli->mode) { -- 2.1.0