From: Greg Kroah-Hartman
To: stable@vger.kernel.org
Cc:
 Greg Kroah-Hartman, patches@lists.linux.dev, Benjamin Coddington,
 Christoph Hellwig, Chuck Lever, Trond Myklebust, Sasha Levin
Subject: [PATCH 6.12 819/826] nfs/blocklayout: Limit repeat device registration on failure
Date: Tue, 3 Dec 2024 15:49:06 +0100
Message-ID: <20241203144815.700388400@linuxfoundation.org>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20241203144743.428732212@linuxfoundation.org>
References: <20241203144743.428732212@linuxfoundation.org>
User-Agent: quilt/0.67
X-stable: review
X-Patchwork-Hint: ignore
Precedence: bulk
X-Mailing-List: stable@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

6.12-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Benjamin Coddington

[ Upstream commit 614733f9441ed53bb442d4734112ec1e24bd6da7 ]

Every pNFS SCSI IO wants to do LAYOUTGET, then within the layout find the
device which can drive GETDEVINFO, then finally may need to prep the device
with a reservation.  This slow work makes a mess of IO latencies if one of
the later steps is going to fail for a while.

If we're unable to register a SCSI device, ensure we mark the device as
unavailable so that it will time out and be re-added via GETDEVINFO.  This
avoids repeated doomed attempts to register a device in the IO path.

Add some clarifying comments as well.

Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration")
Signed-off-by: Benjamin Coddington
Reviewed-by: Christoph Hellwig
Reviewed-by: Chuck Lever
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
---
 fs/nfs/blocklayout/blocklayout.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index 0becdec129704..47189476b5538 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -571,19 +571,32 @@ bl_find_get_deviceid(struct nfs_server *server,
 	if (!node)
 		return ERR_PTR(-ENODEV);
 
+	/*
+	 * Devices that are marked unavailable are left in the cache with a
+	 * timeout to avoid sending GETDEVINFO after every LAYOUTGET, or
+	 * constantly attempting to register the device.  Once marked as
+	 * unavailable they must be deleted and never reused.
+	 */
 	if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) {
 		unsigned long end = jiffies;
 		unsigned long start = end - PNFS_DEVICE_RETRY_TIMEOUT;
 
 		if (!time_in_range(node->timestamp_unavailable, start, end)) {
+			/* Uncork subsequent GETDEVINFO operations for this device */
 			nfs4_delete_deviceid(node->ld, node->nfs_client, id);
 			goto retry;
 		}
 		goto out_put;
 	}
 
-	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node)))
+	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) {
+		/*
+		 * If we cannot register, treat this device as transient:
+		 * Make a negative cache entry for the device
+		 */
+		nfs4_mark_deviceid_unavailable(node);
 		goto out_put;
+	}
 
 	return node;
 
-- 
2.43.0
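
For readers who want to see the negative-cache behaviour described in the
commit message in isolation, here is a minimal userspace sketch.  It is not
the kernel code: struct dev_entry, dev_lookup(), RETRY_TIMEOUT, the result
enum and the always_fail()/always_ok() helpers are all hypothetical names
made up for illustration, the tick counter stands in for jiffies, and the
timeout value is arbitrary.  The sketch only assumes what the patch itself
shows: a failed registration marks the entry unavailable, lookups inside the
retry window fail without attempting registration again, and an entry older
than the window is dropped so that it can be fetched afresh.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PNFS_DEVICE_RETRY_TIMEOUT; the value is arbitrary. */
#define RETRY_TIMEOUT 120UL

/* Hypothetical model of a cached device entry (not a kernel structure). */
struct dev_entry {
	bool unavailable;               /* models NFS_DEVICEID_UNAVAILABLE */
	unsigned long marked_at;        /* models timestamp_unavailable */
	bool (*try_register)(void);     /* models bl_register_dev() */
	unsigned int register_attempts; /* counts calls, for demonstration only */
};

enum lookup_result {
	DEV_OK,      /* device registered and usable */
	DEV_FAILED,  /* fail fast: device is (now) negatively cached */
	DEV_EXPIRED, /* negative entry too old: caller deletes and refetches */
};

static enum lookup_result dev_lookup(struct dev_entry *dev, unsigned long now)
{
	assert(dev != NULL && dev->try_register != NULL);

	if (dev->unavailable) {
		/* Unsigned subtraction stays correct if the counter wraps. */
		if (now - dev->marked_at > RETRY_TIMEOUT)
			return DEV_EXPIRED;
		/* Still inside the retry window: do not try to register. */
		return DEV_FAILED;
	}

	dev->register_attempts++;
	if (!dev->try_register()) {
		/* Registration failed: create the negative cache entry. */
		dev->unavailable = true;
		dev->marked_at = now;
		return DEV_FAILED;
	}
	return DEV_OK;
}

/* Hypothetical registration callbacks used to drive the sketch. */
static bool always_fail(void) { return false; }
static bool always_ok(void) { return true; }
```

Calling dev_lookup() repeatedly on an entry whose callback is always_fail()
leaves register_attempts at 1 until the window has passed, which is the
"no repeated doomed attempts in the IO path" property the commit message
asks for.  On DEV_EXPIRED the caller is expected to discard the entry and
build a new one, mirroring the delete-and-retry step in the patch.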