public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] two fixes for pNFS SCSI device handling
@ 2024-11-22 15:11 Benjamin Coddington
  2024-11-22 15:11 ` [PATCH v4 1/2] nfs/blocklayout: Don't attempt unregister for invalid block device Benjamin Coddington
  2024-11-22 15:11 ` [PATCH v4 2/2] nfs/blocklayout: Limit repeat device registration on failure Benjamin Coddington
  0 siblings, 2 replies; 3+ messages in thread
From: Benjamin Coddington @ 2024-11-22 15:11 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker, Chuck Lever
  Cc: linux-nfs, linux-kernel, Christoph Hellwig

A bit late for v6.13 perhaps, but here are two fresh corrections for pNFS
SCSI device handling, and some comments as requested by Christoph.

On v2: add full commit subject in 1/2, change the caller in 2/2.
On v3: add r-b for Chuck, tweak comments in 2/2.
On v4: rebase on linux-next

Benjamin Coddington (2):
  nfs/blocklayout: Don't attempt unregister for invalid block device
  nfs/blocklayout: Limit repeat device registration on failure

 fs/nfs/blocklayout/blocklayout.c | 15 ++++++++++++++-
 fs/nfs/blocklayout/dev.c         |  6 ++----
 2 files changed, 16 insertions(+), 5 deletions(-)


base-commit: cfba9f07a1d6aeca38f47f1f472cfb0ba133d341
-- 
2.47.0


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [PATCH v4 1/2] nfs/blocklayout: Don't attempt unregister for invalid block device
  2024-11-22 15:11 [PATCH v4 0/2] two fixes for pNFS SCSI device handling Benjamin Coddington
@ 2024-11-22 15:11 ` Benjamin Coddington
  2024-11-22 15:11 ` [PATCH v4 2/2] nfs/blocklayout: Limit repeat device registration on failure Benjamin Coddington
  1 sibling, 0 replies; 3+ messages in thread
From: Benjamin Coddington @ 2024-11-22 15:11 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker, Chuck Lever
  Cc: linux-nfs, linux-kernel, Christoph Hellwig

Since commit d869da91cccb ("nfs/blocklayout: Fix premature PR key
unregistration") an unmount of a pNFS SCSI layout-enabled NFS may
dereference a NULL block_device in:

  bl_unregister_scsi+0x16/0xe0 [blocklayoutdriver]
  bl_free_device+0x70/0x80 [blocklayoutdriver]
  bl_free_deviceid_node+0x12/0x30 [blocklayoutdriver]
  nfs4_put_deviceid_node+0x60/0xc0 [nfsv4]
  nfs4_deviceid_purge_client+0x132/0x190 [nfsv4]
  unset_pnfs_layoutdriver+0x59/0x60 [nfsv4]
  nfs4_destroy_server+0x36/0x70 [nfsv4]
  nfs_free_server+0x23/0xe0 [nfs]
  deactivate_locked_super+0x30/0xb0
  cleanup_mnt+0xba/0x150
  task_work_run+0x59/0x90
  syscall_exit_to_user_mode+0x217/0x220
  do_syscall_64+0x8e/0x160

This happens because even though we were able to create the
nfs4_deviceid_node, the lookup for the device was unable to attach the
block device to the pnfs_block_dev.

If we never found a block device to register, we can avoid this case with
the PNFS_BDEV_REGISTERED flag.  Move the deref behind the test for the
flag.

Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/blocklayout/dev.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/blocklayout/dev.c b/fs/nfs/blocklayout/dev.c
index 6252f4447945..cab8809f0e0f 100644
--- a/fs/nfs/blocklayout/dev.c
+++ b/fs/nfs/blocklayout/dev.c
@@ -20,9 +20,6 @@ static void bl_unregister_scsi(struct pnfs_block_dev *dev)
 	const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
 	int status;
 
-	if (!test_and_clear_bit(PNFS_BDEV_REGISTERED, &dev->flags))
-		return;
-
 	status = ops->pr_register(bdev, dev->pr_key, 0, false);
 	if (status)
 		trace_bl_pr_key_unreg_err(bdev, dev->pr_key, status);
@@ -58,7 +55,8 @@ static void bl_unregister_dev(struct pnfs_block_dev *dev)
 		return;
 	}
 
-	if (dev->type == PNFS_BLOCK_VOLUME_SCSI)
+	if (dev->type == PNFS_BLOCK_VOLUME_SCSI &&
+		test_and_clear_bit(PNFS_BDEV_REGISTERED, &dev->flags))
 		bl_unregister_scsi(dev);
 }
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [PATCH v4 2/2] nfs/blocklayout: Limit repeat device registration on failure
  2024-11-22 15:11 [PATCH v4 0/2] two fixes for pNFS SCSI device handling Benjamin Coddington
  2024-11-22 15:11 ` [PATCH v4 1/2] nfs/blocklayout: Don't attempt unregister for invalid block device Benjamin Coddington
@ 2024-11-22 15:11 ` Benjamin Coddington
  1 sibling, 0 replies; 3+ messages in thread
From: Benjamin Coddington @ 2024-11-22 15:11 UTC (permalink / raw)
  To: Trond Myklebust, Anna Schumaker, Chuck Lever
  Cc: linux-nfs, linux-kernel, Christoph Hellwig

Every pNFS SCSI IO wants to do LAYOUTGET, then within the layout find the
device which can drive GETDEVINFO, then finally may need to prep the device
with a reservation.  This slow work makes a mess of IO latencies if one of
the later steps is going to fail for awhile.

If we're unable to register a SCSI device, ensure we mark the device as
unavailable so that it will timeout and be re-added via GETDEVINFO.  This
avoids repeated doomed attempts to register a device in the IO path.

Add some clarifying comments as well.

Fixes: d869da91cccb ("nfs/blocklayout: Fix premature PR key unregistration")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/blocklayout/blocklayout.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/blocklayout/blocklayout.c b/fs/nfs/blocklayout/blocklayout.c
index 0becdec12970..47189476b553 100644
--- a/fs/nfs/blocklayout/blocklayout.c
+++ b/fs/nfs/blocklayout/blocklayout.c
@@ -571,19 +571,32 @@ bl_find_get_deviceid(struct nfs_server *server,
 	if (!node)
 		return ERR_PTR(-ENODEV);
 
+	/*
+	 * Devices that are marked unavailable are left in the cache with a
+	 * timeout to avoid sending GETDEVINFO after every LAYOUTGET, or
+	 * constantly attempting to register the device.  Once marked as
+	 * unavailable they must be deleted and never reused.
+	 */
 	if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) {
 		unsigned long end = jiffies;
 		unsigned long start = end - PNFS_DEVICE_RETRY_TIMEOUT;
 
 		if (!time_in_range(node->timestamp_unavailable, start, end)) {
+			/* Uncork subsequent GETDEVINFO operations for this device */
 			nfs4_delete_deviceid(node->ld, node->nfs_client, id);
 			goto retry;
 		}
 		goto out_put;
 	}
 
-	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node)))
+	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) {
+		/*
+		 * If we cannot register, treat this device as transient:
+		 * Make a negative cache entry for the device
+		 */
+		nfs4_mark_deviceid_unavailable(node);
 		goto out_put;
+	}
 
 	return node;
 
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-11-22 15:11 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-22 15:11 [PATCH v4 0/2] two fixes for pNFS SCSI device handling Benjamin Coddington
2024-11-22 15:11 ` [PATCH v4 1/2] nfs/blocklayout: Don't attempt unregister for invalid block device Benjamin Coddington
2024-11-22 15:11 ` [PATCH v4 2/2] nfs/blocklayout: Limit repeat device registration on failure Benjamin Coddington

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox