* [PATCH v2] NFSD: Force all NFSv4.2 COPY requests to be synchronous
From: cel @ 2024-05-07 13:37 UTC (permalink / raw)
  To: linux-nfs; +Cc: Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

We've discovered that delivering a CB_OFFLOAD operation can be
unreliable in some pretty unremarkable situations. Examples
include:

 - The server dropped the connection because it lost a forechannel
   NFSv4 request and wishes to force the client to retransmit
 - The GSS sequence number window underflowed
 - A network partition occurred

When that happens, all pending callback operations, including
CB_OFFLOAD, are lost. NFSD does not retransmit them.

Moreover, the Linux NFS client does not yet support sending an
OFFLOAD_STATUS operation to probe whether an asynchronous COPY
operation has finished. Thus, on Linux NFS clients, when a
CB_OFFLOAD is lost, asynchronous COPY can hang until manually
interrupted.

I've tried a couple of remedies, but so far the side effects are
worse than the disease and they have had to be reverted. So
temporarily force COPY operations to be synchronous so that the use
of CB_OFFLOAD is avoided entirely. This is a fix that can easily be
backported to LTS kernels. I am working on client patches that
introduce an implementation of OFFLOAD_STATUS.

Note that NFSD arbitrarily limits the size of a copy_file_range()
call to 4MB to avoid indefinitely blocking an nfsd thread. A short
COPY result is returned in that case, and the client can present
a fresh COPY request for the remainder.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4proc.c | 7 +++++++
 1 file changed, 7 insertions(+)

Changes since v1:
- Clarify that this patch is for backporting, and a longer-term
  fix is in the works for subsequent upstream kernels
- Note that synchronous COPY operations don't indefinitely block
  nfsd threads
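
For illustration only, and not part of this patch: because a synchronous
COPY can return a short result, a caller has to be prepared to reissue
the request for the remainder. The user-space sketch below shows that
retry loop around the copy_file_range(2) system call. The helper name
copy_all() is hypothetical, and the sketch assumes a glibc that provides
the copy_file_range() wrapper and that both descriptors are positioned
at the offsets where copying should start.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy up to len bytes from fd_in to fd_out,
 * starting at each descriptor's current file offset.
 *
 * Returns the number of bytes copied (less than len only if the
 * source hit end-of-file), or -1 with errno set on failure.
 */
static ssize_t copy_all(int fd_in, int fd_out, size_t len)
{
	size_t done = 0;

	while (done < len) {
		/* A short return is not an error; ask again for the rest. */
		ssize_t n = copy_file_range(fd_in, NULL, fd_out, NULL,
					    len - done, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted; retry */
			return -1;
		}
		if (n == 0)
			break;			/* end of source file */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

With a loop of this shape, a server-side cap on each COPY shows up to
the application only as additional round trips, not as a failure.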

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index ea3cc3e870a7..46bd20fe5c0f 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1807,6 +1807,13 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	__be32 status;
 	struct nfsd4_copy *async_copy = NULL;
 
+	/*
+	 * Currently, async COPY is not reliable. Force all COPY
+	 * requests to be synchronous to avoid client application
+	 * hangs waiting for COPY completion.
+	 */
+	nfsd4_copy_set_sync(copy, true);
+
 	copy->cp_clp = cstate->clp;
 	if (nfsd4_ssc_is_inter(copy)) {
 		trace_nfsd_copy_inter(copy);

base-commit: 939cb14d51a150e3c12ef7a8ce0ba04ce6131bd2
-- 
2.44.0

