public inbox for linux-nfs@vger.kernel.org
* [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation
@ 2024-12-03 16:29 cel
  2024-12-03 16:29 ` [PATCH v1 1/7] NFS: CB_OFFLOAD can return NFS4ERR_DELAY cel
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

SCSI implementation experience has shown that an interrupt-only
COPY offload implementation is not reliable. There are too many
common scenarios where the client can miss the completion interrupt
(in our case, this is a CB_OFFLOAD callback).

Therefore, a polling mechanism is needed. The NFSv4.2 protocol
provides one in the form of the new OFFLOAD_STATUS operation. Linux
NFSD implements OFFLOAD_STATUS already. This series adds a Linux NFS
client implementation of the OFFLOAD_STATUS operation that can query
the state of a background COPY on the server.
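
The wait-then-poll pattern described above can be illustrated with a
small user-space C sketch. It is not code from this series:
wait_for_callback(), poll_status(), and struct fake_server are
hypothetical stand-ins for waiting on CB_OFFLOAD with a timeout and
for sending OFFLOAD_STATUS, and the simulated server always loses the
completion callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated server: the copy completes after a fixed
 * number of polls, and the completion callback never arrives. */
struct fake_server {
	int polls_until_done;
	uint64_t bytes_per_poll;
	uint64_t copied;
};

/* Stand-in for a timed wait on CB_OFFLOAD. Returns true if the
 * callback arrived; in this simulation it is always lost. */
static bool wait_for_callback(struct fake_server *srv)
{
	(void)srv;
	return false;
}

/* Stand-in for OFFLOAD_STATUS. Reports the bytes copied so far and
 * returns true once the simulated copy has completed. */
static bool poll_status(struct fake_server *srv, uint64_t *copied)
{
	assert(srv && copied);
	srv->copied += srv->bytes_per_poll;
	*copied = srv->copied;
	if (srv->polls_until_done > 0)
		srv->polls_until_done--;
	return srv->polls_until_done == 0;
}

/* Wait for the callback; whenever the wait times out, poll instead. */
uint64_t wait_for_copy(struct fake_server *srv, int *polls)
{
	uint64_t copied = 0;

	*polls = 0;
	while (!wait_for_callback(srv)) {
		(*polls)++;
		if (poll_status(srv, &copied))
			break;
	}
	return copied;
}
```

With polls_until_done set to 3 and bytes_per_poll set to 4096, the
loop ends after three polls even though no callback was ever seen.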

These patches are also available here:

https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/log/?h=fix-async-copy

Chuck Lever (7):
  NFS: CB_OFFLOAD can return NFS4ERR_DELAY
  NFS: Fix typo in OFFLOAD_CANCEL comment
  NFS: Rename struct nfs4_offloadcancel_data
  NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR
  NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
  NFS: Use NFSv4.2's OFFLOAD_STATUS operation
  NFS: Refactor trace_nfs4_offload_cancel

 fs/nfs/callback_proc.c    |   2 +-
 fs/nfs/nfs42proc.c        | 205 ++++++++++++++++++++++++++++++++++----
 fs/nfs/nfs42xdr.c         |  88 +++++++++++++++-
 fs/nfs/nfs4proc.c         |   3 +-
 fs/nfs/nfs4trace.h        |  11 +-
 fs/nfs/nfs4xdr.c          |   1 +
 include/linux/nfs4.h      |   1 +
 include/linux/nfs_fs_sb.h |   1 +
 include/linux/nfs_xdr.h   |   5 +-
 9 files changed, 292 insertions(+), 25 deletions(-)

-- 
2.47.0



* [PATCH v1 1/7] NFS: CB_OFFLOAD can return NFS4ERR_DELAY
  2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
@ 2024-12-03 16:29 ` cel
  2024-12-03 16:29 ` [PATCH v1 2/7] NFS: Fix typo in OFFLOAD_CANCEL comment cel
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

RFC 7862 permits the callback service to respond to a CB_OFFLOAD
operation with NFS4ERR_DELAY. Use that instead of
NFS4ERR_SERVERFAULT for temporary memory allocation failure, as that
is more consistent with how other operations report memory
allocation failure.
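
As a rough user-space illustration of the difference (not the kernel
code itself), the sketch below returns a wire-format status for a
callback whose state allocation may have failed. The function names
are hypothetical. The numeric status values are the ones assigned by
the NFSv4 protocol, and htonl() stands in for the kernel's
cpu_to_be32().

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* Status codes as assigned by the NFSv4 protocol. */
enum {
	NFS4_OK			= 0,
	NFS4ERR_SERVERFAULT	= 10006,
	NFS4ERR_DELAY		= 10008,
};

/* Hypothetical helper: choose the status to put on the wire, given
 * the result of the callback's state allocation. */
uint32_t callback_status_for_alloc(const void *allocation)
{
	if (!allocation)
		return htonl(NFS4ERR_DELAY);	/* transient: peer may retry */
	return htonl(NFS4_OK);
}

/* Hypothetical helper: would the sender be justified in retrying? */
bool status_is_retryable(uint32_t wire_status)
{
	return ntohl(wire_status) == NFS4ERR_DELAY;
}
```

NFS4ERR_DELAY tells the sender that the condition is temporary, while
NFS4ERR_SERVERFAULT gives it no reason to try again.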

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/callback_proc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 7832fb0369a1..8397c43358bd 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -718,7 +718,7 @@ __be32 nfs4_callback_offload(void *data, void *dummy,
 
 	copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL);
 	if (!copy)
-		return htonl(NFS4ERR_SERVERFAULT);
+		return cpu_to_be32(NFS4ERR_DELAY);
 
 	spin_lock(&cps->clp->cl_lock);
 	rcu_read_lock();
-- 
2.47.0



* [PATCH v1 2/7] NFS: Fix typo in OFFLOAD_CANCEL comment
  2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
  2024-12-03 16:29 ` [PATCH v1 1/7] NFS: CB_OFFLOAD can return NFS4ERR_DELAY cel
@ 2024-12-03 16:29 ` cel
  2024-12-03 16:29 ` [PATCH v1 3/7] NFS: Rename struct nfs4_offloadcancel_data cel
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs42xdr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 9e3ae53e2205..ef5730c5e704 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -549,7 +549,7 @@ static void nfs4_xdr_enc_copy(struct rpc_rqst *req,
 }
 
 /*
- * Encode OFFLOAD_CANEL request
+ * Encode OFFLOAD_CANCEL request
  */
 static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
 					struct xdr_stream *xdr,
-- 
2.47.0



* [PATCH v1 3/7] NFS: Rename struct nfs4_offloadcancel_data
  2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
  2024-12-03 16:29 ` [PATCH v1 1/7] NFS: CB_OFFLOAD can return NFS4ERR_DELAY cel
  2024-12-03 16:29 ` [PATCH v1 2/7] NFS: Fix typo in OFFLOAD_CANCEL comment cel
@ 2024-12-03 16:29 ` cel
  2024-12-03 16:29 ` [PATCH v1 4/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR cel
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Refactor: This struct can be used unchanged for the new
OFFLOAD_STATUS implementation, so give it a more generic name.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs42proc.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 531c9c20ef1d..9d716907cf30 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -498,15 +498,15 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
 	return err;
 }
 
-struct nfs42_offloadcancel_data {
+struct nfs42_offload_data {
 	struct nfs_server *seq_server;
 	struct nfs42_offload_status_args args;
 	struct nfs42_offload_status_res res;
 };
 
-static void nfs42_offload_cancel_prepare(struct rpc_task *task, void *calldata)
+static void nfs42_offload_prepare(struct rpc_task *task, void *calldata)
 {
-	struct nfs42_offloadcancel_data *data = calldata;
+	struct nfs42_offload_data *data = calldata;
 
 	nfs4_setup_sequence(data->seq_server->nfs_client,
 				&data->args.osa_seq_args,
@@ -515,7 +515,7 @@ static void nfs42_offload_cancel_prepare(struct rpc_task *task, void *calldata)
 
 static void nfs42_offload_cancel_done(struct rpc_task *task, void *calldata)
 {
-	struct nfs42_offloadcancel_data *data = calldata;
+	struct nfs42_offload_data *data = calldata;
 
 	trace_nfs4_offload_cancel(&data->args, task->tk_status);
 	nfs41_sequence_done(task, &data->res.osr_seq_res);
@@ -525,22 +525,22 @@ static void nfs42_offload_cancel_done(struct rpc_task *task, void *calldata)
 		rpc_restart_call_prepare(task);
 }
 
-static void nfs42_free_offloadcancel_data(void *data)
+static void nfs42_offload_release(void *data)
 {
 	kfree(data);
 }
 
 static const struct rpc_call_ops nfs42_offload_cancel_ops = {
-	.rpc_call_prepare = nfs42_offload_cancel_prepare,
+	.rpc_call_prepare = nfs42_offload_prepare,
 	.rpc_call_done = nfs42_offload_cancel_done,
-	.rpc_release = nfs42_free_offloadcancel_data,
+	.rpc_release = nfs42_offload_release,
 };
 
 static int nfs42_do_offload_cancel_async(struct file *dst,
 					 nfs4_stateid *stateid)
 {
 	struct nfs_server *dst_server = NFS_SERVER(file_inode(dst));
-	struct nfs42_offloadcancel_data *data = NULL;
+	struct nfs42_offload_data *data = NULL;
 	struct nfs_open_context *ctx = nfs_file_open_context(dst);
 	struct rpc_task *task;
 	struct rpc_message msg = {
@@ -559,7 +559,7 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
 	if (!(dst_server->caps & NFS_CAP_OFFLOAD_CANCEL))
 		return -EOPNOTSUPP;
 
-	data = kzalloc(sizeof(struct nfs42_offloadcancel_data), GFP_KERNEL);
+	data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
 	if (data == NULL)
 		return -ENOMEM;
 
-- 
2.47.0



* [PATCH v1 4/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR
  2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
                   ` (2 preceding siblings ...)
  2024-12-03 16:29 ` [PATCH v1 3/7] NFS: Rename struct nfs4_offloadcancel_data cel
@ 2024-12-03 16:29 ` cel
  2024-12-03 16:29 ` [PATCH v1 5/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation cel
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Add XDR encoding and decoding functions for the NFSv4.2
OFFLOAD_STATUS operation.
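
For orientation, the OFFLOAD_STATUS result body is a 64-bit byte
count followed by a variable-length array holding at most one 32-bit
completion status. The user-space sketch below decodes just that body
from a raw big-endian buffer. It is an illustration only: it does not
use the kernel's xdr_stream helpers, it skips the operation header,
and the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct offload_status_result {
	uint64_t count;		/* bytes copied so far */
	int complete_count;	/* 0: still running, 1: status present */
	uint32_t complete;	/* valid only if complete_count == 1 */
};

/* Read one big-endian 32-bit XDR word. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the buffer is short or malformed. */
int decode_offload_status_body(const uint8_t *buf, size_t len,
			       struct offload_status_result *res)
{
	uint32_t n;

	/* 8 bytes of count plus 4 bytes of array length */
	if (len < 12)
		return -1;
	res->count = ((uint64_t)get_be32(buf) << 32) | get_be32(buf + 4);

	/* The status array may hold zero or one element. */
	n = get_be32(buf + 8);
	if (n > 1)
		return -1;
	res->complete = 0;
	if (n == 1) {
		if (len < 16)
			return -1;
		res->complete = get_be32(buf + 12);
	}
	res->complete_count = (int)n;
	return 0;
}
```

An empty status array means the copy is still in progress, which is
why the decoder has to record the element count separately from the
status value.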

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs42xdr.c       | 86 +++++++++++++++++++++++++++++++++++++++++
 fs/nfs/nfs4xdr.c        |  1 +
 include/linux/nfs4.h    |  1 +
 include/linux/nfs_xdr.h |  5 ++-
 4 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index ef5730c5e704..a928b7f90e59 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -35,6 +35,11 @@
 #define encode_offload_cancel_maxsz	(op_encode_hdr_maxsz + \
 					 XDR_QUADLEN(NFS4_STATEID_SIZE))
 #define decode_offload_cancel_maxsz	(op_decode_hdr_maxsz)
+#define encode_offload_status_maxsz	(op_encode_hdr_maxsz + \
+					 XDR_QUADLEN(NFS4_STATEID_SIZE))
+#define decode_offload_status_maxsz	(op_decode_hdr_maxsz + \
+					 2 /* osr_count */ + \
+					 2 /* osr_complete */)
 #define encode_copy_notify_maxsz	(op_encode_hdr_maxsz + \
 					 XDR_QUADLEN(NFS4_STATEID_SIZE) + \
 					 1 + /* nl4_type */ \
@@ -143,6 +148,14 @@
 					 decode_sequence_maxsz + \
 					 decode_putfh_maxsz + \
 					 decode_offload_cancel_maxsz)
+#define NFS4_enc_offload_status_sz	(compound_encode_hdr_maxsz + \
+					 encode_sequence_maxsz + \
+					 encode_putfh_maxsz + \
+					 encode_offload_status_maxsz)
+#define NFS4_dec_offload_status_sz	(compound_decode_hdr_maxsz + \
+					 decode_sequence_maxsz + \
+					 decode_putfh_maxsz + \
+					 decode_offload_status_maxsz)
 #define NFS4_enc_copy_notify_sz		(compound_encode_hdr_maxsz + \
 					 encode_putfh_maxsz + \
 					 encode_copy_notify_maxsz)
@@ -343,6 +356,14 @@ static void encode_offload_cancel(struct xdr_stream *xdr,
 	encode_nfs4_stateid(xdr, &args->osa_stateid);
 }
 
+static void encode_offload_status(struct xdr_stream *xdr,
+				  const struct nfs42_offload_status_args *args,
+				  struct compound_hdr *hdr)
+{
+	encode_op_hdr(xdr, OP_OFFLOAD_STATUS, decode_offload_status_maxsz, hdr);
+	encode_nfs4_stateid(xdr, &args->osa_stateid);
+}
+
 static void encode_copy_notify(struct xdr_stream *xdr,
 			       const struct nfs42_copy_notify_args *args,
 			       struct compound_hdr *hdr)
@@ -567,6 +588,25 @@ static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
 	encode_nops(&hdr);
 }
 
+/*
+ * Encode OFFLOAD_STATUS request
+ */
+static void nfs4_xdr_enc_offload_status(struct rpc_rqst *req,
+					struct xdr_stream *xdr,
+					const void *data)
+{
+	const struct nfs42_offload_status_args *args = data;
+	struct compound_hdr hdr = {
+		.minorversion = nfs4_xdr_minorversion(&args->osa_seq_args),
+	};
+
+	encode_compound_hdr(xdr, req, &hdr);
+	encode_sequence(xdr, &args->osa_seq_args, &hdr);
+	encode_putfh(xdr, args->osa_src_fh, &hdr);
+	encode_offload_status(xdr, args, &hdr);
+	encode_nops(&hdr);
+}
+
 /*
  * Encode COPY_NOTIFY request
  */
@@ -919,6 +959,26 @@ static int decode_offload_cancel(struct xdr_stream *xdr,
 	return decode_op_hdr(xdr, OP_OFFLOAD_CANCEL);
 }
 
+static int decode_offload_status(struct xdr_stream *xdr,
+				 struct nfs42_offload_status_res *res)
+{
+	ssize_t result;
+	int status;
+
+	status = decode_op_hdr(xdr, OP_OFFLOAD_STATUS);
+	if (status)
+		return status;
+	/* osr_count */
+	if (xdr_stream_decode_u64(xdr, &res->osr_count) < 0)
+		return -EIO;
+	/* osr_complete<1> */
+	result = xdr_stream_decode_uint32_array(xdr, &res->osr_complete, 1);
+	if (result < 0)
+		return -EIO;
+	res->complete_count = result;
+	return 0;
+}
+
 static int decode_copy_notify(struct xdr_stream *xdr,
 			      struct nfs42_copy_notify_res *res)
 {
@@ -1368,6 +1428,32 @@ static int nfs4_xdr_dec_offload_cancel(struct rpc_rqst *rqstp,
 	return status;
 }
 
+/*
+ * Decode OFFLOAD_STATUS response
+ */
+static int nfs4_xdr_dec_offload_status(struct rpc_rqst *rqstp,
+				       struct xdr_stream *xdr,
+				       void *data)
+{
+	struct nfs42_offload_status_res *res = data;
+	struct compound_hdr hdr;
+	int status;
+
+	status = decode_compound_hdr(xdr, &hdr);
+	if (status)
+		goto out;
+	status = decode_sequence(xdr, &res->osr_seq_res, rqstp);
+	if (status)
+		goto out;
+	status = decode_putfh(xdr);
+	if (status)
+		goto out;
+	status = decode_offload_status(xdr, res);
+
+out:
+	return status;
+}
+
 /*
  * Decode COPY_NOTIFY response
  */
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e8ac3f615f93..08be0a0cce24 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7702,6 +7702,7 @@ const struct rpc_procinfo nfs4_procedures[] = {
 	PROC42(CLONE,		enc_clone,		dec_clone),
 	PROC42(COPY,		enc_copy,		dec_copy),
 	PROC42(OFFLOAD_CANCEL,	enc_offload_cancel,	dec_offload_cancel),
+	PROC42(OFFLOAD_STATUS,	enc_offload_status,	dec_offload_status),
 	PROC42(COPY_NOTIFY,	enc_copy_notify,	dec_copy_notify),
 	PROC(LOOKUPP,		enc_lookupp,		dec_lookupp),
 	PROC42(LAYOUTERROR,	enc_layouterror,	dec_layouterror),
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 8d7430d9f218..5de96243a252 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -695,6 +695,7 @@ enum {
 	NFSPROC4_CLNT_LISTXATTRS,
 	NFSPROC4_CLNT_REMOVEXATTR,
 	NFSPROC4_CLNT_READ_PLUS,
+	NFSPROC4_CLNT_OFFLOAD_STATUS,
 };
 
 /* nfs41 types */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 559273a0f16d..9ac6c7a26b44 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1520,8 +1520,9 @@ struct nfs42_offload_status_args {
 
 struct nfs42_offload_status_res {
 	struct nfs4_sequence_res	osr_seq_res;
-	uint64_t			osr_count;
-	int				osr_status;
+	u64				osr_count;
+	int				complete_count;
+	u32				osr_complete;
 };
 
 struct nfs42_copy_notify_args {
-- 
2.47.0



* [PATCH v1 5/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
  2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
                   ` (3 preceding siblings ...)
  2024-12-03 16:29 ` [PATCH v1 4/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR cel
@ 2024-12-03 16:29 ` cel
  2024-12-13  0:39   ` Olga Kornievskaia
  2024-12-03 16:29 ` [PATCH v1 6/7] NFS: Use " cel
  2024-12-03 16:29 ` [PATCH v1 7/7] NFS: Refactor trace_nfs4_offload_cancel cel
  6 siblings, 1 reply; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Enable the Linux NFS client to observe the progress of an offloaded
asynchronous COPY operation. This new operation will be put to use
in a subsequent patch.
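
The return-value contract documented in the kernel-doc comment below
can be restated as a small user-space function. This is a sketch of
the mapping only, with a hypothetical function name. It assumes a
Linux errno.h, since EREMOTEIO is Linux-specific.

```c
#include <errno.h>

enum { NFS4_OK = 0 };

/*
 * rpc_status:     0 or a negative errno from the RPC itself
 * complete_count: number of elements in the returned status array
 * complete:       completion status, meaningful if complete_count != 0
 */
int offload_status_to_errno(int rpc_status, int complete_count,
			    unsigned int complete)
{
	if (rpc_status)
		return rpc_status;	/* e.g. -EBADF or -EOPNOTSUPP */
	if (!complete_count)
		return -EINPROGRESS;	/* copy is still running */
	if (complete != NFS4_OK)
		return -EREMOTEIO;	/* copy finished with an error */
	return 0;			/* copy finished successfully */
}
```

The order of the checks matters: the completion status is examined
only after the RPC is known to have succeeded and the server is known
to have returned one.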

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs42proc.c        | 117 ++++++++++++++++++++++++++++++++++++++
 fs/nfs/nfs4proc.c         |   3 +-
 include/linux/nfs_fs_sb.h |   1 +
 3 files changed, 120 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 9d716907cf30..fa180ce7c803 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -21,6 +21,8 @@
 
 #define NFSDBG_FACILITY NFSDBG_PROC
 static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);
+static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
+				     u64 *copied);
 
 static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
 {
@@ -582,6 +584,121 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
 	return status;
 }
 
+static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
+{
+	struct nfs42_offload_data *data = calldata;
+
+	nfs41_sequence_done(task, &data->res.osr_seq_res);
+	switch (task->tk_status) {
+	case 0:
+		return;
+	case -NFS4ERR_ADMIN_REVOKED:
+	case -NFS4ERR_BAD_STATEID:
+	case -NFS4ERR_OLD_STATEID:
+		/*
+		 * Server does not recognize the COPY stateid. CB_OFFLOAD
+		 * could have purged it, or server might have rebooted.
+		 * Since COPY stateids don't have an associated inode,
+		 * avoid triggering state recovery.
+		 */
+		task->tk_status = -EBADF;
+		break;
+	case -NFS4ERR_NOTSUPP:
+	case -ENOTSUPP:
+	case -EOPNOTSUPP:
+		data->seq_server->caps &= ~NFS_CAP_OFFLOAD_STATUS;
+		task->tk_status = -EOPNOTSUPP;
+		break;
+	default:
+		if (nfs4_async_handle_error(task, data->seq_server,
+					    NULL, NULL) == -EAGAIN)
+			rpc_restart_call_prepare(task);
+		else
+			task->tk_status = -EIO;
+	}
+}
+
+static const struct rpc_call_ops nfs42_offload_status_ops = {
+	.rpc_call_prepare = nfs42_offload_prepare,
+	.rpc_call_done = nfs42_offload_status_done,
+	.rpc_release = nfs42_offload_release
+};
+
+/**
+ * nfs42_proc_offload_status - Poll completion status of an async copy operation
+ * @file: handle of file being copied
+ * @stateid: copy stateid (from async COPY result)
+ * @copied: OUT: number of bytes copied so far
+ *
+ * Return values:
+ *   %0: Server returned an NFS4_OK completion status
+ *   %-EINPROGRESS: Server returned no completion status
+ *   %-EREMOTEIO: Server returned an error completion status
+ *   %-EBADF: Server did not recognize the copy stateid
+ *   %-EOPNOTSUPP: Server does not support OFFLOAD_STATUS
+ *   %-ERESTARTSYS: Wait interrupted by signal
+ *
+ * Other negative errnos indicate the client could not complete the
+ * request.
+ */
+static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
+				     u64 *copied)
+{
+	struct nfs_open_context *ctx = nfs_file_open_context(file);
+	struct nfs_server *server = NFS_SERVER(file_inode(file));
+	struct nfs42_offload_data *data = NULL;
+	struct rpc_message msg = {
+		.rpc_proc	= &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_STATUS],
+		.rpc_cred	= ctx->cred,
+	};
+	struct rpc_task_setup task_setup_data = {
+		.rpc_client	= server->client,
+		.rpc_message	= &msg,
+		.callback_ops	= &nfs42_offload_status_ops,
+		.workqueue	= nfsiod_workqueue,
+		.flags		= RPC_TASK_ASYNC | RPC_TASK_SOFTCONN,
+	};
+	struct rpc_task *task;
+	int status;
+
+	if (!(server->caps & NFS_CAP_OFFLOAD_STATUS))
+		return -EOPNOTSUPP;
+
+	data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
+	if (data == NULL)
+		return -ENOMEM;
+
+	data->seq_server = server;
+	data->args.osa_src_fh = NFS_FH(file_inode(file));
+	memcpy(&data->args.osa_stateid, stateid,
+		sizeof(data->args.osa_stateid));
+	msg.rpc_argp = &data->args;
+	msg.rpc_resp = &data->res;
+	task_setup_data.callback_data = data;
+	nfs4_init_sequence(&data->args.osa_seq_args, &data->res.osr_seq_res,
+			   1, 0);
+	task = rpc_run_task(&task_setup_data);
+	if (IS_ERR(task)) {
+		nfs42_offload_release(data);
+		return PTR_ERR(task);
+	}
+	status = rpc_wait_for_completion_task(task);
+	if (status)
+		goto out;
+
+	*copied = data->res.osr_count;
+	if (task->tk_status)
+		status = task->tk_status;
+	else if (!data->res.complete_count)
+		status = -EINPROGRESS;
+	else if (data->res.osr_complete != NFS_OK)
+		status = -EREMOTEIO;
+
+out:
+	rpc_put_task(task);
+	return status;
+}
+
 static int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
 				   struct nfs42_copy_notify_args *args,
 				   struct nfs42_copy_notify_res *res)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 405f17e6e0b4..973b8d8fa98b 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -10769,7 +10769,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
 		| NFS_CAP_CLONE
 		| NFS_CAP_LAYOUTERROR
 		| NFS_CAP_READ_PLUS
-		| NFS_CAP_MOVEABLE,
+		| NFS_CAP_MOVEABLE
+		| NFS_CAP_OFFLOAD_STATUS,
 	.init_client = nfs41_init_client,
 	.shutdown_client = nfs41_shutdown_client,
 	.match_stateid = nfs41_match_stateid,
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index b804346a9741..946ca1c28773 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -290,6 +290,7 @@ struct nfs_server {
 #define NFS_CAP_CASE_INSENSITIVE	(1U << 6)
 #define NFS_CAP_CASE_PRESERVING	(1U << 7)
 #define NFS_CAP_REBOOT_LAYOUTRETURN	(1U << 8)
+#define NFS_CAP_OFFLOAD_STATUS	(1U << 9)
 #define NFS_CAP_OPEN_XOR	(1U << 12)
 #define NFS_CAP_DELEGTIME	(1U << 13)
 #define NFS_CAP_POSIX_LOCK	(1U << 14)
-- 
2.47.0



* [PATCH v1 6/7] NFS: Use NFSv4.2's OFFLOAD_STATUS operation
  2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
                   ` (4 preceding siblings ...)
  2024-12-03 16:29 ` [PATCH v1 5/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation cel
@ 2024-12-03 16:29 ` cel
  2024-12-03 16:29 ` [PATCH v1 7/7] NFS: Refactor trace_nfs4_offload_cancel cel
  6 siblings, 0 replies; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker
  Cc: linux-nfs, Chuck Lever, Olga Kornievskaia

From: Chuck Lever <chuck.lever@oracle.com>

We've found that there are cases where a transport disconnection
results in the loss of callback RPCs. NFS servers typically do not
retransmit callback operations after a disconnect.

This can be a problem for the Linux NFS client's current
implementation of asynchronous COPY, which waits indefinitely for a
CB_OFFLOAD callback. If a transport disconnect occurs while an async
COPY is running, there's a good chance the client will never get the
completing CB_OFFLOAD.

Fix this by implementing the OFFLOAD_STATUS operation so that the
Linux NFS client can probe the NFS server if it doesn't see a
CB_OFFLOAD in a reasonable amount of time.

This patch implements a simplistic check. As future work, the client
might also be able to detect when the requested asynchronous COPY
operation is making no forward progress, and CANCEL it.
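
As a reading aid, the way the timeout path in the diff below reacts
to each poll result can be summarized in a user-space sketch. The
type and function names are hypothetical. The default case is an
assumption: the diff has no explicit handling for other errors, so
the sketch passes them through with a zero count.

```c
#include <errno.h>
#include <stdint.h>

enum copy_action {
	COPY_KEEP_WAITING,	/* go back to waiting for CB_OFFLOAD */
	COPY_FINISHED,		/* report success to the caller */
	COPY_FAILED,		/* report an error to the caller */
};

struct copy_outcome {
	enum copy_action action;
	uint64_t count;		/* bytes to report as copied */
	int status;		/* 0 or a negative errno */
};

struct copy_outcome outcome_after_timeout(int poll_status, uint64_t copied)
{
	struct copy_outcome out = { COPY_FAILED, 0, poll_status };

	switch (poll_status) {
	case -EINPROGRESS:	/* still running: wait again */
		out.action = COPY_KEEP_WAITING;
		out.status = 0;
		break;
	case 0:			/* completed on the server */
		out.action = COPY_FINISHED;
		out.count = copied;
		break;
	case -EREMOTEIO:	/* copy failed on the server */
		out.count = copied;
		out.status = -EOPNOTSUPP;
		break;
	case -EBADF:		/* server forgot the copy stateid */
		out.status = -EOPNOTSUPP;
		break;
	default:		/* includes -EOPNOTSUPP: pass through */
		break;
	}
	return out;
}
```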

Suggested-by: Olga Kornievskaia <kolga@netapp.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs42proc.c | 68 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 58 insertions(+), 10 deletions(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index fa180ce7c803..5f2bfca00416 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -175,6 +175,25 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len)
 	return err;
 }
 
+/* Wait this long before checking progress on a COPY operation */
+enum {
+	NFS42_COPY_TIMEOUT	= 3 * HZ,
+};
+
+static void nfs4_copy_dequeue_callback(struct nfs_server *dst_server,
+				       struct nfs_server *src_server,
+				       struct nfs4_copy_state *copy)
+{
+	spin_lock(&dst_server->nfs_client->cl_lock);
+	list_del_init(&copy->copies);
+	spin_unlock(&dst_server->nfs_client->cl_lock);
+	if (dst_server != src_server) {
+		spin_lock(&src_server->nfs_client->cl_lock);
+		list_del_init(&copy->src_copies);
+		spin_unlock(&src_server->nfs_client->cl_lock);
+	}
+}
+
 static int handle_async_copy(struct nfs42_copy_res *res,
 			     struct nfs_server *dst_server,
 			     struct nfs_server *src_server,
@@ -184,9 +203,10 @@ static int handle_async_copy(struct nfs42_copy_res *res,
 			     bool *restart)
 {
 	struct nfs4_copy_state *copy, *tmp_copy = NULL, *iter;
-	int status = NFS4_OK;
 	struct nfs_open_context *dst_ctx = nfs_file_open_context(dst);
 	struct nfs_open_context *src_ctx = nfs_file_open_context(src);
+	int status = NFS4_OK;
+	u64 copied;
 
 	copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL);
 	if (!copy)
@@ -224,15 +244,12 @@ static int handle_async_copy(struct nfs42_copy_res *res,
 		spin_unlock(&src_server->nfs_client->cl_lock);
 	}
 
-	status = wait_for_completion_interruptible(&copy->completion);
-	spin_lock(&dst_server->nfs_client->cl_lock);
-	list_del_init(&copy->copies);
-	spin_unlock(&dst_server->nfs_client->cl_lock);
-	if (dst_server != src_server) {
-		spin_lock(&src_server->nfs_client->cl_lock);
-		list_del_init(&copy->src_copies);
-		spin_unlock(&src_server->nfs_client->cl_lock);
-	}
+wait:
+	status = wait_for_completion_interruptible_timeout(&copy->completion,
+							   NFS42_COPY_TIMEOUT);
+	if (!status)
+		goto timeout;
+	nfs4_copy_dequeue_callback(dst_server, src_server, copy);
 	if (status == -ERESTARTSYS) {
 		goto out_cancel;
 	} else if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) {
@@ -242,6 +259,7 @@ static int handle_async_copy(struct nfs42_copy_res *res,
 	}
 out:
 	res->write_res.count = copy->count;
+	/* Copy out the updated write verifier provided by CB_OFFLOAD. */
 	memcpy(&res->write_res.verifier, &copy->verf, sizeof(copy->verf));
 	status = -copy->error;
 
@@ -253,6 +271,36 @@ static int handle_async_copy(struct nfs42_copy_res *res,
 	if (!nfs42_files_from_same_server(src, dst))
 		nfs42_do_offload_cancel_async(src, src_stateid);
 	goto out_free;
+timeout:
+	status = nfs42_proc_offload_status(dst, &copy->stateid, &copied);
+	if (status == -EINPROGRESS)
+		goto wait;
+	nfs4_copy_dequeue_callback(dst_server, src_server, copy);
+	switch (status) {
+	case 0:
+		/* The server recognized the copy stateid, so it hasn't
+		 * rebooted. Don't overwrite the verifier returned in the
+		 * COPY result. */
+		res->write_res.count = copied;
+		goto out_free;
+	case -EREMOTEIO:
+		/* COPY operation failed on the server. */
+		status = -EOPNOTSUPP;
+		res->write_res.count = copied;
+		goto out_free;
+	case -EBADF:
+		/* Server did not recognize the copy stateid. It has
+		 * probably restarted and lost the plot. */
+		res->write_res.count = 0;
+		status = -EOPNOTSUPP;
+		break;
+	case -EOPNOTSUPP:
+		/* RFC 7862 REQUIREs server to support OFFLOAD_STATUS when
+		 * it has signed up for an async COPY, so server is not
+		 * spec-compliant. */
+		res->write_res.count = 0;
+	}
+	goto out_free;
 }
 
 static int process_copy_commit(struct file *dst, loff_t pos_dst,
-- 
2.47.0



* [PATCH v1 7/7] NFS: Refactor trace_nfs4_offload_cancel
  2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
                   ` (5 preceding siblings ...)
  2024-12-03 16:29 ` [PATCH v1 6/7] NFS: Use " cel
@ 2024-12-03 16:29 ` cel
  6 siblings, 0 replies; 10+ messages in thread
From: cel @ 2024-12-03 16:29 UTC
  To: Olga Kornievskaia, Anna Schumaker; +Cc: linux-nfs, Chuck Lever

From: Chuck Lever <chuck.lever@oracle.com>

Add a trace_nfs4_offload_status trace point that looks just like
trace_nfs4_offload_cancel. Promote that event to an event class to
avoid duplicating code.

An alternative approach would be to expand trace_nfs4_offload_status
to report more of the actual OFFLOAD_STATUS result.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs42proc.c |  2 ++
 fs/nfs/nfs4trace.h | 11 ++++++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 5f2bfca00416..82efaf8720e4 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -636,6 +636,8 @@ static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
 {
 	struct nfs42_offload_data *data = calldata;
 
+	trace_nfs4_offload_status(&data->args, task->tk_status);
+
 	nfs41_sequence_done(task, &data->res.osr_seq_res);
 	switch (task->tk_status) {
 	case 0:
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index 22c973316f0b..bc67fe6801b1 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -2608,7 +2608,7 @@ TRACE_EVENT(nfs4_copy_notify,
 		)
 );
 
-TRACE_EVENT(nfs4_offload_cancel,
+DECLARE_EVENT_CLASS(nfs4_offload_class,
 		TP_PROTO(
 			const struct nfs42_offload_status_args *args,
 			int error
@@ -2640,6 +2640,15 @@ TRACE_EVENT(nfs4_offload_cancel,
 			__entry->stateid_seq, __entry->stateid_hash
 		)
 );
+#define DEFINE_NFS4_OFFLOAD_EVENT(name) \
+	DEFINE_EVENT(nfs4_offload_class, name,  \
+			TP_PROTO( \
+				const struct nfs42_offload_status_args *args, \
+				int error \
+			), \
+			TP_ARGS(args, error))
+DEFINE_NFS4_OFFLOAD_EVENT(nfs4_offload_cancel);
+DEFINE_NFS4_OFFLOAD_EVENT(nfs4_offload_status);
 
 DECLARE_EVENT_CLASS(nfs4_xattr_event,
 		TP_PROTO(
-- 
2.47.0



* Re: [PATCH v1 5/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
  2024-12-03 16:29 ` [PATCH v1 5/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation cel
@ 2024-12-13  0:39   ` Olga Kornievskaia
  2024-12-13 18:40     ` Chuck Lever
  0 siblings, 1 reply; 10+ messages in thread
From: Olga Kornievskaia @ 2024-12-13  0:39 UTC
  To: cel; +Cc: Olga Kornievskaia, Anna Schumaker, linux-nfs, Chuck Lever

On Tue, Dec 3, 2024 at 11:29 AM <cel@kernel.org> wrote:
>
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Enable the Linux NFS client to observe the progress of an offloaded
> asynchronous COPY operation. This new operation will be put to use
> in a subsequent patch.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfs/nfs42proc.c        | 117 ++++++++++++++++++++++++++++++++++++++
>  fs/nfs/nfs4proc.c         |   3 +-
>  include/linux/nfs_fs_sb.h |   1 +
>  3 files changed, 120 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
> index 9d716907cf30..fa180ce7c803 100644
> --- a/fs/nfs/nfs42proc.c
> +++ b/fs/nfs/nfs42proc.c
> @@ -21,6 +21,8 @@
>
>  #define NFSDBG_FACILITY NFSDBG_PROC
>  static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);
> +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
> +                                    u64 *copied);
>
>  static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
>  {
> @@ -582,6 +584,121 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
>         return status;
>  }
>
> +static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
> +{
> +       struct nfs42_offload_data *data = calldata;
> +
> +       nfs41_sequence_done(task, &data->res.osr_seq_res);
> +       switch (task->tk_status) {
> +       case 0:
> +               return;
> +       case -NFS4ERR_ADMIN_REVOKED:
> +       case -NFS4ERR_BAD_STATEID:
> +       case -NFS4ERR_OLD_STATEID:
> +               /*
> +                * Server does not recognize the COPY stateid. CB_OFFLOAD
> +                * could have purged it, or server might have rebooted.
> +                * Since COPY stateids don't have an associated inode,
> +                * avoid triggering state recovery.
> +                */
> +               task->tk_status = -EBADF;
> +               break;
> +       case -NFS4ERR_NOTSUPP:
> +       case -ENOTSUPP:
> +       case -EOPNOTSUPP:
> +               data->seq_server->caps &= ~NFS_CAP_OFFLOAD_STATUS;
> +               task->tk_status = -EOPNOTSUPP;
> +               break;
> +       default:
> +               if (nfs4_async_handle_error(task, data->seq_server,
> +                                           NULL, NULL) == -EAGAIN)
> +                       rpc_restart_call_prepare(task);
> +               else
> +                       task->tk_status = -EIO;
> +       }
> +}
> +
> +static const struct rpc_call_ops nfs42_offload_status_ops = {
> +       .rpc_call_prepare = nfs42_offload_prepare,
> +       .rpc_call_done = nfs42_offload_status_done,
> +       .rpc_release = nfs42_offload_release
> +};
> +
> +/**
> + * nfs42_proc_offload_status - Poll completion status of an async copy operation
> + * @file: handle of file being copied
> + * @stateid: copy stateid (from async COPY result)
> + * @copied: OUT: number of bytes copied so far
> + *
> + * Return values:
> + *   %0: Server returned an NFS4_OK completion status
> + *   %-EINPROGRESS: Server returned no completion status
> + *   %-EREMOTEIO: Server returned an error completion status
> + *   %-EBADF: Server did not recognize the copy stateid
> + *   %-EOPNOTSUPP: Server does not support OFFLOAD_STATUS
> + *   %-ERESTARTSYS: Wait interrupted by signal
> + *
> + * Other negative errnos indicate the client could not complete the
> + * request.
> + */
> +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
> +                                    u64 *copied)
> +{
> +       struct nfs_open_context *ctx = nfs_file_open_context(file);
> +       struct nfs_server *server = NFS_SERVER(file_inode(file));
> +       struct nfs42_offload_data *data = NULL;
> +       struct rpc_message msg = {
> +               .rpc_proc       = &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_STATUS],
> +               .rpc_cred       = ctx->cred,
> +       };
> +       struct rpc_task_setup task_setup_data = {
> +               .rpc_client     = server->client,
> +               .rpc_message    = &msg,
> +               .callback_ops   = &nfs42_offload_status_ops,
> +               .workqueue      = nfsiod_workqueue,
> +               .flags          = RPC_TASK_ASYNC | RPC_TASK_SOFTCONN,

Why are we making OFFLOAD_STATUS an async task? The COPY within
which we do the copy offload is/was a sync task.

Why is it a SOFTCONN task?

> +       };
> +       struct rpc_task *task;
> +       int status;
> +
> +       if (!(server->caps & NFS_CAP_OFFLOAD_STATUS))
> +               return -EOPNOTSUPP;

Let's not forget to mark this task RPC_TASK_MOVEABLE. I know the other
nfs42proc tasks need review to add that too, but since I remembered it
here, let's add it. If this transport is ever moved, the flag lets the
task migrate to another transport.


> +
> +       data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
> +       if (data == NULL)
> +               return -ENOMEM;
> +
> +       data->seq_server = server;
> +       data->args.osa_src_fh = NFS_FH(file_inode(file));
> +       memcpy(&data->args.osa_stateid, stateid,
> +               sizeof(data->args.osa_stateid));
> +       msg.rpc_argp = &data->args;
> +       msg.rpc_resp = &data->res;
> +       task_setup_data.callback_data = data;
> +       nfs4_init_sequence(&data->args.osa_seq_args, &data->res.osr_seq_res,
> +                          1, 0);
> +       task = rpc_run_task(&task_setup_data);
> +       if (IS_ERR(task)) {
> +               nfs42_offload_release(data);
> +               return PTR_ERR(task);
> +       }
> +       status = rpc_wait_for_completion_task(task);
> +       if (status)
> +               goto out;
> +
> +       *copied = data->res.osr_count;
> +       if (task->tk_status)
> +               status = task->tk_status;
> +       else if (!data->res.complete_count)
> +               status = -EINPROGRESS;
> +       else if (data->res.osr_complete != NFS_OK)
> +               status = -EREMOTEIO;
> +
> +out:
> +       rpc_put_task(task);
> +       return status;
> +}
> +
>  static int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
>                                    struct nfs42_copy_notify_args *args,
>                                    struct nfs42_copy_notify_res *res)
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 405f17e6e0b4..973b8d8fa98b 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -10769,7 +10769,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
>                 | NFS_CAP_CLONE
>                 | NFS_CAP_LAYOUTERROR
>                 | NFS_CAP_READ_PLUS
> -               | NFS_CAP_MOVEABLE,
> +               | NFS_CAP_MOVEABLE
> +               | NFS_CAP_OFFLOAD_STATUS,
>         .init_client = nfs41_init_client,
>         .shutdown_client = nfs41_shutdown_client,
>         .match_stateid = nfs41_match_stateid,
> diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
> index b804346a9741..946ca1c28773 100644
> --- a/include/linux/nfs_fs_sb.h
> +++ b/include/linux/nfs_fs_sb.h
> @@ -290,6 +290,7 @@ struct nfs_server {
>  #define NFS_CAP_CASE_INSENSITIVE       (1U << 6)
>  #define NFS_CAP_CASE_PRESERVING        (1U << 7)
>  #define NFS_CAP_REBOOT_LAYOUTRETURN    (1U << 8)
> +#define NFS_CAP_OFFLOAD_STATUS (1U << 9)
>  #define NFS_CAP_OPEN_XOR       (1U << 12)
>  #define NFS_CAP_DELEGTIME      (1U << 13)
>  #define NFS_CAP_POSIX_LOCK     (1U << 14)
> --
> 2.47.0
>
>


* Re: [PATCH v1 5/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
  2024-12-13  0:39   ` Olga Kornievskaia
@ 2024-12-13 18:40     ` Chuck Lever
  0 siblings, 0 replies; 10+ messages in thread
From: Chuck Lever @ 2024-12-13 18:40 UTC (permalink / raw)
  To: Olga Kornievskaia, cel; +Cc: Olga Kornievskaia, Anna Schumaker, linux-nfs

On 12/12/24 7:39 PM, Olga Kornievskaia wrote:
> On Tue, Dec 3, 2024 at 11:29 AM <cel@kernel.org> wrote:
>>
>> From: Chuck Lever <chuck.lever@oracle.com>
>>
>> Enable the Linux NFS client to observe the progress of an offloaded
>> asynchronous COPY operation. This new operation will be put to use
>> in a subsequent patch.
>>
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>   fs/nfs/nfs42proc.c        | 117 ++++++++++++++++++++++++++++++++++++++
>>   fs/nfs/nfs4proc.c         |   3 +-
>>   include/linux/nfs_fs_sb.h |   1 +
>>   3 files changed, 120 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
>> index 9d716907cf30..fa180ce7c803 100644
>> --- a/fs/nfs/nfs42proc.c
>> +++ b/fs/nfs/nfs42proc.c
>> @@ -21,6 +21,8 @@
>>
>>   #define NFSDBG_FACILITY NFSDBG_PROC
>>   static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);
>> +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
>> +                                    u64 *copied);
>>
>>   static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
>>   {
>> @@ -582,6 +584,121 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
>>          return status;
>>   }
>>
>> +static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
>> +{
>> +       struct nfs42_offload_data *data = calldata;
>> +
>> +       nfs41_sequence_done(task, &data->res.osr_seq_res);
>> +       switch (task->tk_status) {
>> +       case 0:
>> +               return;
>> +       case -NFS4ERR_ADMIN_REVOKED:
>> +       case -NFS4ERR_BAD_STATEID:
>> +       case -NFS4ERR_OLD_STATEID:
>> +               /*
>> +                * Server does not recognize the COPY stateid. CB_OFFLOAD
>> +                * could have purged it, or server might have rebooted.
>> +                * Since COPY stateids don't have an associated inode,
>> +                * avoid triggering state recovery.
>> +                */
>> +               task->tk_status = -EBADF;
>> +               break;
>> +       case -NFS4ERR_NOTSUPP:
>> +       case -ENOTSUPP:
>> +       case -EOPNOTSUPP:
>> +               data->seq_server->caps &= ~NFS_CAP_OFFLOAD_STATUS;
>> +               task->tk_status = -EOPNOTSUPP;
>> +               break;
>> +       default:
>> +               if (nfs4_async_handle_error(task, data->seq_server,
>> +                                           NULL, NULL) == -EAGAIN)
>> +                       rpc_restart_call_prepare(task);
>> +               else
>> +                       task->tk_status = -EIO;
>> +       }
>> +}
>> +
>> +static const struct rpc_call_ops nfs42_offload_status_ops = {
>> +       .rpc_call_prepare = nfs42_offload_prepare,
>> +       .rpc_call_done = nfs42_offload_status_done,
>> +       .rpc_release = nfs42_offload_release
>> +};
>> +
>> +/**
>> + * nfs42_proc_offload_status - Poll completion status of an async copy operation
>> + * @file: handle of file being copied
>> + * @stateid: copy stateid (from async COPY result)
>> + * @copied: OUT: number of bytes copied so far
>> + *
>> + * Return values:
>> + *   %0: Server returned an NFS4_OK completion status
>> + *   %-EINPROGRESS: Server returned no completion status
>> + *   %-EREMOTEIO: Server returned an error completion status
>> + *   %-EBADF: Server did not recognize the copy stateid
>> + *   %-EOPNOTSUPP: Server does not support OFFLOAD_STATUS
>> + *   %-ERESTARTSYS: Wait interrupted by signal
>> + *
>> + * Other negative errnos indicate the client could not complete the
>> + * request.
>> + */
>> +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
>> +                                    u64 *copied)
>> +{
>> +       struct nfs_open_context *ctx = nfs_file_open_context(file);
>> +       struct nfs_server *server = NFS_SERVER(file_inode(file));
>> +       struct nfs42_offload_data *data = NULL;
>> +       struct rpc_message msg = {
>> +               .rpc_proc       = &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_STATUS],
>> +               .rpc_cred       = ctx->cred,
>> +       };
>> +       struct rpc_task_setup task_setup_data = {
>> +               .rpc_client     = server->client,
>> +               .rpc_message    = &msg,
>> +               .callback_ops   = &nfs42_offload_status_ops,
>> +               .workqueue      = nfsiod_workqueue,
>> +               .flags          = RPC_TASK_ASYNC | RPC_TASK_SOFTCONN,
> 
> Why are we making OFFLOAD_STATUS an async task? The COPY within
> which we do the copy offload is/was a sync task.

I tried it as a sync task, but there were some issues with that
which I no longer recall.


> Why is it a SOFTCONN task?

If there is no existing connection to the server, fail immediately
instead of waiting for minutes to reconnect.

Otherwise, just like RENEW, the client will stack up a bunch of
OFFLOAD_STATUS operations as long as the RPC transport is down.

Also, when retrying, wait interruptibly -- that way a ^C will work
as expected.
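
To illustrate why failing fast matters to the caller, here is a minimal
userspace sketch of a polling loop built on the return values documented
in the patch. Everything in it is hypothetical: fake_offload_status()
stands in for nfs42_proc_offload_status(), the scripted return codes and
the 4096-byte progress step are made up, and -ENOTCONN is only an assumed
example of how a SOFTCONN failure might surface to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the server: replays a scripted list of
 * return codes so the loop below can run in userspace.
 */
struct fake_server {
	const int *script;		/* return codes to hand out, in order */
	int len;			/* number of entries in script */
	int pos;			/* next entry to hand out */
	unsigned long long bytes;	/* bytes reported as copied so far */
};

/* Hypothetical replacement for nfs42_proc_offload_status(). */
static int fake_offload_status(struct fake_server *srv,
			       unsigned long long *copied)
{
	int rc;

	assert(srv != NULL && copied != NULL);
	rc = srv->pos < srv->len ? srv->script[srv->pos++] : -EIO;
	srv->bytes += 4096;		/* made-up progress per poll */
	*copied = srv->bytes;
	return rc;
}

/*
 * Poll until the copy completes, a hard error is returned, or
 * max_polls is exhausted. Returns the last status seen.
 */
static int poll_copy(struct fake_server *srv, int max_polls,
		     unsigned long long *copied)
{
	int rc = -EINPROGRESS;
	int i;

	for (i = 0; i < max_polls; i++) {
		rc = fake_offload_status(srv, copied);
		if (rc != -EINPROGRESS)
			break;		/* done, or an error: stop polling */
	}
	return rc;
}
```

In this sketch the caller treats 0 as completion, -EINPROGRESS as a
reason to poll again, and any other value as a reason to stop. A prompt
connection error ends the loop right away instead of leaving polls
queued behind a dead transport.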


>> +       };
>> +       struct rpc_task *task;
>> +       int status;
>> +
>> +       if (!(server->caps & NFS_CAP_OFFLOAD_STATUS))
>> +               return -EOPNOTSUPP;
> 
> Let's not forget to mark this task RPC_TASK_MOVEABLE. I know the other
> nfs42proc tasks need review to add that too, but since I remembered it
> here, let's add it. If this transport is ever moved, the flag lets the
> task migrate to another transport.

OK.
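
One way the flag selection might end up looking is sketched below as a
standalone userspace model. It assumes the flag would be gated on the
server's NFS_CAP_MOVEABLE capability, which is not stated in this thread.
The helper offload_status_task_flags() is hypothetical, and all bit
values are placeholders rather than the kernel's definitions.

```c
/* Placeholder bit values, for illustration only. */
#define RPC_TASK_ASYNC		0x0001u
#define RPC_TASK_MOVEABLE	0x0004u
#define RPC_TASK_SOFTCONN	0x0400u
#define NFS_CAP_MOVEABLE	(1U << 0)	/* hypothetical bit position */

/*
 * Hypothetical helper: choose the RPC task flags for an
 * OFFLOAD_STATUS call based on the server's capability mask.
 */
static unsigned int offload_status_task_flags(unsigned int server_caps)
{
	/* Always async and soft-connect, as in the posted patch. */
	unsigned int flags = RPC_TASK_ASYNC | RPC_TASK_SOFTCONN;

	/* Assumed gating: only mark the task moveable if the server is. */
	if (server_caps & NFS_CAP_MOVEABLE)
		flags |= RPC_TASK_MOVEABLE;
	return flags;
}
```

In the patch itself this would presumably amount to OR-ing the extra
flag into task_setup_data.flags before rpc_run_task() is called.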


>> +
>> +       data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
>> +       if (data == NULL)
>> +               return -ENOMEM;
>> +
>> +       data->seq_server = server;
>> +       data->args.osa_src_fh = NFS_FH(file_inode(file));
>> +       memcpy(&data->args.osa_stateid, stateid,
>> +               sizeof(data->args.osa_stateid));
>> +       msg.rpc_argp = &data->args;
>> +       msg.rpc_resp = &data->res;
>> +       task_setup_data.callback_data = data;
>> +       nfs4_init_sequence(&data->args.osa_seq_args, &data->res.osr_seq_res,
>> +                          1, 0);
>> +       task = rpc_run_task(&task_setup_data);
>> +       if (IS_ERR(task)) {
>> +               nfs42_offload_release(data);
>> +               return PTR_ERR(task);
>> +       }
>> +       status = rpc_wait_for_completion_task(task);
>> +       if (status)
>> +               goto out;
>> +
>> +       *copied = data->res.osr_count;
>> +       if (task->tk_status)
>> +               status = task->tk_status;
>> +       else if (!data->res.complete_count)
>> +               status = -EINPROGRESS;
>> +       else if (data->res.osr_complete != NFS_OK)
>> +               status = -EREMOTEIO;
>> +
>> +out:
>> +       rpc_put_task(task);
>> +       return status;
>> +}
>> +
>>   static int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
>>                                     struct nfs42_copy_notify_args *args,
>>                                     struct nfs42_copy_notify_res *res)
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 405f17e6e0b4..973b8d8fa98b 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -10769,7 +10769,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
>>                  | NFS_CAP_CLONE
>>                  | NFS_CAP_LAYOUTERROR
>>                  | NFS_CAP_READ_PLUS
>> -               | NFS_CAP_MOVEABLE,
>> +               | NFS_CAP_MOVEABLE
>> +               | NFS_CAP_OFFLOAD_STATUS,
>>          .init_client = nfs41_init_client,
>>          .shutdown_client = nfs41_shutdown_client,
>>          .match_stateid = nfs41_match_stateid,
>> diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
>> index b804346a9741..946ca1c28773 100644
>> --- a/include/linux/nfs_fs_sb.h
>> +++ b/include/linux/nfs_fs_sb.h
>> @@ -290,6 +290,7 @@ struct nfs_server {
>>   #define NFS_CAP_CASE_INSENSITIVE       (1U << 6)
>>   #define NFS_CAP_CASE_PRESERVING        (1U << 7)
>>   #define NFS_CAP_REBOOT_LAYOUTRETURN    (1U << 8)
>> +#define NFS_CAP_OFFLOAD_STATUS (1U << 9)
>>   #define NFS_CAP_OPEN_XOR       (1U << 12)
>>   #define NFS_CAP_DELEGTIME      (1U << 13)
>>   #define NFS_CAP_POSIX_LOCK     (1U << 14)
>> --
>> 2.47.0
>>
>>


-- 
Chuck Lever


end of thread, other threads:[~2024-12-13 18:41 UTC | newest]

Thread overview: 10+ messages
2024-12-03 16:29 [PATCH v1 0/7] Client-side OFFLOAD_STATUS implementation cel
2024-12-03 16:29 ` [PATCH v1 1/7] NFS: CB_OFFLOAD can return NFS4ERR_DELAY cel
2024-12-03 16:29 ` [PATCH v1 2/7] NFS: Fix typo in OFFLOAD_CANCEL comment cel
2024-12-03 16:29 ` [PATCH v1 3/7] NFS: Rename struct nfs4_offloadcancel_data cel
2024-12-03 16:29 ` [PATCH v1 4/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR cel
2024-12-03 16:29 ` [PATCH v1 5/7] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation cel
2024-12-13  0:39   ` Olga Kornievskaia
2024-12-13 18:40     ` Chuck Lever
2024-12-03 16:29 ` [PATCH v1 6/7] NFS: Use " cel
2024-12-03 16:29 ` [PATCH v1 7/7] NFS: Refactor trace_nfs4_offload_cancel cel
