* [RFC PATCH 1/9] NFS: CB_OFFLOAD can return NFS4ERR_DELAY
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
@ 2024-10-08 13:47 ` cel
2024-10-08 13:47 ` [RFC PATCH 2/9] NFSD: Free async copy information in nfsd4_cb_offload_release() cel
` (8 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
RFC 7862 permits the callback service to respond to a CB_OFFLOAD
operation with NFS4ERR_DELAY. Use that instead of
NFS4ERR_SERVERFAULT for temporary memory allocation failure, as that
is more consistent with how other operations report memory
allocation failure.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/callback_proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c
index 7832fb0369a1..8397c43358bd 100644
--- a/fs/nfs/callback_proc.c
+++ b/fs/nfs/callback_proc.c
@@ -718,7 +718,7 @@ __be32 nfs4_callback_offload(void *data, void *dummy,
copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL);
if (!copy)
- return htonl(NFS4ERR_SERVERFAULT);
+ return cpu_to_be32(NFS4ERR_DELAY);
spin_lock(&cps->clp->cl_lock);
rcu_read_lock();
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* [RFC PATCH 2/9] NFSD: Free async copy information in nfsd4_cb_offload_release()
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
2024-10-08 13:47 ` [RFC PATCH 1/9] NFS: CB_OFFLOAD can return NFS4ERR_DELAY cel
@ 2024-10-08 13:47 ` cel
2024-10-09 15:14 ` Chuck Lever
2024-10-08 13:47 ` [RFC PATCH 3/9] NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD cel
` (7 subsequent siblings)
9 siblings, 1 reply; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
RFC 7862 Section 4.8 states:
> A copy offload stateid will be valid until either (A) the client
> or server restarts or (B) the client returns the resource by
> issuing an OFFLOAD_CANCEL operation or the client replies to a
> CB_OFFLOAD operation.
Currently, NFSD purges the metadata for an async COPY operation as
soon as the CB_OFFLOAD callback has been sent. It does not wait for
even the client's CB_OFFLOAD response, as the paragraph above
suggests that it should.
This makes the OFFLOAD_STATUS operation ineffective in the window
between the completion of the COPY and the server's receipt of the
CB_OFFLOAD response. This is important if, for example, the client
responds with NFS4ERR_DELAY, or the transport is lost before the
server receives the response. A client might use OFFLOAD_STATUS to
query the server about the missing CB_OFFLOAD, but NFSD will respond
to OFFLOAD_STATUS as if it had never heard of the presented copy
stateid.
This patch starts to address this issue by extending the lifetime of
struct nfsd4_copy at least until the server has seen the CB_OFFLOAD
response, or its CB_OFFLOAD operation has timed out.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs4proc.c | 18 +++++++++++-------
fs/nfsd/xdr4.h | 3 +++
2 files changed, 14 insertions(+), 7 deletions(-)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index b5a6bf4f459f..a3c564a9596c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -57,6 +57,8 @@ module_param(inter_copy_offload_enable, bool, 0644);
MODULE_PARM_DESC(inter_copy_offload_enable,
"Enable inter server to server copy offload. Default: false");
+static void cleanup_async_copy(struct nfsd4_copy *copy);
+
#ifdef CONFIG_NFSD_V4_2_INTER_SSC
static int nfsd4_ssc_umount_timeout = 900000; /* default to 15 mins */
module_param(nfsd4_ssc_umount_timeout, int, 0644);
@@ -1598,8 +1600,10 @@ static void nfsd4_cb_offload_release(struct nfsd4_callback *cb)
{
struct nfsd4_cb_offload *cbo =
container_of(cb, struct nfsd4_cb_offload, co_cb);
+ struct nfsd4_copy *copy =
+ container_of(cbo, struct nfsd4_copy, cp_cb_offload);
- kfree(cbo);
+ cleanup_async_copy(copy);
}
static int nfsd4_cb_offload_done(struct nfsd4_callback *cb,
@@ -1730,13 +1734,10 @@ static void cleanup_async_copy(struct nfsd4_copy *copy)
nfs4_put_copy(copy);
}
+/* Kick off a CB_OFFLOAD callback, but don't wait for the response */
static void nfsd4_send_cb_offload(struct nfsd4_copy *copy)
{
- struct nfsd4_cb_offload *cbo;
-
- cbo = kzalloc(sizeof(*cbo), GFP_KERNEL);
- if (!cbo)
- return;
+ struct nfsd4_cb_offload *cbo = ©->cp_cb_offload;
memcpy(&cbo->co_res, ©->cp_res, sizeof(copy->cp_res));
memcpy(&cbo->co_fh, ©->fh, sizeof(copy->fh));
@@ -1786,10 +1787,13 @@ static int nfsd4_do_async_copy(void *data)
}
do_callback:
+ /* The kthread exits forthwith. Ensure that a subsequent
+ * OFFLOAD_CANCEL won't try to kill it again. */
+ set_bit(NFSD4_COPY_F_STOPPED, ©->cp_flags);
+
set_bit(NFSD4_COPY_F_COMPLETED, ©->cp_flags);
trace_nfsd_copy_async_done(copy);
nfsd4_send_cb_offload(copy);
- cleanup_async_copy(copy);
return 0;
}
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index 2a21a7662e03..dec29afa43f3 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -699,6 +699,9 @@ struct nfsd4_copy {
struct nfsd42_write_res cp_res;
struct knfsd_fh fh;
+ /* offload callback */
+ struct nfsd4_cb_offload cp_cb_offload;
+
struct nfs4_client *cp_clp;
struct nfsd_file *nf_src;
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [RFC PATCH 2/9] NFSD: Free async copy information in nfsd4_cb_offload_release()
2024-10-08 13:47 ` [RFC PATCH 2/9] NFSD: Free async copy information in nfsd4_cb_offload_release() cel
@ 2024-10-09 15:14 ` Chuck Lever
0 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2024-10-09 15:14 UTC (permalink / raw)
To: cel
Cc: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
linux-nfs
On Tue, Oct 08, 2024 at 09:47:21AM -0400, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> RFC 7862 Section 4.8 states:
>
> > A copy offload stateid will be valid until either (A) the client
> > or server restarts or (B) the client returns the resource by
> > issuing an OFFLOAD_CANCEL operation or the client replies to a
> > CB_OFFLOAD operation.
>
> Currently, NFSD purges the metadata for an async COPY operation as
> soon as the CB_OFFLOAD callback has been sent. It does not wait for
> even the client's CB_OFFLOAD response, as the paragraph above
> suggests that it should.
>
> This makes the OFFLOAD_STATUS operation ineffective in the window
> between the completion of the COPY and the server's receipt of the
> CB_OFFLOAD response. This is important if, for example, the client
> responds with NFS4ERR_DELAY, or the transport is lost before the
> server receives the response. A client might use OFFLOAD_STATUS to
> query the server about the missing CB_OFFLOAD, but NFSD will respond
> to OFFLOAD_STATUS as if it had never heard of the presented copy
> stateid.
>
> This patch starts to address this issue by extending the lifetime of
> struct nfsd4_copy at least until the server has seen the CB_OFFLOAD
> response, or its CB_OFFLOAD operation has timed out.
An OFFLOAD_CANCEL that is sent after the COPY completes but before
the CB_OFFLOAD reply is received will free the struct nfsd4_copy.
Won't that cause a UAF in nfsd4_cb_offload_release or
nfsd4_cb_offload_done ?
If I bump the struct's reference count, would that help?
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/nfs4proc.c | 18 +++++++++++-------
> fs/nfsd/xdr4.h | 3 +++
> 2 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index b5a6bf4f459f..a3c564a9596c 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -57,6 +57,8 @@ module_param(inter_copy_offload_enable, bool, 0644);
> MODULE_PARM_DESC(inter_copy_offload_enable,
> "Enable inter server to server copy offload. Default: false");
>
> +static void cleanup_async_copy(struct nfsd4_copy *copy);
> +
> #ifdef CONFIG_NFSD_V4_2_INTER_SSC
> static int nfsd4_ssc_umount_timeout = 900000; /* default to 15 mins */
> module_param(nfsd4_ssc_umount_timeout, int, 0644);
> @@ -1598,8 +1600,10 @@ static void nfsd4_cb_offload_release(struct nfsd4_callback *cb)
> {
> struct nfsd4_cb_offload *cbo =
> container_of(cb, struct nfsd4_cb_offload, co_cb);
> + struct nfsd4_copy *copy =
> + container_of(cbo, struct nfsd4_copy, cp_cb_offload);
>
> - kfree(cbo);
> + cleanup_async_copy(copy);
> }
>
> static int nfsd4_cb_offload_done(struct nfsd4_callback *cb,
> @@ -1730,13 +1734,10 @@ static void cleanup_async_copy(struct nfsd4_copy *copy)
> nfs4_put_copy(copy);
> }
>
> +/* Kick off a CB_OFFLOAD callback, but don't wait for the response */
> static void nfsd4_send_cb_offload(struct nfsd4_copy *copy)
> {
> - struct nfsd4_cb_offload *cbo;
> -
> - cbo = kzalloc(sizeof(*cbo), GFP_KERNEL);
> - if (!cbo)
> - return;
> + struct nfsd4_cb_offload *cbo = ©->cp_cb_offload;
>
> memcpy(&cbo->co_res, ©->cp_res, sizeof(copy->cp_res));
> memcpy(&cbo->co_fh, ©->fh, sizeof(copy->fh));
> @@ -1786,10 +1787,13 @@ static int nfsd4_do_async_copy(void *data)
> }
>
> do_callback:
> + /* The kthread exits forthwith. Ensure that a subsequent
> + * OFFLOAD_CANCEL won't try to kill it again. */
> + set_bit(NFSD4_COPY_F_STOPPED, ©->cp_flags);
> +
> set_bit(NFSD4_COPY_F_COMPLETED, ©->cp_flags);
> trace_nfsd_copy_async_done(copy);
> nfsd4_send_cb_offload(copy);
> - cleanup_async_copy(copy);
> return 0;
> }
>
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index 2a21a7662e03..dec29afa43f3 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -699,6 +699,9 @@ struct nfsd4_copy {
> struct nfsd42_write_res cp_res;
> struct knfsd_fh fh;
>
> + /* offload callback */
> + struct nfsd4_cb_offload cp_cb_offload;
> +
> struct nfs4_client *cp_clp;
>
> struct nfsd_file *nf_src;
> --
> 2.46.2
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 3/9] NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
2024-10-08 13:47 ` [RFC PATCH 1/9] NFS: CB_OFFLOAD can return NFS4ERR_DELAY cel
2024-10-08 13:47 ` [RFC PATCH 2/9] NFSD: Free async copy information in nfsd4_cb_offload_release() cel
@ 2024-10-08 13:47 ` cel
2024-10-08 21:54 ` NeilBrown
2024-10-08 13:47 ` [RFC PATCH 4/9] NFS: Fix typo in OFFLOAD_CANCEL comment cel
` (6 subsequent siblings)
9 siblings, 1 reply; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
RFC 7862 permits callback services to respond to CB_OFFLOAD with
NFS4ERR_DELAY. Currently NFSD drops the CB_OFFLOAD in that case.
To improve the reliability of COPY offload, NFSD should rather send
another CB_OFFLOAD completion notification.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfsd/nfs4proc.c | 8 ++++++++
fs/nfsd/xdr4.h | 1 +
2 files changed, 9 insertions(+)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index a3c564a9596c..02e73ebbfe5c 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1613,6 +1613,13 @@ static int nfsd4_cb_offload_done(struct nfsd4_callback *cb,
container_of(cb, struct nfsd4_cb_offload, co_cb);
trace_nfsd_cb_offload_done(&cbo->co_res.cb_stateid, task);
+ switch (task->tk_status) {
+ case -NFS4ERR_DELAY:
+ if (cbo->co_retries--) {
+ rpc_delay(task, 1 * HZ);
+ return 0;
+ }
+ }
return 1;
}
@@ -1742,6 +1749,7 @@ static void nfsd4_send_cb_offload(struct nfsd4_copy *copy)
memcpy(&cbo->co_res, ©->cp_res, sizeof(copy->cp_res));
memcpy(&cbo->co_fh, ©->fh, sizeof(copy->fh));
cbo->co_nfserr = copy->nfserr;
+ cbo->co_retries = 5;
nfsd4_init_cb(&cbo->co_cb, copy->cp_clp, &nfsd4_cb_offload_ops,
NFSPROC4_CLNT_CB_OFFLOAD);
diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
index dec29afa43f3..cd2bf63651e3 100644
--- a/fs/nfsd/xdr4.h
+++ b/fs/nfsd/xdr4.h
@@ -675,6 +675,7 @@ struct nfsd4_cb_offload {
struct nfsd4_callback co_cb;
struct nfsd42_write_res co_res;
__be32 co_nfserr;
+ unsigned int co_retries;
struct knfsd_fh co_fh;
};
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [RFC PATCH 3/9] NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD
2024-10-08 13:47 ` [RFC PATCH 3/9] NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD cel
@ 2024-10-08 21:54 ` NeilBrown
2024-10-09 14:10 ` Chuck Lever
0 siblings, 1 reply; 19+ messages in thread
From: NeilBrown @ 2024-10-08 21:54 UTC (permalink / raw)
To: cel
Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
Chuck Lever
On Wed, 09 Oct 2024, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> RFC 7862 permits callback services to respond to CB_OFFLOAD with
> NFS4ERR_DELAY. Currently NFSD drops the CB_OFFLOAD in that case.
>
> To improve the reliability of COPY offload, NFSD should rather send
> another CB_OFFLOAD completion notification.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfsd/nfs4proc.c | 8 ++++++++
> fs/nfsd/xdr4.h | 1 +
> 2 files changed, 9 insertions(+)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index a3c564a9596c..02e73ebbfe5c 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1613,6 +1613,13 @@ static int nfsd4_cb_offload_done(struct nfsd4_callback *cb,
> container_of(cb, struct nfsd4_cb_offload, co_cb);
>
> trace_nfsd_cb_offload_done(&cbo->co_res.cb_stateid, task);
> + switch (task->tk_status) {
> + case -NFS4ERR_DELAY:
> + if (cbo->co_retries--) {
> + rpc_delay(task, 1 * HZ);
> + return 0;
Is 5 tries at 1 second interval really sufficient?
It is common to double the delay on each retry failure, so delays of
1,2,4,8,16 would give at total of 30 seconds for the client to get over
whatever congestion is affecting it. That seems safer.
NeilBrown
> + }
> + }
> return 1;
> }
>
> @@ -1742,6 +1749,7 @@ static void nfsd4_send_cb_offload(struct nfsd4_copy *copy)
> memcpy(&cbo->co_res, ©->cp_res, sizeof(copy->cp_res));
> memcpy(&cbo->co_fh, ©->fh, sizeof(copy->fh));
> cbo->co_nfserr = copy->nfserr;
> + cbo->co_retries = 5;
>
> nfsd4_init_cb(&cbo->co_cb, copy->cp_clp, &nfsd4_cb_offload_ops,
> NFSPROC4_CLNT_CB_OFFLOAD);
> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> index dec29afa43f3..cd2bf63651e3 100644
> --- a/fs/nfsd/xdr4.h
> +++ b/fs/nfsd/xdr4.h
> @@ -675,6 +675,7 @@ struct nfsd4_cb_offload {
> struct nfsd4_callback co_cb;
> struct nfsd42_write_res co_res;
> __be32 co_nfserr;
> + unsigned int co_retries;
> struct knfsd_fh co_fh;
> };
>
> --
> 2.46.2
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [RFC PATCH 3/9] NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD
2024-10-08 21:54 ` NeilBrown
@ 2024-10-09 14:10 ` Chuck Lever
0 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2024-10-09 14:10 UTC (permalink / raw)
To: NeilBrown
Cc: cel@kernel.org, Jeff Layton, Olga Kornievskaia, Dai Ngo,
Tom Talpey, linux-nfs@vger.kernel.org
On Tue, Oct 08, 2024 at 05:54:27PM -0400, NeilBrown wrote:
> On Wed, 09 Oct 2024, cel@kernel.org wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > RFC 7862 permits callback services to respond to CB_OFFLOAD with
> > NFS4ERR_DELAY. Currently NFSD drops the CB_OFFLOAD in that case.
> >
> > To improve the reliability of COPY offload, NFSD should rather send
> > another CB_OFFLOAD completion notification.
> >
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> > fs/nfsd/nfs4proc.c | 8 ++++++++
> > fs/nfsd/xdr4.h | 1 +
> > 2 files changed, 9 insertions(+)
> >
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index a3c564a9596c..02e73ebbfe5c 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -1613,6 +1613,13 @@ static int nfsd4_cb_offload_done(struct nfsd4_callback *cb,
> > container_of(cb, struct nfsd4_cb_offload, co_cb);
> >
> > trace_nfsd_cb_offload_done(&cbo->co_res.cb_stateid, task);
> > + switch (task->tk_status) {
> > + case -NFS4ERR_DELAY:
> > + if (cbo->co_retries--) {
> > + rpc_delay(task, 1 * HZ);
> > + return 0;
>
> Is 5 tries at 1 second interval really sufficient?
It doesn't matter, as long as the client can send an OFFLOAD_STATUS
when it hasn't seen the expected CB_OFFLOAD. In fact IMO an even
shorter delay would be better.
This is not a situation where the service endpoint is waiting for a
slow I/O device. The important part of this logic is the retry, not
the delay.
> It is common to double the delay on each retry failure, so delays of
> 1,2,4,8,16 would give at total of 30 seconds for the client to get over
> whatever congestion is affecting it. That seems safer.
I didn't find other callback operations in NFSD that implemented
exponential backoff.
I could compromise and do .1 sec, .2 sec, .4 sec, .8 sec, 1.6 sec.
> NeilBrown
>
> > + }
> > + }
> > return 1;
> > }
> >
> > @@ -1742,6 +1749,7 @@ static void nfsd4_send_cb_offload(struct nfsd4_copy *copy)
> > memcpy(&cbo->co_res, ©->cp_res, sizeof(copy->cp_res));
> > memcpy(&cbo->co_fh, ©->fh, sizeof(copy->fh));
> > cbo->co_nfserr = copy->nfserr;
> > + cbo->co_retries = 5;
> >
> > nfsd4_init_cb(&cbo->co_cb, copy->cp_clp, &nfsd4_cb_offload_ops,
> > NFSPROC4_CLNT_CB_OFFLOAD);
> > diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> > index dec29afa43f3..cd2bf63651e3 100644
> > --- a/fs/nfsd/xdr4.h
> > +++ b/fs/nfsd/xdr4.h
> > @@ -675,6 +675,7 @@ struct nfsd4_cb_offload {
> > struct nfsd4_callback co_cb;
> > struct nfsd42_write_res co_res;
> > __be32 co_nfserr;
> > + unsigned int co_retries;
> > struct knfsd_fh co_fh;
> > };
> >
> > --
> > 2.46.2
> >
> >
> >
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 4/9] NFS: Fix typo in OFFLOAD_CANCEL comment
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
` (2 preceding siblings ...)
2024-10-08 13:47 ` [RFC PATCH 3/9] NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD cel
@ 2024-10-08 13:47 ` cel
2024-10-08 13:47 ` [RFC PATCH 5/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR cel
` (5 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/nfs42xdr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index 9e3ae53e2205..ef5730c5e704 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -549,7 +549,7 @@ static void nfs4_xdr_enc_copy(struct rpc_rqst *req,
}
/*
- * Encode OFFLOAD_CANEL request
+ * Encode OFFLOAD_CANCEL request
*/
static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
struct xdr_stream *xdr,
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* [RFC PATCH 5/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
` (3 preceding siblings ...)
2024-10-08 13:47 ` [RFC PATCH 4/9] NFS: Fix typo in OFFLOAD_CANCEL comment cel
@ 2024-10-08 13:47 ` cel
2024-10-08 22:00 ` NeilBrown
2024-10-08 13:47 ` [RFC PATCH 6/9] NFS: Rename struct nfs4_offloadcancel_data cel
` (4 subsequent siblings)
9 siblings, 1 reply; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Add XDR encoding and decoding functions for the NFSv4.2
OFFLOAD_STATUS operation.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/nfs42xdr.c | 86 +++++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4xdr.c | 1 +
include/linux/nfs4.h | 1 +
include/linux/nfs_xdr.h | 5 ++-
4 files changed, 91 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index ef5730c5e704..ad3d1293f917 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -35,6 +35,11 @@
#define encode_offload_cancel_maxsz (op_encode_hdr_maxsz + \
XDR_QUADLEN(NFS4_STATEID_SIZE))
#define decode_offload_cancel_maxsz (op_decode_hdr_maxsz)
+#define encode_offload_status_maxsz (op_encode_hdr_maxsz + \
+ XDR_QUADLEN(NFS4_STATEID_SIZE))
+#define decode_offload_status_maxsz (op_decode_hdr_maxsz + \
+ 2 /* osr_count */ + \
+ 2 /* osr_complete */)
#define encode_copy_notify_maxsz (op_encode_hdr_maxsz + \
XDR_QUADLEN(NFS4_STATEID_SIZE) + \
1 + /* nl4_type */ \
@@ -143,6 +148,14 @@
decode_sequence_maxsz + \
decode_putfh_maxsz + \
decode_offload_cancel_maxsz)
+#define NFS4_enc_offload_status_sz (compound_encode_hdr_maxsz + \
+ encode_sequence_maxsz + \
+ encode_putfh_maxsz + \
+ encode_offload_status_maxsz)
+#define NFS4_dec_offload_status_sz (compound_decode_hdr_maxsz + \
+ decode_sequence_maxsz + \
+ decode_putfh_maxsz + \
+ decode_offload_status_maxsz)
#define NFS4_enc_copy_notify_sz (compound_encode_hdr_maxsz + \
encode_putfh_maxsz + \
encode_copy_notify_maxsz)
@@ -343,6 +356,14 @@ static void encode_offload_cancel(struct xdr_stream *xdr,
encode_nfs4_stateid(xdr, &args->osa_stateid);
}
+static void encode_offload_status(struct xdr_stream *xdr,
+ const struct nfs42_offload_status_args *args,
+ struct compound_hdr *hdr)
+{
+ encode_op_hdr(xdr, OP_OFFLOAD_STATUS, decode_offload_status_maxsz, hdr);
+ encode_nfs4_stateid(xdr, &args->osa_stateid);
+}
+
static void encode_copy_notify(struct xdr_stream *xdr,
const struct nfs42_copy_notify_args *args,
struct compound_hdr *hdr)
@@ -567,6 +588,25 @@ static void nfs4_xdr_enc_offload_cancel(struct rpc_rqst *req,
encode_nops(&hdr);
}
+/*
+ * Encode OFFLOAD_STATUS request
+ */
+static void nfs4_xdr_enc_offload_status(struct rpc_rqst *req,
+ struct xdr_stream *xdr,
+ const void *data)
+{
+ const struct nfs42_offload_status_args *args = data;
+ struct compound_hdr hdr = {
+ .minorversion = nfs4_xdr_minorversion(&args->osa_seq_args),
+ };
+
+ encode_compound_hdr(xdr, req, &hdr);
+ encode_sequence(xdr, &args->osa_seq_args, &hdr);
+ encode_putfh(xdr, args->osa_src_fh, &hdr);
+ encode_offload_status(xdr, args, &hdr);
+ encode_nops(&hdr);
+}
+
/*
* Encode COPY_NOTIFY request
*/
@@ -919,6 +959,26 @@ static int decode_offload_cancel(struct xdr_stream *xdr,
return decode_op_hdr(xdr, OP_OFFLOAD_CANCEL);
}
+static int decode_offload_status(struct xdr_stream *xdr,
+ struct nfs42_offload_status_res *res)
+{
+ ssize_t result;
+ int status;
+
+ status = decode_op_hdr(xdr, OP_OFFLOAD_STATUS);
+ if (status)
+ return status;
+ /* osr_count */
+ if (xdr_stream_decode_u64(xdr, &res->osr_count) < 0)
+ return -EIO;
+ /* osr_complete<1> */
+ result = xdr_stream_decode_uint32_array(xdr, res->osr_complete, 1);
+ if (result < 0)
+ return -EIO;
+ res->complete_count = result;
+ return 0;
+}
+
static int decode_copy_notify(struct xdr_stream *xdr,
struct nfs42_copy_notify_res *res)
{
@@ -1368,6 +1428,32 @@ static int nfs4_xdr_dec_offload_cancel(struct rpc_rqst *rqstp,
return status;
}
+/*
+ * Decode OFFLOAD_STATUS response
+ */
+static int nfs4_xdr_dec_offload_status(struct rpc_rqst *rqstp,
+ struct xdr_stream *xdr,
+ void *data)
+{
+ struct nfs42_offload_status_res *res = data;
+ struct compound_hdr hdr;
+ int status;
+
+ status = decode_compound_hdr(xdr, &hdr);
+ if (status)
+ goto out;
+ status = decode_sequence(xdr, &res->osr_seq_res, rqstp);
+ if (status)
+ goto out;
+ status = decode_putfh(xdr);
+ if (status)
+ goto out;
+ status = decode_offload_status(xdr, res);
+
+out:
+ return status;
+}
+
/*
* Decode COPY_NOTIFY response
*/
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index e8ac3f615f93..08be0a0cce24 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -7702,6 +7702,7 @@ const struct rpc_procinfo nfs4_procedures[] = {
PROC42(CLONE, enc_clone, dec_clone),
PROC42(COPY, enc_copy, dec_copy),
PROC42(OFFLOAD_CANCEL, enc_offload_cancel, dec_offload_cancel),
+ PROC42(OFFLOAD_STATUS, enc_offload_status, dec_offload_status),
PROC42(COPY_NOTIFY, enc_copy_notify, dec_copy_notify),
PROC(LOOKUPP, enc_lookupp, dec_lookupp),
PROC42(LAYOUTERROR, enc_layouterror, dec_layouterror),
diff --git a/include/linux/nfs4.h b/include/linux/nfs4.h
index 8d7430d9f218..5de96243a252 100644
--- a/include/linux/nfs4.h
+++ b/include/linux/nfs4.h
@@ -695,6 +695,7 @@ enum {
NFSPROC4_CLNT_LISTXATTRS,
NFSPROC4_CLNT_REMOVEXATTR,
NFSPROC4_CLNT_READ_PLUS,
+ NFSPROC4_CLNT_OFFLOAD_STATUS,
};
/* nfs41 types */
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 12d8e47bc5a3..ad337c0ebdba 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1520,8 +1520,9 @@ struct nfs42_offload_status_args {
struct nfs42_offload_status_res {
struct nfs4_sequence_res osr_seq_res;
- uint64_t osr_count;
- int osr_status;
+ u64 osr_count;
+ int complete_count;
+ u32 osr_complete[1];
};
struct nfs42_copy_notify_args {
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [RFC PATCH 5/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR
2024-10-08 13:47 ` [RFC PATCH 5/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR cel
@ 2024-10-08 22:00 ` NeilBrown
0 siblings, 0 replies; 19+ messages in thread
From: NeilBrown @ 2024-10-08 22:00 UTC (permalink / raw)
To: cel
Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
Chuck Lever
On Wed, 09 Oct 2024, cel@kernel.org wrote:
>
> +static int decode_offload_status(struct xdr_stream *xdr,
> + struct nfs42_offload_status_res *res)
> +{
> + ssize_t result;
> + int status;
> +
> + status = decode_op_hdr(xdr, OP_OFFLOAD_STATUS);
> + if (status)
> + return status;
> + /* osr_count */
> + if (xdr_stream_decode_u64(xdr, &res->osr_count) < 0)
> + return -EIO;
> + /* osr_complete<1> */
> + result = xdr_stream_decode_uint32_array(xdr, res->osr_complete, 1);
nfsd4_decode_getdeviceinfo() handles a singleton array be having a
simple u32, and passing its address. Here you actually define a
singleton array
I'd probably lean to the former style, but if you really like the
latter, maybe we should change nfsd4_decode_getdeviceinfo() ??
Thanks,
NeilBrown
> struct nfs42_offload_status_res {
> struct nfs4_sequence_res osr_seq_res;
> - uint64_t osr_count;
> - int osr_status;
> + u64 osr_count;
> + int complete_count;
> + u32 osr_complete[1];
> };
>
> struct nfs42_copy_notify_args {
> --
> 2.46.2
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 6/9] NFS: Rename struct nfs4_offloadcancel_data
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
` (4 preceding siblings ...)
2024-10-08 13:47 ` [RFC PATCH 5/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR cel
@ 2024-10-08 13:47 ` cel
2024-10-08 13:47 ` [RFC PATCH 7/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation cel
` (3 subsequent siblings)
9 siblings, 0 replies; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Refactor: This struct can be used unchanged for the new
OFFLOAD_STATUS implementation, so give it a more generic name.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/nfs42proc.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 28704f924612..869605a0a9d5 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -498,15 +498,15 @@ ssize_t nfs42_proc_copy(struct file *src, loff_t pos_src,
return err;
}
-struct nfs42_offloadcancel_data {
+struct nfs42_offload_data {
struct nfs_server *seq_server;
struct nfs42_offload_status_args args;
struct nfs42_offload_status_res res;
};
-static void nfs42_offload_cancel_prepare(struct rpc_task *task, void *calldata)
+static void nfs42_offload_prepare(struct rpc_task *task, void *calldata)
{
- struct nfs42_offloadcancel_data *data = calldata;
+ struct nfs42_offload_data *data = calldata;
nfs4_setup_sequence(data->seq_server->nfs_client,
&data->args.osa_seq_args,
@@ -515,7 +515,7 @@ static void nfs42_offload_cancel_prepare(struct rpc_task *task, void *calldata)
static void nfs42_offload_cancel_done(struct rpc_task *task, void *calldata)
{
- struct nfs42_offloadcancel_data *data = calldata;
+ struct nfs42_offload_data *data = calldata;
trace_nfs4_offload_cancel(&data->args, task->tk_status);
nfs41_sequence_done(task, &data->res.osr_seq_res);
@@ -525,22 +525,22 @@ static void nfs42_offload_cancel_done(struct rpc_task *task, void *calldata)
rpc_restart_call_prepare(task);
}
-static void nfs42_free_offloadcancel_data(void *data)
+static void nfs42_offload_release(void *data)
{
kfree(data);
}
static const struct rpc_call_ops nfs42_offload_cancel_ops = {
- .rpc_call_prepare = nfs42_offload_cancel_prepare,
+ .rpc_call_prepare = nfs42_offload_prepare,
.rpc_call_done = nfs42_offload_cancel_done,
- .rpc_release = nfs42_free_offloadcancel_data,
+ .rpc_release = nfs42_offload_release,
};
static int nfs42_do_offload_cancel_async(struct file *dst,
nfs4_stateid *stateid)
{
struct nfs_server *dst_server = NFS_SERVER(file_inode(dst));
- struct nfs42_offloadcancel_data *data = NULL;
+ struct nfs42_offload_data *data = NULL;
struct nfs_open_context *ctx = nfs_file_open_context(dst);
struct rpc_task *task;
struct rpc_message msg = {
@@ -559,7 +559,7 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
if (!(dst_server->caps & NFS_CAP_OFFLOAD_CANCEL))
return -EOPNOTSUPP;
- data = kzalloc(sizeof(struct nfs42_offloadcancel_data), GFP_KERNEL);
+ data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
if (data == NULL)
return -ENOMEM;
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* [RFC PATCH 7/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
` (5 preceding siblings ...)
2024-10-08 13:47 ` [RFC PATCH 6/9] NFS: Rename struct nfs4_offloadcancel_data cel
@ 2024-10-08 13:47 ` cel
2024-10-08 22:09 ` NeilBrown
2024-10-08 13:47 ` [RFC PATCH 8/9] NFS: Use " cel
` (2 subsequent siblings)
9 siblings, 1 reply; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Enable the Linux NFS client to observe the progress of an offloaded
asynchronous COPY operation. This new operation will be put to use
in a subsequent patch.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/nfs42proc.c | 122 ++++++++++++++++++++++++++++++++++++++
fs/nfs/nfs4proc.c | 3 +-
include/linux/nfs_fs_sb.h | 1 +
3 files changed, 125 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 869605a0a9d5..175330843558 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -21,6 +21,8 @@
#define NFSDBG_FACILITY NFSDBG_PROC
static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);
+static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
+ u64 *copied);
static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
{
@@ -582,6 +584,126 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
return status;
}
+static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
+{
+ struct nfs42_offload_data *data = calldata;
+
+ if (!nfs4_sequence_done(task, &data->res.osr_seq_res))
+ return;
+
+ switch (task->tk_status) {
+ case 0:
+ return;
+ case -NFS4ERR_DELAY:
+ if (nfs4_async_handle_error(task, data->seq_server,
+ NULL, NULL) == -EAGAIN)
+ rpc_restart_call_prepare(task);
+ else
+ task->tk_status = -EIO;
+ break;
+ case -NFS4ERR_GRACE:
+ case -NFS4ERR_ADMIN_REVOKED:
+ case -NFS4ERR_BAD_STATEID:
+ case -NFS4ERR_OLD_STATEID:
+ /*
+ * Server does not recognize the COPY stateid. CB_OFFLOAD
+ * could have purged it, or server might have rebooted.
+ * Since COPY stateids don't have an associated inode,
+ * avoid triggering state recovery.
+ */
+ task->tk_status = -EBADF;
+ break;
+ case -NFS4ERR_NOTSUPP:
+ case -ENOTSUPP:
+ case -EOPNOTSUPP:
+ data->seq_server->caps &= ~NFS_CAP_OFFLOAD_STATUS;
+ task->tk_status = -EOPNOTSUPP;
+ break;
+ default:
+ task->tk_status = -EIO;
+ }
+}
+
+static const struct rpc_call_ops nfs42_offload_status_ops = {
+ .rpc_call_prepare = nfs42_offload_prepare,
+ .rpc_call_done = nfs42_offload_status_done,
+ .rpc_release = nfs42_offload_release
+};
+
+/**
+ * nfs42_proc_offload_status - Poll completion status of an async copy operation
+ * @file: handle of file being copied
+ * @stateid: copy stateid (from async COPY result)
+ * @copied: OUT: number of bytes copied so far
+ *
+ * Return values:
+ * %0: Server returned an NFS4_OK completion status
+ * %-EINPROGRESS: Server returned no completion status
+ * %-EREMOTEIO: Server returned an error completion status
+ * %-EBADF: Server did not recognize the copy stateid
+ * %-EOPNOTSUPP: Server does not support OFFLOAD_STATUS
+ *
+ * Other negative errnos indicate the client could not complete the
+ * request.
+ */
+static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
+ u64 *copied)
+{
+ struct nfs_open_context *ctx = nfs_file_open_context(file);
+ struct nfs_server *server = NFS_SERVER(file_inode(file));
+ struct nfs42_offload_data *data = NULL;
+ struct rpc_message msg = {
+ .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_STATUS],
+ .rpc_cred = ctx->cred,
+ };
+ struct rpc_task_setup task_setup_data = {
+ .rpc_client = server->client,
+ .rpc_message = &msg,
+ .callback_ops = &nfs42_offload_status_ops,
+ .workqueue = nfsiod_workqueue,
+ .flags = RPC_TASK_ASYNC | RPC_TASK_SOFTCONN,
+ };
+ struct rpc_task *task;
+ int status;
+
+ if (!(server->caps & NFS_CAP_OFFLOAD_STATUS))
+ return -EOPNOTSUPP;
+
+ data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
+ if (data == NULL)
+ return -ENOMEM;
+
+ data->seq_server = server;
+ data->args.osa_src_fh = NFS_FH(file_inode(file));
+ memcpy(&data->args.osa_stateid, stateid,
+ sizeof(data->args.osa_stateid));
+ msg.rpc_argp = &data->args;
+ msg.rpc_resp = &data->res;
+ task_setup_data.callback_data = data;
+ nfs4_init_sequence(&data->args.osa_seq_args, &data->res.osr_seq_res,
+ 1, 0);
+ task = rpc_run_task(&task_setup_data);
+ if (IS_ERR(task)) {
+ nfs42_offload_release(data);
+ return PTR_ERR(task);
+ }
+ status = rpc_wait_for_completion_task(task);
+ if (status)
+ goto out;
+
+ *copied = data->res.osr_count;
+ if (task->tk_status)
+ status = task->tk_status;
+ else if (!data->res.complete_count)
+ status = -EINPROGRESS;
+ else if (data->res.osr_complete[0] != NFS_OK)
+ status = -EREMOTEIO;
+
+out:
+ rpc_put_task(task);
+ return status;
+}
+
static int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
struct nfs42_copy_notify_args *args,
struct nfs42_copy_notify_res *res)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index cd2fbde2e6d7..324e38b70b9f 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -10763,7 +10763,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
| NFS_CAP_CLONE
| NFS_CAP_LAYOUTERROR
| NFS_CAP_READ_PLUS
- | NFS_CAP_MOVEABLE,
+ | NFS_CAP_MOVEABLE
+ | NFS_CAP_OFFLOAD_STATUS,
.init_client = nfs41_init_client,
.shutdown_client = nfs41_shutdown_client,
.match_stateid = nfs41_match_stateid,
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 853df3fcd4c2..05b8deadd3b1 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -289,6 +289,7 @@ struct nfs_server {
#define NFS_CAP_CASE_INSENSITIVE (1U << 6)
#define NFS_CAP_CASE_PRESERVING (1U << 7)
#define NFS_CAP_REBOOT_LAYOUTRETURN (1U << 8)
+#define NFS_CAP_OFFLOAD_STATUS (1U << 9)
#define NFS_CAP_OPEN_XOR (1U << 12)
#define NFS_CAP_DELEGTIME (1U << 13)
#define NFS_CAP_POSIX_LOCK (1U << 14)
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [RFC PATCH 7/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
2024-10-08 13:47 ` [RFC PATCH 7/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation cel
@ 2024-10-08 22:09 ` NeilBrown
2024-10-09 14:47 ` Chuck Lever
0 siblings, 1 reply; 19+ messages in thread
From: NeilBrown @ 2024-10-08 22:09 UTC (permalink / raw)
To: cel
Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
Chuck Lever
On Wed, 09 Oct 2024, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Enable the Linux NFS client to observe the progress of an offloaded
> asynchronous COPY operation. This new operation will be put to use
> in a subsequent patch.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfs/nfs42proc.c | 122 ++++++++++++++++++++++++++++++++++++++
> fs/nfs/nfs4proc.c | 3 +-
> include/linux/nfs_fs_sb.h | 1 +
> 3 files changed, 125 insertions(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
> index 869605a0a9d5..175330843558 100644
> --- a/fs/nfs/nfs42proc.c
> +++ b/fs/nfs/nfs42proc.c
> @@ -21,6 +21,8 @@
>
> #define NFSDBG_FACILITY NFSDBG_PROC
> static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);
> +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
> + u64 *copied);
>
> static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
> {
> @@ -582,6 +584,126 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
> return status;
> }
>
> +static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
> +{
> + struct nfs42_offload_data *data = calldata;
> +
> + if (!nfs4_sequence_done(task, &data->res.osr_seq_res))
> + return;
> +
> + switch (task->tk_status) {
> + case 0:
> + return;
> + case -NFS4ERR_DELAY:
> + if (nfs4_async_handle_error(task, data->seq_server,
> + NULL, NULL) == -EAGAIN)
> + rpc_restart_call_prepare(task);
> + else
> + task->tk_status = -EIO;
> + break;
> + case -NFS4ERR_GRACE:
> + case -NFS4ERR_ADMIN_REVOKED:
> + case -NFS4ERR_BAD_STATEID:
> + case -NFS4ERR_OLD_STATEID:
> + /*
> + * Server does not recognize the COPY stateid. CB_OFFLOAD
> + * could have purged it, or server might have rebooted.
> + * Since COPY stateids don't have an associated inode,
> + * avoid triggering state recovery.
> + */
> + task->tk_status = -EBADF;
> + break;
> + case -NFS4ERR_NOTSUPP:
> + case -ENOTSUPP:
> + case -EOPNOTSUPP:
> + data->seq_server->caps &= ~NFS_CAP_OFFLOAD_STATUS;
> + task->tk_status = -EOPNOTSUPP;
> + break;
> + default:
> + task->tk_status = -EIO;
> + }
> +}
> +
> +static const struct rpc_call_ops nfs42_offload_status_ops = {
> + .rpc_call_prepare = nfs42_offload_prepare,
> + .rpc_call_done = nfs42_offload_status_done,
> + .rpc_release = nfs42_offload_release
> +};
> +
> +/**
> + * nfs42_proc_offload_status - Poll completion status of an async copy operation
> + * @file: handle of file being copied
> + * @stateid: copy stateid (from async COPY result)
> + * @copied: OUT: number of bytes copied so far
> + *
> + * Return values:
> + * %0: Server returned an NFS4_OK completion status
> + * %-EINPROGRESS: Server returned no completion status
> + * %-EREMOTEIO: Server returned an error completion status
> + * %-EBADF: Server did not recognize the copy stateid
> + * %-EOPNOTSUPP: Server does not support OFFLOAD_STATUS
* %-ERESTARTSYS: a signal was received.
I'm wondering why this request is RPC_TASK_ASYNC rather than
rpc_call_sync().
NeilBrown
> + *
> + * Other negative errnos indicate the client could not complete the
> + * request.
> + */
> +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
> + u64 *copied)
> +{
> + struct nfs_open_context *ctx = nfs_file_open_context(file);
> + struct nfs_server *server = NFS_SERVER(file_inode(file));
> + struct nfs42_offload_data *data = NULL;
> + struct rpc_message msg = {
> + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_STATUS],
> + .rpc_cred = ctx->cred,
> + };
> + struct rpc_task_setup task_setup_data = {
> + .rpc_client = server->client,
> + .rpc_message = &msg,
> + .callback_ops = &nfs42_offload_status_ops,
> + .workqueue = nfsiod_workqueue,
> + .flags = RPC_TASK_ASYNC | RPC_TASK_SOFTCONN,
> + };
> + struct rpc_task *task;
> + int status;
> +
> + if (!(server->caps & NFS_CAP_OFFLOAD_STATUS))
> + return -EOPNOTSUPP;
> +
> + data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
> + if (data == NULL)
> + return -ENOMEM;
> +
> + data->seq_server = server;
> + data->args.osa_src_fh = NFS_FH(file_inode(file));
> + memcpy(&data->args.osa_stateid, stateid,
> + sizeof(data->args.osa_stateid));
> + msg.rpc_argp = &data->args;
> + msg.rpc_resp = &data->res;
> + task_setup_data.callback_data = data;
> + nfs4_init_sequence(&data->args.osa_seq_args, &data->res.osr_seq_res,
> + 1, 0);
> + task = rpc_run_task(&task_setup_data);
> + if (IS_ERR(task)) {
> + nfs42_offload_release(data);
> + return PTR_ERR(task);
> + }
> + status = rpc_wait_for_completion_task(task);
> + if (status)
> + goto out;
> +
> + *copied = data->res.osr_count;
> + if (task->tk_status)
> + status = task->tk_status;
> + else if (!data->res.complete_count)
> + status = -EINPROGRESS;
> + else if (data->res.osr_complete[0] != NFS_OK)
> + status = -EREMOTEIO;
> +
> +out:
> + rpc_put_task(task);
> + return status;
> +}
> +
> static int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
> struct nfs42_copy_notify_args *args,
> struct nfs42_copy_notify_res *res)
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index cd2fbde2e6d7..324e38b70b9f 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -10763,7 +10763,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
> | NFS_CAP_CLONE
> | NFS_CAP_LAYOUTERROR
> | NFS_CAP_READ_PLUS
> - | NFS_CAP_MOVEABLE,
> + | NFS_CAP_MOVEABLE
> + | NFS_CAP_OFFLOAD_STATUS,
> .init_client = nfs41_init_client,
> .shutdown_client = nfs41_shutdown_client,
> .match_stateid = nfs41_match_stateid,
> diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
> index 853df3fcd4c2..05b8deadd3b1 100644
> --- a/include/linux/nfs_fs_sb.h
> +++ b/include/linux/nfs_fs_sb.h
> @@ -289,6 +289,7 @@ struct nfs_server {
> #define NFS_CAP_CASE_INSENSITIVE (1U << 6)
> #define NFS_CAP_CASE_PRESERVING (1U << 7)
> #define NFS_CAP_REBOOT_LAYOUTRETURN (1U << 8)
> +#define NFS_CAP_OFFLOAD_STATUS (1U << 9)
> #define NFS_CAP_OPEN_XOR (1U << 12)
> #define NFS_CAP_DELEGTIME (1U << 13)
> #define NFS_CAP_POSIX_LOCK (1U << 14)
> --
> 2.46.2
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [RFC PATCH 7/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
2024-10-08 22:09 ` NeilBrown
@ 2024-10-09 14:47 ` Chuck Lever
0 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2024-10-09 14:47 UTC (permalink / raw)
To: NeilBrown
Cc: cel@kernel.org, Jeff Layton, Olga Kornievskaia, Dai Ngo,
Tom Talpey, linux-nfs@vger.kernel.org
On Tue, Oct 08, 2024 at 06:09:13PM -0400, NeilBrown wrote:
> On Wed, 09 Oct 2024, cel@kernel.org wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > Enable the Linux NFS client to observe the progress of an offloaded
> > asynchronous COPY operation. This new operation will be put to use
> > in a subsequent patch.
> >
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> > fs/nfs/nfs42proc.c | 122 ++++++++++++++++++++++++++++++++++++++
> > fs/nfs/nfs4proc.c | 3 +-
> > include/linux/nfs_fs_sb.h | 1 +
> > 3 files changed, 125 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
> > index 869605a0a9d5..175330843558 100644
> > --- a/fs/nfs/nfs42proc.c
> > +++ b/fs/nfs/nfs42proc.c
> > @@ -21,6 +21,8 @@
> >
> > #define NFSDBG_FACILITY NFSDBG_PROC
> > static int nfs42_do_offload_cancel_async(struct file *dst, nfs4_stateid *std);
> > +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
> > + u64 *copied);
> >
> > static void nfs42_set_netaddr(struct file *filep, struct nfs42_netaddr *naddr)
> > {
> > @@ -582,6 +584,126 @@ static int nfs42_do_offload_cancel_async(struct file *dst,
> > return status;
> > }
> >
> > +static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
> > +{
> > + struct nfs42_offload_data *data = calldata;
> > +
> > + if (!nfs4_sequence_done(task, &data->res.osr_seq_res))
> > + return;
> > +
> > + switch (task->tk_status) {
> > + case 0:
> > + return;
> > + case -NFS4ERR_DELAY:
> > + if (nfs4_async_handle_error(task, data->seq_server,
> > + NULL, NULL) == -EAGAIN)
> > + rpc_restart_call_prepare(task);
> > + else
> > + task->tk_status = -EIO;
> > + break;
> > + case -NFS4ERR_GRACE:
> > + case -NFS4ERR_ADMIN_REVOKED:
> > + case -NFS4ERR_BAD_STATEID:
> > + case -NFS4ERR_OLD_STATEID:
> > + /*
> > + * Server does not recognize the COPY stateid. CB_OFFLOAD
> > + * could have purged it, or server might have rebooted.
> > + * Since COPY stateids don't have an associated inode,
> > + * avoid triggering state recovery.
> > + */
> > + task->tk_status = -EBADF;
> > + break;
> > + case -NFS4ERR_NOTSUPP:
> > + case -ENOTSUPP:
> > + case -EOPNOTSUPP:
> > + data->seq_server->caps &= ~NFS_CAP_OFFLOAD_STATUS;
> > + task->tk_status = -EOPNOTSUPP;
> > + break;
> > + default:
> > + task->tk_status = -EIO;
> > + }
> > +}
> > +
> > +static const struct rpc_call_ops nfs42_offload_status_ops = {
> > + .rpc_call_prepare = nfs42_offload_prepare,
> > + .rpc_call_done = nfs42_offload_status_done,
> > + .rpc_release = nfs42_offload_release
> > +};
> > +
> > +/**
> > + * nfs42_proc_offload_status - Poll completion status of an async copy operation
> > + * @file: handle of file being copied
> > + * @stateid: copy stateid (from async COPY result)
> > + * @copied: OUT: number of bytes copied so far
> > + *
> > + * Return values:
> > + * %0: Server returned an NFS4_OK completion status
> > + * %-EINPROGRESS: Server returned no completion status
> > + * %-EREMOTEIO: Server returned an error completion status
> > + * %-EBADF: Server did not recognize the copy stateid
> > + * %-EOPNOTSUPP: Server does not support OFFLOAD_STATUS
>
> * %-ERESTARTSYS: a signal was received.
>
> I'm wondering why this request is RPC_TASK_ASYNC rather than
> rpc_call_sync().
I had a prototype that used rpc_call_sync(). It was long enough ago
that I don't recall exactly why I needed to switch to ASYNC instead.
I suspect it was because it needs to accomplish certain things via
callback_ops, and that's not possible with rpc_call_sync().
> NeilBrown
>
>
> > + *
> > + * Other negative errnos indicate the client could not complete the
> > + * request.
> > + */
> > +static int nfs42_proc_offload_status(struct file *file, nfs4_stateid *stateid,
> > + u64 *copied)
> > +{
> > + struct nfs_open_context *ctx = nfs_file_open_context(file);
> > + struct nfs_server *server = NFS_SERVER(file_inode(file));
> > + struct nfs42_offload_data *data = NULL;
> > + struct rpc_message msg = {
> > + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_OFFLOAD_STATUS],
> > + .rpc_cred = ctx->cred,
> > + };
> > + struct rpc_task_setup task_setup_data = {
> > + .rpc_client = server->client,
> > + .rpc_message = &msg,
> > + .callback_ops = &nfs42_offload_status_ops,
> > + .workqueue = nfsiod_workqueue,
> > + .flags = RPC_TASK_ASYNC | RPC_TASK_SOFTCONN,
> > + };
> > + struct rpc_task *task;
> > + int status;
> > +
> > + if (!(server->caps & NFS_CAP_OFFLOAD_STATUS))
> > + return -EOPNOTSUPP;
> > +
> > + data = kzalloc(sizeof(struct nfs42_offload_data), GFP_KERNEL);
> > + if (data == NULL)
> > + return -ENOMEM;
> > +
> > + data->seq_server = server;
> > + data->args.osa_src_fh = NFS_FH(file_inode(file));
> > + memcpy(&data->args.osa_stateid, stateid,
> > + sizeof(data->args.osa_stateid));
> > + msg.rpc_argp = &data->args;
> > + msg.rpc_resp = &data->res;
> > + task_setup_data.callback_data = data;
> > + nfs4_init_sequence(&data->args.osa_seq_args, &data->res.osr_seq_res,
> > + 1, 0);
> > + task = rpc_run_task(&task_setup_data);
> > + if (IS_ERR(task)) {
> > + nfs42_offload_release(data);
> > + return PTR_ERR(task);
> > + }
> > + status = rpc_wait_for_completion_task(task);
> > + if (status)
> > + goto out;
> > +
> > + *copied = data->res.osr_count;
> > + if (task->tk_status)
> > + status = task->tk_status;
> > + else if (!data->res.complete_count)
> > + status = -EINPROGRESS;
> > + else if (data->res.osr_complete[0] != NFS_OK)
> > + status = -EREMOTEIO;
> > +
> > +out:
> > + rpc_put_task(task);
> > + return status;
> > +}
> > +
> > static int _nfs42_proc_copy_notify(struct file *src, struct file *dst,
> > struct nfs42_copy_notify_args *args,
> > struct nfs42_copy_notify_res *res)
> > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> > index cd2fbde2e6d7..324e38b70b9f 100644
> > --- a/fs/nfs/nfs4proc.c
> > +++ b/fs/nfs/nfs4proc.c
> > @@ -10763,7 +10763,8 @@ static const struct nfs4_minor_version_ops nfs_v4_2_minor_ops = {
> > | NFS_CAP_CLONE
> > | NFS_CAP_LAYOUTERROR
> > | NFS_CAP_READ_PLUS
> > - | NFS_CAP_MOVEABLE,
> > + | NFS_CAP_MOVEABLE
> > + | NFS_CAP_OFFLOAD_STATUS,
> > .init_client = nfs41_init_client,
> > .shutdown_client = nfs41_shutdown_client,
> > .match_stateid = nfs41_match_stateid,
> > diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
> > index 853df3fcd4c2..05b8deadd3b1 100644
> > --- a/include/linux/nfs_fs_sb.h
> > +++ b/include/linux/nfs_fs_sb.h
> > @@ -289,6 +289,7 @@ struct nfs_server {
> > #define NFS_CAP_CASE_INSENSITIVE (1U << 6)
> > #define NFS_CAP_CASE_PRESERVING (1U << 7)
> > #define NFS_CAP_REBOOT_LAYOUTRETURN (1U << 8)
> > +#define NFS_CAP_OFFLOAD_STATUS (1U << 9)
> > #define NFS_CAP_OPEN_XOR (1U << 12)
> > #define NFS_CAP_DELEGTIME (1U << 13)
> > #define NFS_CAP_POSIX_LOCK (1U << 14)
> > --
> > 2.46.2
> >
> >
> >
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 8/9] NFS: Use NFSv4.2's OFFLOAD_STATUS operation
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
` (6 preceding siblings ...)
2024-10-08 13:47 ` [RFC PATCH 7/9] NFS: Implement NFSv4.2's OFFLOAD_STATUS operation cel
@ 2024-10-08 13:47 ` cel
2024-10-08 22:10 ` NeilBrown
2024-10-08 13:47 ` [RFC PATCH 9/9] NFS: Refactor trace_nfs4_offload_cancel cel
2024-10-08 13:51 ` [RFC PATCH 0/9] async COPY fixes Chuck Lever III
9 siblings, 1 reply; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever, Olga Kornievskaia
From: Chuck Lever <chuck.lever@oracle.com>
We've found that there are cases where a transport disconnection
results in the loss of callback RPCs. NFS servers typically do not
retransmit callback operations after a disconnect.
This can be a problem for the Linux NFS client's current
implementation of asynchronous COPY, which waits indefinitely for a
CB_OFFLOAD callback. If a transport disconnect occurs while an async
COPY is running, there's a good chance the client will never get the
completing CB_OFFLOAD.
Fix this by implementing the OFFLOAD_STATUS operation so that the
Linux NFS client can probe the NFS server if it doesn't see a
CB_OFFLOAD in a reasonable amount of time.
This patch implements a simplistic check. As future work, the client
might also be able to detect whether there is no forward progress on
the request asynchronous COPY operation, and CANCEL it.
Suggested-by: Olga Kornievskaia <kolga@netapp.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/nfs42proc.c | 56 ++++++++++++++++++++++++++++++++++++++++------
1 file changed, 49 insertions(+), 7 deletions(-)
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index 175330843558..fc4f64750dc5 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -175,6 +175,11 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len)
return err;
}
+/* Wait this long before checking progress on a COPY operation */
+enum {
+ NFS42_COPY_TIMEOUT = 3 * HZ,
+};
+
static int handle_async_copy(struct nfs42_copy_res *res,
struct nfs_server *dst_server,
struct nfs_server *src_server,
@@ -184,9 +189,10 @@ static int handle_async_copy(struct nfs42_copy_res *res,
bool *restart)
{
struct nfs4_copy_state *copy, *tmp_copy = NULL, *iter;
- int status = NFS4_OK;
struct nfs_open_context *dst_ctx = nfs_file_open_context(dst);
struct nfs_open_context *src_ctx = nfs_file_open_context(src);
+ int status = NFS4_OK;
+ u64 copied;
copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL);
if (!copy)
@@ -224,7 +230,9 @@ static int handle_async_copy(struct nfs42_copy_res *res,
spin_unlock(&src_server->nfs_client->cl_lock);
}
- status = wait_for_completion_interruptible(©->completion);
+wait:
+ status = wait_for_completion_interruptible_timeout(©->completion,
+ NFS42_COPY_TIMEOUT);
spin_lock(&dst_server->nfs_client->cl_lock);
list_del_init(©->copies);
spin_unlock(&dst_server->nfs_client->cl_lock);
@@ -233,15 +241,21 @@ static int handle_async_copy(struct nfs42_copy_res *res,
list_del_init(©->src_copies);
spin_unlock(&src_server->nfs_client->cl_lock);
}
- if (status == -ERESTARTSYS) {
- goto out_cancel;
- } else if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) {
- status = -EAGAIN;
- *restart = true;
+ switch (status) {
+ case 0:
+ goto timeout;
+ case -ERESTARTSYS:
goto out_cancel;
+ default:
+ if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) {
+ status = -EAGAIN;
+ *restart = true;
+ goto out_cancel;
+ }
}
out:
res->write_res.count = copy->count;
+ /* Copy out the updated write verifier provided by CB_OFFLOAD. */
memcpy(&res->write_res.verifier, ©->verf, sizeof(copy->verf));
status = -copy->error;
@@ -253,6 +267,34 @@ static int handle_async_copy(struct nfs42_copy_res *res,
if (!nfs42_files_from_same_server(src, dst))
nfs42_do_offload_cancel_async(src, src_stateid);
goto out_free;
+timeout:
+ status = nfs42_proc_offload_status(src, ©->stateid, &copied);
+ switch (status) {
+ case 0:
+ case -EREMOTEIO:
+ /* The server recognized the copy stateid, so it hasn't
+ * rebooted. Don't overwrite the verifier returned in the
+ * COPY result. */
+ res->write_res.count = copied;
+ goto out_free;
+ case -EINPROGRESS:
+ goto wait;
+ case -EBADF:
+ /* Server did not recognize the copy stateid. It has
+ * probably restarted and lost the plot. State recovery
+ * might redrive the COPY from the beginning, in this
+ * case? */
+ res->write_res.count = 0;
+ status = -EREMOTEIO;
+ break;
+ case -EOPNOTSUPP:
+ /* RFC 7862 REQUIREs server to support OFFLOAD_STATUS when
+ * it has signed up for an async COPY, so server is not
+ * spec-compliant. */
+ res->write_res.count = 0;
+ status = -EREMOTEIO;
+ }
+ goto out;
}
static int process_copy_commit(struct file *dst, loff_t pos_dst,
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [RFC PATCH 8/9] NFS: Use NFSv4.2's OFFLOAD_STATUS operation
2024-10-08 13:47 ` [RFC PATCH 8/9] NFS: Use " cel
@ 2024-10-08 22:10 ` NeilBrown
2024-10-09 15:45 ` Chuck Lever
0 siblings, 1 reply; 19+ messages in thread
From: NeilBrown @ 2024-10-08 22:10 UTC (permalink / raw)
To: cel
Cc: Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey, linux-nfs,
Chuck Lever, Olga Kornievskaia
On Wed, 09 Oct 2024, cel@kernel.org wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
>
> We've found that there are cases where a transport disconnection
> results in the loss of callback RPCs. NFS servers typically do not
> retransmit callback operations after a disconnect.
>
> This can be a problem for the Linux NFS client's current
> implementation of asynchronous COPY, which waits indefinitely for a
> CB_OFFLOAD callback. If a transport disconnect occurs while an async
> COPY is running, there's a good chance the client will never get the
> completing CB_OFFLOAD.
>
> Fix this by implementing the OFFLOAD_STATUS operation so that the
> Linux NFS client can probe the NFS server if it doesn't see a
> CB_OFFLOAD in a reasonable amount of time.
>
> This patch implements a simplistic check. As future work, the client
> might also be able to detect whether there is no forward progress on
> the request asynchronous COPY operation, and CANCEL it.
>
> Suggested-by: Olga Kornievskaia <kolga@netapp.com>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> fs/nfs/nfs42proc.c | 56 ++++++++++++++++++++++++++++++++++++++++------
> 1 file changed, 49 insertions(+), 7 deletions(-)
>
> diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
> index 175330843558..fc4f64750dc5 100644
> --- a/fs/nfs/nfs42proc.c
> +++ b/fs/nfs/nfs42proc.c
> @@ -175,6 +175,11 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len)
> return err;
> }
>
> +/* Wait this long before checking progress on a COPY operation */
> +enum {
> + NFS42_COPY_TIMEOUT = 3 * HZ,
> +};
> +
> static int handle_async_copy(struct nfs42_copy_res *res,
> struct nfs_server *dst_server,
> struct nfs_server *src_server,
> @@ -184,9 +189,10 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> bool *restart)
> {
> struct nfs4_copy_state *copy, *tmp_copy = NULL, *iter;
> - int status = NFS4_OK;
> struct nfs_open_context *dst_ctx = nfs_file_open_context(dst);
> struct nfs_open_context *src_ctx = nfs_file_open_context(src);
> + int status = NFS4_OK;
> + u64 copied;
>
> copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL);
> if (!copy)
> @@ -224,7 +230,9 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> spin_unlock(&src_server->nfs_client->cl_lock);
> }
>
> - status = wait_for_completion_interruptible(©->completion);
> +wait:
> + status = wait_for_completion_interruptible_timeout(©->completion,
> + NFS42_COPY_TIMEOUT);
> spin_lock(&dst_server->nfs_client->cl_lock);
> list_del_init(©->copies);
> spin_unlock(&dst_server->nfs_client->cl_lock);
> @@ -233,15 +241,21 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> list_del_init(©->src_copies);
> spin_unlock(&src_server->nfs_client->cl_lock);
> }
> - if (status == -ERESTARTSYS) {
> - goto out_cancel;
> - } else if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) {
> - status = -EAGAIN;
> - *restart = true;
> + switch (status) {
> + case 0:
> + goto timeout;
> + case -ERESTARTSYS:
> goto out_cancel;
> + default:
> + if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) {
> + status = -EAGAIN;
> + *restart = true;
> + goto out_cancel;
> + }
> }
> out:
> res->write_res.count = copy->count;
> + /* Copy out the updated write verifier provided by CB_OFFLOAD. */
> memcpy(&res->write_res.verifier, ©->verf, sizeof(copy->verf));
> status = -copy->error;
>
> @@ -253,6 +267,34 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> if (!nfs42_files_from_same_server(src, dst))
> nfs42_do_offload_cancel_async(src, src_stateid);
> goto out_free;
> +timeout:
> + status = nfs42_proc_offload_status(src, ©->stateid, &copied);
> + switch (status) {
> + case 0:
> + case -EREMOTEIO:
> + /* The server recognized the copy stateid, so it hasn't
> + * rebooted. Don't overwrite the verifier returned in the
> + * COPY result. */
> + res->write_res.count = copied;
> + goto out_free;
> + case -EINPROGRESS:
> + goto wait;
> + case -EBADF:
> + /* Server did not recognize the copy stateid. It has
> + * probably restarted and lost the plot. State recovery
> + * might redrive the COPY from the beginning, in this
> + * case? */
> + res->write_res.count = 0;
> + status = -EREMOTEIO;
> + break;
> + case -EOPNOTSUPP:
> + /* RFC 7862 REQUIREs server to support OFFLOAD_STATUS when
> + * it has signed up for an async COPY, so server is not
> + * spec-compliant. */
> + res->write_res.count = 0;
> + status = -EREMOTEIO;
Should -ERESTARTSYS be handled here?
NeilBrown
> + }
> + goto out;
> }
>
> static int process_copy_commit(struct file *dst, loff_t pos_dst,
> --
> 2.46.2
>
>
>
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [RFC PATCH 8/9] NFS: Use NFSv4.2's OFFLOAD_STATUS operation
2024-10-08 22:10 ` NeilBrown
@ 2024-10-09 15:45 ` Chuck Lever
0 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever @ 2024-10-09 15:45 UTC (permalink / raw)
To: NeilBrown
Cc: cel@kernel.org, Jeff Layton, Olga Kornievskaia, Dai Ngo,
Tom Talpey, linux-nfs@vger.kernel.org, Olga Kornievskaia
On Tue, Oct 08, 2024 at 06:10:44PM -0400, NeilBrown wrote:
> On Wed, 09 Oct 2024, cel@kernel.org wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> >
> > We've found that there are cases where a transport disconnection
> > results in the loss of callback RPCs. NFS servers typically do not
> > retransmit callback operations after a disconnect.
> >
> > This can be a problem for the Linux NFS client's current
> > implementation of asynchronous COPY, which waits indefinitely for a
> > CB_OFFLOAD callback. If a transport disconnect occurs while an async
> > COPY is running, there's a good chance the client will never get the
> > completing CB_OFFLOAD.
> >
> > Fix this by implementing the OFFLOAD_STATUS operation so that the
> > Linux NFS client can probe the NFS server if it doesn't see a
> > CB_OFFLOAD in a reasonable amount of time.
> >
> > This patch implements a simplistic check. As future work, the client
> > might also be able to detect whether there is no forward progress on
> > the request asynchronous COPY operation, and CANCEL it.
> >
> > Suggested-by: Olga Kornievskaia <kolga@netapp.com>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218735
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> > fs/nfs/nfs42proc.c | 56 ++++++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 49 insertions(+), 7 deletions(-)
> >
> > diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
> > index 175330843558..fc4f64750dc5 100644
> > --- a/fs/nfs/nfs42proc.c
> > +++ b/fs/nfs/nfs42proc.c
> > @@ -175,6 +175,11 @@ int nfs42_proc_deallocate(struct file *filep, loff_t offset, loff_t len)
> > return err;
> > }
> >
> > +/* Wait this long before checking progress on a COPY operation */
> > +enum {
> > + NFS42_COPY_TIMEOUT = 3 * HZ,
> > +};
> > +
> > static int handle_async_copy(struct nfs42_copy_res *res,
> > struct nfs_server *dst_server,
> > struct nfs_server *src_server,
> > @@ -184,9 +189,10 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> > bool *restart)
> > {
> > struct nfs4_copy_state *copy, *tmp_copy = NULL, *iter;
> > - int status = NFS4_OK;
> > struct nfs_open_context *dst_ctx = nfs_file_open_context(dst);
> > struct nfs_open_context *src_ctx = nfs_file_open_context(src);
> > + int status = NFS4_OK;
> > + u64 copied;
> >
> > copy = kzalloc(sizeof(struct nfs4_copy_state), GFP_KERNEL);
> > if (!copy)
> > @@ -224,7 +230,9 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> > spin_unlock(&src_server->nfs_client->cl_lock);
> > }
> >
> > - status = wait_for_completion_interruptible(©->completion);
> > +wait:
> > + status = wait_for_completion_interruptible_timeout(©->completion,
> > + NFS42_COPY_TIMEOUT);
> > spin_lock(&dst_server->nfs_client->cl_lock);
> > list_del_init(©->copies);
> > spin_unlock(&dst_server->nfs_client->cl_lock);
> > @@ -233,15 +241,21 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> > list_del_init(©->src_copies);
> > spin_unlock(&src_server->nfs_client->cl_lock);
> > }
> > - if (status == -ERESTARTSYS) {
> > - goto out_cancel;
> > - } else if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) {
> > - status = -EAGAIN;
> > - *restart = true;
> > + switch (status) {
> > + case 0:
> > + goto timeout;
> > + case -ERESTARTSYS:
> > goto out_cancel;
> > + default:
> > + if (copy->flags || copy->error == NFS4ERR_PARTNER_NO_AUTH) {
> > + status = -EAGAIN;
> > + *restart = true;
> > + goto out_cancel;
> > + }
> > }
> > out:
> > res->write_res.count = copy->count;
> > + /* Copy out the updated write verifier provided by CB_OFFLOAD. */
> > memcpy(&res->write_res.verifier, ©->verf, sizeof(copy->verf));
> > status = -copy->error;
> >
> > @@ -253,6 +267,34 @@ static int handle_async_copy(struct nfs42_copy_res *res,
> > if (!nfs42_files_from_same_server(src, dst))
> > nfs42_do_offload_cancel_async(src, src_stateid);
> > goto out_free;
> > +timeout:
> > + status = nfs42_proc_offload_status(src, ©->stateid, &copied);
> > + switch (status) {
> > + case 0:
> > + case -EREMOTEIO:
> > + /* The server recognized the copy stateid, so it hasn't
> > + * rebooted. Don't overwrite the verifier returned in the
> > + * COPY result. */
> > + res->write_res.count = copied;
> > + goto out_free;
> > + case -EINPROGRESS:
> > + goto wait;
> > + case -EBADF:
> > + /* Server did not recognize the copy stateid. It has
> > + * probably restarted and lost the plot. State recovery
> > + * might redrive the COPY from the beginning, in this
> > + * case? */
> > + res->write_res.count = 0;
> > + status = -EREMOTEIO;
> > + break;
> > + case -EOPNOTSUPP:
> > + /* RFC 7862 REQUIREs server to support OFFLOAD_STATUS when
> > + * it has signed up for an async COPY, so server is not
> > + * spec-compliant. */
> > + res->write_res.count = 0;
> > + status = -EREMOTEIO;
>
> Should -ERESTARTSYS be handled here?
The "default" is to "goto out;" handle_async_copy() will return
-ERESTARTSYS in that case. What more should be done?
> NeilBrown
>
>
> > + }
> > + goto out;
> > }
> >
> > static int process_copy_commit(struct file *dst, loff_t pos_dst,
> > --
> > 2.46.2
> >
> >
> >
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 19+ messages in thread
* [RFC PATCH 9/9] NFS: Refactor trace_nfs4_offload_cancel
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
` (7 preceding siblings ...)
2024-10-08 13:47 ` [RFC PATCH 8/9] NFS: Use " cel
@ 2024-10-08 13:47 ` cel
2024-10-08 13:51 ` [RFC PATCH 0/9] async COPY fixes Chuck Lever III
9 siblings, 0 replies; 19+ messages in thread
From: cel @ 2024-10-08 13:47 UTC (permalink / raw)
To: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey
Cc: linux-nfs, Chuck Lever
From: Chuck Lever <chuck.lever@oracle.com>
Add a trace_nfs4_offload_status trace point that looks just like
trace_nfs4_offload_cancel. Promote that event to an event class to
avoid duplicating code.
An alternative approach would be to expand trace_nfs4_offload_status
to report more of the actual OFFLOAD_STATUS result.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/nfs/nfs42proc.c | 2 ++
fs/nfs/nfs4trace.h | 11 ++++++++++-
2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/fs/nfs/nfs42proc.c b/fs/nfs/nfs42proc.c
index fc4f64750dc5..969293bf8228 100644
--- a/fs/nfs/nfs42proc.c
+++ b/fs/nfs/nfs42proc.c
@@ -630,6 +630,8 @@ static void nfs42_offload_status_done(struct rpc_task *task, void *calldata)
{
struct nfs42_offload_data *data = calldata;
+ trace_nfs4_offload_status(&data->args, task->tk_status);
+
if (!nfs4_sequence_done(task, &data->res.osr_seq_res))
return;
diff --git a/fs/nfs/nfs4trace.h b/fs/nfs/nfs4trace.h
index 22c973316f0b..bc67fe6801b1 100644
--- a/fs/nfs/nfs4trace.h
+++ b/fs/nfs/nfs4trace.h
@@ -2608,7 +2608,7 @@ TRACE_EVENT(nfs4_copy_notify,
)
);
-TRACE_EVENT(nfs4_offload_cancel,
+DECLARE_EVENT_CLASS(nfs4_offload_class,
TP_PROTO(
const struct nfs42_offload_status_args *args,
int error
@@ -2640,6 +2640,15 @@ TRACE_EVENT(nfs4_offload_cancel,
__entry->stateid_seq, __entry->stateid_hash
)
);
+#define DEFINE_NFS4_OFFLOAD_EVENT(name) \
+ DEFINE_EVENT(nfs4_offload_class, name, \
+ TP_PROTO( \
+ const struct nfs42_offload_status_args *args, \
+ int error \
+ ), \
+ TP_ARGS(args, error))
+DEFINE_NFS4_OFFLOAD_EVENT(nfs4_offload_cancel);
+DEFINE_NFS4_OFFLOAD_EVENT(nfs4_offload_status);
DECLARE_EVENT_CLASS(nfs4_xattr_event,
TP_PROTO(
--
2.46.2
^ permalink raw reply related [flat|nested] 19+ messages in thread* Re: [RFC PATCH 0/9] async COPY fixes
2024-10-08 13:47 [RFC PATCH 0/9] async COPY fixes cel
` (8 preceding siblings ...)
2024-10-08 13:47 ` [RFC PATCH 9/9] NFS: Refactor trace_nfs4_offload_cancel cel
@ 2024-10-08 13:51 ` Chuck Lever III
9 siblings, 0 replies; 19+ messages in thread
From: Chuck Lever III @ 2024-10-08 13:51 UTC (permalink / raw)
To: Chuck Lever
Cc: Neil Brown, Jeff Layton, Olga Kornievskaia, Dai Ngo, Tom Talpey,
Linux NFS Mailing List
> On Oct 8, 2024, at 9:47 AM, cel@kernel.org wrote:
>
> From: Chuck Lever <chuck.lever@oracle.com>
>
> Hi -
>
> There are edge cases on both the Linux NFS server and client that
> can be improved to make notification of COPY completion reliable.
> This series accomplishes two things, primarily:
>
> 1. Add support for OFFLOAD_STATUS to the Linux NFS client
>
> 2. Modify NFSD to keep the copy state ID around so that
> OFFLOAD_STATUS can find the completed COPY
>
> That should be enough to make async COPY reliable so that it can be
> enabled again in NFSD. I'd like to shoot for doing that in v6.13.
>
> Due to other bugs and issues, this has unfortunately been somewhat
> of a background task lately, so any independent testing would be
> greatly appreciated.
A branch containing these patches, plus one to re-enable async
COPY in NFSD, is available here:
https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git/log/?h=fix-async-copy
> When we agree that these are ready, I can split them up so they can
> be merged independently via the client and server trees.
>
> Chuck Lever (9):
> NFS: CB_OFFLOAD can return NFS4ERR_DELAY
> NFSD: Free async copy information in nfsd4_cb_offload_release()
> NFSD: Handle an NFS4ERR_DELAY response to CB_OFFLOAD
> NFS: Fix typo in OFFLOAD_CANCEL comment
> NFS: Implement NFSv4.2's OFFLOAD_STATUS XDR
> NFS: Rename struct nfs4_offloadcancel_data
> NFS: Implement NFSv4.2's OFFLOAD_STATUS operation
> NFS: Use NFSv4.2's OFFLOAD_STATUS operation
> NFS: Refactor trace_nfs4_offload_cancel
>
> fs/nfs/callback_proc.c | 2 +-
> fs/nfs/nfs42proc.c | 198 +++++++++++++++++++++++++++++++++++---
> fs/nfs/nfs42xdr.c | 88 ++++++++++++++++-
> fs/nfs/nfs4proc.c | 3 +-
> fs/nfs/nfs4trace.h | 11 ++-
> fs/nfs/nfs4xdr.c | 1 +
> fs/nfsd/nfs4proc.c | 26 +++--
> fs/nfsd/xdr4.h | 4 +
> include/linux/nfs4.h | 1 +
> include/linux/nfs_fs_sb.h | 1 +
> include/linux/nfs_xdr.h | 5 +-
> 11 files changed, 311 insertions(+), 29 deletions(-)
>
> --
> 2.46.2
>
>
--
Chuck Lever
^ permalink raw reply [flat|nested] 19+ messages in thread