linux-nfs.vger.kernel.org archive mirror
* [PATCH] NFSv4: Infinite loop in lease recovery when rpc.gssd is not running.
From: Steve Dickson @ 2014-02-10 21:06 UTC
  To: Trond Myklebust; +Cc: Linux NFS Mailing list

[ Resent with Trond's correct email address ]

Commit 0ea9de0e introduced a regression in the lease recovery code.

An infinite loop occurs when nfs4_establish_lease() fails
with -EACCES. This causes nfs4_handle_reclaim_lease_error()
to sleep a bit and reset the NFS4CLNT_LEASE_EXPIRED bit.
This in turn causes nfs4_state_manager() to try to
reestablish the lease again, and again, and again...

The problem is that a valid RPCSEC_GSS client is being created when
rpc.gssd is not running. This causes the RPC code to fail
with -EACCES, sending the lease reestablishment off the
deep end.

Moving the gssd_running() check back into nfs4_init_client(),
which stops the RPCSEC_GSS client from being created, stops
the looping.

Signed-off-by: Steve Dickson <steved@redhat.com>
---
 fs/nfs/nfs4client.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 860ad26..a60269f 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -372,7 +372,10 @@ struct nfs_client *nfs4_init_client(struct nfs_client *clp,
 	__set_bit(NFS_CS_DISCRTRY, &clp->cl_flags);
 	__set_bit(NFS_CS_NO_RETRANS_TIMEOUT, &clp->cl_flags);
 
-	error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_GSS_KRB5I);
+	error = -EINVAL;
+	if (gssd_running(clp->cl_net))
+		error = nfs_create_rpc_client(clp, timeparms,
+				RPC_AUTH_GSS_KRB5I);
 	if (error == -EINVAL)
 		error = nfs_create_rpc_client(clp, timeparms, RPC_AUTH_UNIX);
 	if (error < 0)
-- 
1.7.1



* Re: [PATCH] NFSv4: Infinite loop in lease recovery when rpc.gssd is not running.
From: Trond Myklebust @ 2014-02-10 21:10 UTC
  To: Dickson Steve; +Cc: Linux NFS Mailing list


NACK. gssd_running() is not an acceptable solution outside of the RPC layer. 

_________________________________
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@primarydata.com



* [PATCH] SUNRPC: Don't create a gss auth cache unless rpc.gssd is running
From: Trond Myklebust @ 2014-02-10 21:48 UTC
  To: Steve Dickson; +Cc: linux-nfs

An infinite loop occurs when nfs4_establish_lease() fails
with -EACCES. This causes nfs4_handle_reclaim_lease_error()
to sleep a bit and reset the NFS4CLNT_LEASE_EXPIRED bit.
This in turn causes nfs4_state_manager() to try to
reestablish the lease again, and again, and again...

The problem is that a valid RPCSEC_GSS client is being created when
rpc.gssd is not running.

Link: http://lkml.kernel.org/r/1392066375-16502-1-git-send-email-steved@redhat.com
Fixes: 0ea9de0ea6a4 (sunrpc: turn warn_gssd() log message into a dprintk())
Reported-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
---
 net/sunrpc/auth_gss/auth_gss.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index 6c0513a7f992..44a61e8fda6f 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -991,6 +991,8 @@ gss_create_new(struct rpc_auth_create_args *args, struct rpc_clnt *clnt)
 	gss_auth->service = gss_pseudoflavor_to_service(gss_auth->mech, flavor);
 	if (gss_auth->service == 0)
 		goto err_put_mech;
+	if (!gssd_running(gss_auth->net))
+		goto err_put_mech;
 	auth = &gss_auth->rpc_auth;
 	auth->au_cslack = GSS_CRED_SLACK >> 2;
 	auth->au_rslack = GSS_VERF_SLACK >> 2;
-- 
1.8.5.3



* Re: [PATCH] SUNRPC: Don't create a gss auth cache unless rpc.gssd is running
From: Steve Dickson @ 2014-02-10 23:01 UTC
  To: Trond Myklebust; +Cc: linux-nfs



Unfortunately I'm seeing the same loop, but this time it's with _nfs4_proc_exchange_id.

Here is the trace point output:
  192.168.62.8-ma-20371 [000] .... 955443.604229: nfs4_exchange_id: error=-13 (EACCES) dstaddr=192.168.62.8

and here is the rpcdebug output:
[ 2782.341981] NFS call  exchange_id auth=RPCSEC_GSS, 'Linux NFSv4.1 <client>'
[ 2782.360540] NFS reply exchange_id: -13

All three mounts (v4.0, v4.1, v4.2) are hung...

Looking into it...

steved.




* Re: [PATCH] SUNRPC: Don't create a gss auth cache unless rpc.gssd is running
From: Steve Dickson @ 2014-02-11 11:26 UTC
  To: Trond Myklebust; +Cc: linux-nfs



On 02/10/2014 06:01 PM, Steve Dickson wrote:
> Unfortunately I'm seeing the same loop, but this time it's with _nfs4_proc_exchange_id.
> 
> Here is the trace point output:
>   192.168.62.8-ma-20371 [000] .... 955443.604229: nfs4_exchange_id: error=-13 (EACCES) dstaddr=192.168.62.8
> 
> and here is the rpcdebug output:
> [ 2782.341981] NFS call  exchange_id auth=RPCSEC_GSS, 'Linux NFSv4.1 <client>'
> [ 2782.360540] NFS reply exchange_id: -13
> 
> All three mounts (v4.0, v4.1, v4.2) are hung...
> 
> Looking into it...
Pilot error on my part... I only reloaded sunrpc.ko, not auth_rpcgss.ko.
What a good night's sleep can do for you... :-)

Tested-by: Steve Dickson <steved@redhat.com>

Question: should we be checking that gssd is still running when
the gss_auth pointer is found in the hash table? I'm thinking
of the case where gssd was started and then stopped.

steved.
