linux-kernel.vger.kernel.org archive mirror
* [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
@ 2011-11-16 18:14 John Hughes
  2011-11-16 19:47 ` Jeff Layton
  0 siblings, 1 reply; 16+ messages in thread
From: John Hughes @ 2011-11-16 18:14 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 778 bytes --]

With recent kernels if the Kerberos ticket for a nfs4 mount expires any 
user process trying to access the mount hangs until a new ticket is 
obtained.  Simultaneously a (luckily rate-limited, but still seemingly 
endless) stream of "Error: state manager encountered RPCSEC_GSS session 
expired against NFSv4 server" messages is written to the kernel log.

In a common setup with user home directories nfs4 mounted on 
workstations one of the processes that is likely to hang is the 
screen-unlock function which would normally (via pam_krb5 or similar) 
get the new ticket.

In older kernels the EKEYEXPIRED error would be passed to userland, 
which would usually just give up.

This patch restores the old behavior, which makes nfs4 mounted home 
directories usable for me.




[-- Attachment #2: nfs4-ekeyexpired.patch --]
[-- Type: text/x-patch, Size: 1277 bytes --]

Signed-off-by: John Hughes <john@calva.com>

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 4700fae..dc28a78 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -302,7 +302,6 @@ static int nfs4_handle_exception(struct nfs_server *server, int errorcode, struc
 			}
 		case -NFS4ERR_GRACE:
 		case -NFS4ERR_DELAY:
-		case -EKEYEXPIRED:
 			ret = nfs4_delay(server->client, &exception->timeout);
 			if (ret != 0)
 				break;
@@ -3732,7 +3731,6 @@ nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server,
 		case -NFS4ERR_DELAY:
 			nfs_inc_server_stats(server, NFSIOS_DELAY);
 		case -NFS4ERR_GRACE:
-		case -EKEYEXPIRED:
 			rpc_delay(task, NFS4_POLL_RETRY_MAX);
 			task->tk_status = 0;
 			return -EAGAIN;
diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
index 39914be..2bee41e 100644
--- a/fs/nfs/nfs4state.c
+++ b/fs/nfs/nfs4state.c
@@ -1377,8 +1377,9 @@ static int nfs4_recovery_handle_error(struct nfs_client *clp, int error)
 			/* Zero session reset errors */
 			return 0;
 		case -EKEYEXPIRED:
-			/* Nothing we can do */
-			nfs4_warn_keyexpired(clp->cl_hostname);
+			/* Nothing we can do, so do nothing.  Don't even
+			   print a warning message, this is not a kernel
+			   problem */
 			return 0;
 	}
 	return error;


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-16 18:14 [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires John Hughes
@ 2011-11-16 19:47 ` Jeff Layton
  2011-11-16 23:44   ` Jim Rees
  2011-11-17  9:37   ` John Hughes
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff Layton @ 2011-11-16 19:47 UTC (permalink / raw)
  To: John Hughes; +Cc: Trond Myklebust, linux-nfs, linux-kernel

On Wed, 16 Nov 2011 19:14:35 +0100
John Hughes <john@calvaedi.com> wrote:

> With recent kernels if the Kerberos ticket for a nfs4 mount expires any 
> user process trying to access the mount hangs until a new ticket is 
> obtained.  Simultaneously a (luckily rate-limited, but still seemingly 
> endless) stream of "Error: state manager encountered RPCSEC_GSS session 
> expired against NFSv4 server" messages is written to the kernel log.
> 
> In a common setup with user home directories nfs4 mounted on 
> workstations one of the processes that is likely to hang is the 
> screen-unlock function which would normally (via pam_krb5 or similar) 
> get the new ticket.
> 
> In older kernels the EKEYEXPIRED error would be passed to userland, 
> which would usually just give up.
> 
> This patch restores the old behavior, which makes nfs4 mounted home 
> directories usable for me.
> 

Uhhh, no...EKEYEXPIRED was never passed to userland. The patchset that
added EKEYEXPIRED returns in this codepath also added the code to make
it hang. 

This is not a bug, or at least it's intentional behavior. When a krb5
ticket expires, we *want* the process to hang. Otherwise, people with
long running jobs will often find that their jobs error out
inexplicably when their ticket expires.

The patches that introduced this behavior went into 2.6.34. See the
commits around 2c64348 (and some preceding ones in the rpc layer).

If you want to fix this use case, you'll need to come up with a scheme
that doesn't regress this behavior. I think that you'll really need to
ensure that whatever process you expect to re-fetch your TGT is not
dependent on accessing kerberized nfs mounts. That really seems like an
untenable chicken and egg situation.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-16 19:47 ` Jeff Layton
@ 2011-11-16 23:44   ` Jim Rees
  2011-11-17  1:31     ` Jeff Layton
  2011-11-17  9:37   ` John Hughes
  1 sibling, 1 reply; 16+ messages in thread
From: Jim Rees @ 2011-11-16 23:44 UTC (permalink / raw)
  To: Jeff Layton; +Cc: John Hughes, Trond Myklebust, linux-nfs, linux-kernel

Jeff Layton wrote:

  Uhhh, no...EKEYEXPIRED was never passed to userland. The patchset that
  added EKEYEXPIRED returns in this codepath also added the code to make
  it hang. 
  
  This is not a bug, or at least it's intentional behavior. When a krb5
  ticket expires, we *want* the process to hang. Otherwise, people with
  long running jobs will often find that their jobs error out
  inexplicably when their ticket expires.

Who decided that?  This seems completely wrong to me.  If my credentials
expire, I want to get permission denied, not a client hang.  In 20 years of
using authenticated file systems I never once wished my process had hung
when my ticket expired.

Why should this be any different from any other failure condition?  If you
try to open a file that doesn't exist, do you want your process to hang
instead of getting ENOENT, just in case the file magically appears at some
point in the future?

This seems a recipe for disaster.  Suppose I have a cron job that fires once
a minute, and all those jobs hang waiting for a ticket.  I come to work in
the morning and discover I've got 10,000 hung processes.  Or not, because my
computer has crashed from resource exhaustion.


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-16 23:44   ` Jim Rees
@ 2011-11-17  1:31     ` Jeff Layton
  2011-11-17  1:38       ` Jeff Layton
  2011-11-17  1:46       ` Matt W. Benjamin
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff Layton @ 2011-11-17  1:31 UTC (permalink / raw)
  To: Jim Rees; +Cc: John Hughes, Trond Myklebust, linux-nfs, linux-kernel

On Wed, 16 Nov 2011 18:44:34 -0500
Jim Rees <rees@umich.edu> wrote:

> Jeff Layton wrote:
> 
>   Uhhh, no...EKEYEXPIRED was never passed to userland. The patchset that
>   added EKEYEXPIRED returns in this codepath also added the code to make
>   it hang. 
>   
>   This is not a bug, or at least it's intentional behavior. When a krb5
>   ticket expires, we *want* the process to hang. Otherwise, people with
>   long running jobs will often find that their jobs error out
>   inexplicably when their ticket expires.
> 
> Who decided that?  This seems completely wrong to me.  If my credentials
> expire, I want to get permission denied, not a client hang.  In 20 years of
> using authenticated file systems I never once wished my process had hung
> when my ticket expired.
> 

I proposed it, we discussed it on the list, and Trond and Steve
committed the patches necessary to make it happen. This was back in
late 2009/early 2010 though, so my memory is a bit fuzzy...

> Why should this be any different from any other failure condition?  If you
> try to open a file that doesn't exist, do you want your process to hang
> instead of getting ENOENT, just in case the file magically appears at some
> point in the future?
>

That's different. Not renewing your credentials is often a temporary
situation. Kerberos is different than other authentication methods in
that you get a ticket only for a period of time, so expired credentials
are not a situation that's common with other authentication methods.

> This seems a recipe for disaster.  Suppose I have a cron job that fires once
> a minute, and all those jobs hang waiting for a ticket.  I come to work in
> the morning and discover I've got 10,000 hung processes.  Or not, because my
> computer has crashed from resource exhaustion.

The previous situation was also a recipe for disaster, and was often
cited as a primary reason why people didn't want to deploy kerberized
NFS. Having everything fall down and go boom when your ticket expires
is not desirable either.

I suppose we'll have to agree to disagree on this point. That said, I'm
open to sane suggestions that don't regress the behavior for
those users who need to be able to cope with expired tickets.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-17  1:31     ` Jeff Layton
@ 2011-11-17  1:38       ` Jeff Layton
  2011-11-17 11:05         ` John Hughes
  2011-11-17  1:46       ` Matt W. Benjamin
  1 sibling, 1 reply; 16+ messages in thread
From: Jeff Layton @ 2011-11-17  1:38 UTC (permalink / raw)
  Cc: Jim Rees, John Hughes, Trond Myklebust, linux-nfs, linux-kernel

On Wed, 16 Nov 2011 20:31:19 -0500
Jeff Layton <jlayton@redhat.com> wrote:

> On Wed, 16 Nov 2011 18:44:34 -0500
> Jim Rees <rees@umich.edu> wrote:
> 
> > Jeff Layton wrote:
> > 
> >   Uhhh, no...EKEYEXPIRED was never passed to userland. The patchset that
> >   added EKEYEXPIRED returns in this codepath also added the code to make
> >   it hang. 
> >   
> >   This is not a bug, or at least it's intentional behavior. When a krb5
> >   ticket expires, we *want* the process to hang. Otherwise, people with
> >   long running jobs will often find that their jobs error out
> >   inexplicably when their ticket expires.
> > 
> > Who decided that?  This seems completely wrong to me.  If my credentials
> > expire, I want to get permission denied, not a client hang.  In 20 years of
> > using authenticated file systems I never once wished my process had hung
> > when my ticket expired.
> > 
> 
> I proposed it, we discussed it on the list, and Trond and Steve
> committed the patches necessary to make it happen. This was back in
> late 2009/early 2010 though, so my memory is a bit fuzzy...
> 
> > Why should this be any different from any other failure condition?  If you
> > try to open a file that doesn't exist, do you want your process to hang
> > instead of getting ENOENT, just in case the file magically appears at some
> > point in the future?
> >
> 
> That's different. Not renewing your credentials is often a temporary
> situation. Kerberos is different than other authentication methods in
> that you get a ticket only for a period of time, so expired credentials
> are not a situation that's common with other authentication methods.
> 
> > This seems a recipe for disaster.  Suppose I have a cron job that fires once
> > a minute, and all those jobs hang waiting for a ticket.  I come to work in
> > the morning and discover I've got 10,000 hung processes.  Or not, because my
> > computer has crashed from resource exhaustion.
> 
> The previous situation was also a recipe for disaster, and was often
> cited as a primary reason why people didn't want to deploy kerberized
> NFS. Having everything fall down and go boom when your ticket expires
> is not desirable either.
> 
> I suppose we'll have to agree to disagree on this point. That said, I'm
> open to sane suggestions that don't regress the behavior for
> those users who need to be able to cope with expired tickets.
> 

Note too that the gssd code distinguishes between an expired TGT and a
non-existent credcache. The latter will give you the error you desire
here. So one possibility is just to remove the credcache from /tmp in
this situation.

Another possibility might be a new option to rpc.gssd that allows the
user to select the error that it passes back to the kernel on an
expired ticket.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-17  1:31     ` Jeff Layton
  2011-11-17  1:38       ` Jeff Layton
@ 2011-11-17  1:46       ` Matt W. Benjamin
  1 sibling, 0 replies; 16+ messages in thread
From: Matt W. Benjamin @ 2011-11-17  1:46 UTC (permalink / raw)
  To: Jeff Layton
  Cc: John Hughes, Trond Myklebust, linux-nfs, linux-kernel, Jim Rees

Hi,

While I'm not expert in this area, my impression had been that the established
practice was that used with AFS, i.e., run jobs under a process capable of renewing
kerberos tickets, e.g., kstart (http://www.eyrie.org/~eagle/software/kstart/).

Matt

----- "Jeff Layton" <jlayton@redhat.com> wrote:

> The previous situation was also a recipe for disaster, and was often
> cited as a primary reason why people didn't want to deploy kerberized
> NFS. Having everything fall down and go boom when your ticket expires
> is not desirable either.
> 
> I suppose we'll have to agree to disagree on this point. That said, I'm
> open to sane suggestions that don't regress the behavior for
> those users who need to be able to cope with expired tickets.
> 
> -- 
> Jeff Layton <jlayton@redhat.com>

-- 

Matt Benjamin

The Linux Box
206 South Fifth Ave. Suite 150
Ann Arbor, MI  48104

http://linuxbox.com

tel. 734-761-4689
fax. 734-769-8938
cel. 734-216-5309


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-16 19:47 ` Jeff Layton
  2011-11-16 23:44   ` Jim Rees
@ 2011-11-17  9:37   ` John Hughes
  1 sibling, 0 replies; 16+ messages in thread
From: John Hughes @ 2011-11-17  9:37 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Trond Myklebust, linux-nfs, linux-kernel

On 16/11/11 20:47, Jeff Layton wrote:
> On Wed, 16 Nov 2011 19:14:35 +0100
> John Hughes<john@calvaedi.com>  wrote:
>
>> With recent kernels if the Kerberos ticket for a nfs4 mount expires any
>> user process trying to access the mount hangs until a new ticket is
>> obtained.  Simultaneously a (luckily rate-limited, but still seemingly
>> endless) stream of "Error: state manager encountered RPCSEC_GSS session
>> expired against NFSv4 server" messages is written to the kernel log.
[...]
>> This patch restores the old behavior, which makes nfs4 mounted home
>> directories usable for me.
>>
> Uhhh, no...EKEYEXPIRED was never passed to userland. The patchset that
> added EKEYEXPIRED returns in this codepath also added the code to make
> it hang.

You are, of course, right.  userland used to get EPERM.

> This is not a bug, or at least it's intentional behavior. When a krb5
> ticket expires, we *want* the process to hang. Otherwise, people with
> long running jobs will often find that their jobs error out
> inexplicably when their ticket expires.
I thought that was what kstart/krenew were for.
> The patches that introduced this behavior went into 2.6.34. See the
> commits around 2c64348 (and some preceding ones in the rpc layer).

Ah, I'm a Debian user - 2.6.32 for the moment, soon to be 3.?

> If you want to fix this use case, you'll need to come up with a scheme
> that doesn't regress this behavior. I think that you'll really need to
> ensure that whatever process you expect to re-fetch your TGT is not
> dependent on accessing kerberized nfs mounts. That really seems like an
> untenable chicken and egg situation.

Ow.  "Fixing" (at least) Gnome-3 and Gnome-2 screen-lock/screensavers.

How about a mount option to choose between the two behaviours?



* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-17  1:38       ` Jeff Layton
@ 2011-11-17 11:05         ` John Hughes
  2011-11-17 13:13           ` John Hughes
  0 siblings, 1 reply; 16+ messages in thread
From: John Hughes @ 2011-11-17 11:05 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Jim Rees, Trond Myklebust, linux-nfs, linux-kernel

On 17/11/11 02:38, Jeff Layton wrote:
> Note too that the gssd code distinguishes between an expired TGT and a
> non-existent credcache. The latter will give you the error you desire
> here. So one possibility is just to remove the credcache from /tmp in
> this situation.

Something to scan /tmp for expired credentials and zap em?  rpc.gssd 
would communicate that to the kernel?

Whadaya know, that works.

With the 3.1-rc10 kernel I let my ticket expire, did a ls - it hangs.

Now, from another terminal I do a kdestroy on my ticket cache, and (a 
second or so later) the ls gets an EPERM.

So this behaviour can be changed from userland with no changes to the 
kernel, rpc.gssd or anything else.

Some fun racing possibilities.


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-17 11:05         ` John Hughes
@ 2011-11-17 13:13           ` John Hughes
  2011-11-17 21:46             ` Jeff Layton
  0 siblings, 1 reply; 16+ messages in thread
From: John Hughes @ 2011-11-17 13:13 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Jim Rees, Trond Myklebust, linux-nfs, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 931 bytes --]

On 17/11/11 12:05, John Hughes wrote:
> On 17/11/11 02:38, Jeff Layton wrote:
>> Note too that the gssd code distinguishes between an expired TGT and a
>> non-existent credcache. The latter will give you the error you desire
>> here. So one possibility is just to remove the credcache from /tmp in
>> this situation.
>
> Something to scan /tmp for expired credentials and zap em?  rpc.gssd 
> would communicate that to the kernel?
>
> Whadaya know, that works.
Here's a dumb perl script that could be run from, for example, .xsession 
to automatically destroy expired ticket caches.

Would need a bit of trickery to make it go away on end of session and 
something in /etc/pm/sleep.d to send it a SIGALRM when the system wakes 
from suspend or hibernate.

It has a potential race between destroying an expired ticket and a new 
ticket being granted.

I guess now I'll look at a hack to rpc.gssd for a neater way of doing this.





[-- Attachment #2: monitor-tickets --]
[-- Type: text/plain, Size: 1308 bytes --]

#! /usr/bin/perl -w

my $ALARMED = 0;

$SIG{ALRM} = sub { ++$ALARMED; };

use POSIX qw(mktime);

# Work out ticket expiry

# Valid starting     Expires            Service principal
# 11/17/11 10:34:23  11/17/11 20:34:23  krbtgt/CALVAEDI.COM@CALVAEDI.COM
# 	renew until 11/18/11 10:34:23
# 11/17/11 10:34:23  11/17/11 20:34:23  nfs/olympic.calvaedi.com@CALVAEDI.COM
# 	renew until 11/18/11 10:34:23
# 11/17/11 11:24:24  11/17/11 20:34:23  host/olympic.calvaedi.com@CALVAEDI.COM
# 	renew until 11/18/11 10:34:23

# Eurgh - non localised, US format dates.

sub expiry {
	local *KLIST;
	open KLIST, "/usr/bin/klist | " or return;
	my $expiry;
	while (<KLIST>) {
		if (m((\d+)/(\d+)/(\d+) (\d+):(\d+):(\d+)  krbtgt)) {
			$expiry = mktime ($6, $5, $4, $2, $1 - 1, 100 +  $3);
			last;
		}
	}

	$expiry;
}


for (;;) {
	my $sleepytime = 60;

	my $expiry = expiry ();
	
	if (defined $expiry) {
		my $left = $expiry - time;
		if ($left <= 0) {
			# Ticket expired, zap it.  Potential race with
			# new ticket creation.
			print "Destroy expired ticket\n";
			system "/usr/bin/kdestroy";
		}
		else {
			$sleepytime = $left;
		}
	}

	if ($ALARMED) {
		$ALARMED = 0;
		next;
	}

	# If machine freezes during this sleep how long will
	# it sleep for?
	print "Sleeping for $sleepytime seconds\n";
	sleep $sleepytime;
}


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-17 13:13           ` John Hughes
@ 2011-11-17 21:46             ` Jeff Layton
  2011-11-18  1:51               ` Jim Rees
       [not found]               ` <4EC62325.1060009@Calva.COM>
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff Layton @ 2011-11-17 21:46 UTC (permalink / raw)
  To: John Hughes; +Cc: Jim Rees, Trond Myklebust, linux-nfs, linux-kernel

On Thu, 17 Nov 2011 14:13:38 +0100
John Hughes <john@Calva.COM> wrote:

> On 17/11/11 12:05, John Hughes wrote:
> > On 17/11/11 02:38, Jeff Layton wrote:
> >> Note too that the gssd code distinguishes between an expired TGT and a
> >> non-existent credcache. The latter will give you the error you desire
> >> here. So one possibility is just to remove the credcache from /tmp in
> >> this situation.
> >
> > Something to scan /tmp for expired credentials and zap em?  rpc.gssd 
> > would communicate that to the kernel?
> >
> > Whadaya know, that works.
> Here's a dumb perl script that could be run from, for example, .xsession 
> to automatically destroy expired ticket caches.
> 
> Would need a bit of trickery to make it go away on end of session and 
> something in /etc/pm/sleep.d to send it a SIGALRM when the system wakes 
> from suspend or hibernate.
> 
> It has a potential race between destroying an expired ticket and a new 
> ticket being granted.
> 
> I guess now I'll look at a hack to rpc.gssd for a neater way of doing this.
> 

Ok, I can remember a bit more about the genesis of this scheme...

At the time the argument went something like this:

No one expects that when their krb5 ticket expires that their
applications will fail. A case in point is something like a krb5 ssh
session. If I had a valid ticket when I initiated the session, then
we would consider it a bug if it were to suddenly die when the ticket
expired.

Contrast that however with applications running on a kerberized NFS
mount. As soon as the ticket expires they start failing with
non-transient errors. This is probably the case as well with the screen
locker you're using, but it's apparently able to recover enough to
allow the TGT to be renewed. I expect though, that you may have other
less visible programs that are dying in this situation or are getting
unexpected errors.

The current behavior was really intended as a first approximation. I
fully expected that it would need some refinement, but AFAIK, no one
has complained loudly about the current behavior until now, so I
haven't seen a need to mess with it.

I'm not that familiar with kstart, but I assume that it gets a
renewable TGT and just renews it as needed? I have to wonder if that
sort of tool might be verboten in security conscious sites (the very
sort that want kerberized nfs).

If we decide that making this behavior switchable is the right thing to
do, then what you'll probably want to do is add a new command-line
option to rpc.gssd, and make it conditionally return -EKEYEXPIRED or
-EACCES in the downcall based on it. It should be a fairly simple
patch. See process_krb5_upcall() in rpc.gssd...
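
[A minimal sketch of the switch described above, not actual rpc.gssd
code. The enum, the function name and the idea of a policy value set by
a command-line option are all hypothetical; only the two error values
come from the discussion. EKEYEXPIRED is Linux-specific, so the fallback
define is an assumption for builds elsewhere.]

```c
#include <errno.h>

/* EKEYEXPIRED is Linux-specific; assume the Linux value if it is missing. */
#ifndef EKEYEXPIRED
#define EKEYEXPIRED 127
#endif

/* Hypothetical policy, as it might be set by a new command-line option. */
enum expired_policy {
	EXPIRED_POLICY_HANG,	/* current behaviour: the caller waits for a new ticket */
	EXPIRED_POLICY_FAIL	/* alternative: fail the call instead of waiting */
};

/*
 * Choose the error value to report in the downcall when the user's
 * TGT has expired.
 */
static int expired_downcall_error(enum expired_policy policy)
{
	if (policy == EXPIRED_POLICY_FAIL)
		return -EACCES;
	return -EKEYEXPIRED;
}
```

[With something like this, the default would keep today's hanging
behaviour and only sites that opt in would see the call fail.]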

Long term, we probably need to consider this use-case in the GSSAPI
proxy initiative that Simo has been scoping out. It would be nice to
have a solution that would work for both home directory configurations
and long-running jobs without needing these sorts of hacks.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-17 21:46             ` Jeff Layton
@ 2011-11-18  1:51               ` Jim Rees
  2011-11-18  2:03                 ` Jeff Layton
       [not found]               ` <4EC62325.1060009@Calva.COM>
  1 sibling, 1 reply; 16+ messages in thread
From: Jim Rees @ 2011-11-18  1:51 UTC (permalink / raw)
  To: Jeff Layton; +Cc: John Hughes, Trond Myklebust, linux-nfs, linux-kernel

I would argue that if you don't want your applications to stop working when
your ticket expires, you shouldn't let the ticket expire.  If you don't want
to have to renew your ticket, you should use an infinite ticket lifetime.

It sounds like you've made up your mind, but I would urge you to make this
a mount option, analogous to the hard/soft mount option.


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-18  1:51               ` Jim Rees
@ 2011-11-18  2:03                 ` Jeff Layton
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Layton @ 2011-11-18  2:03 UTC (permalink / raw)
  To: Jim Rees; +Cc: John Hughes, Trond Myklebust, linux-nfs, linux-kernel

On Thu, 17 Nov 2011 20:51:16 -0500
Jim Rees <rees@umich.edu> wrote:

> I would argue that if you don't want your applications to stop working when
> your ticket expires, you shouldn't let the ticket expire.  If you don't want
> to have to renew your ticket, you should use an infinite ticket lifetime.
> 

That's the ideal situation, but shit happens, and losing a long-running
job can often be an expensive proposition.

> It sounds like you've made up your mind, but I would urge you to make this
> a mount option, analogous to the hard/soft mount option.

I've not made up my mind about anything, and in any case it's not my
decision to make. I think you need to convince Trond here... :)

I'm quite open to sane proposals as long as we can accommodate those who
are dependent on the current behavior. As I said before, when I
originally did the patches a couple of years ago, I sort of figured the
current behavior was a first approximation.

A mount option will be harder to implement than a rpc.gssd command-line
option, but it sounds reasonable. Still, it would be better not to have
to make this an either/or decision somehow.

-- 
Jeff Layton <jlayton@redhat.com>


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
       [not found]               ` <4EC62325.1060009@Calva.COM>
@ 2011-11-18 12:50                 ` Jim Rees
  0 siblings, 0 replies; 16+ messages in thread
From: Jim Rees @ 2011-11-18 12:50 UTC (permalink / raw)
  To: John Hughes; +Cc: Jeff Layton, Trond Myklebust, linux-nfs, linux-kernel

I'm not arguing that the client machine should hang when tickets expire.
I'm arguing that user processes should not hang when tickets expire.  Sorry
if that wasn't clear.


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
@ 2011-11-18 17:16 Myklebust, Trond
  2011-11-18 17:54 ` Jim Rees
  0 siblings, 1 reply; 16+ messages in thread
From: Myklebust, Trond @ 2011-11-18 17:16 UTC (permalink / raw)
  To: Jim Rees, John Hughes; +Cc: Jeff Layton, linux-nfs, linux-kernel

So what are they supposed to do without tickets? Crash?

Jim Rees <rees@umich.edu> wrote:

I'm not arguing that the client machine should hang when tickets expire.
I'm arguing that user processes should not hang when tickets expire.  Sorry
if that wasn't clear.


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-18 17:16 Myklebust, Trond
@ 2011-11-18 17:54 ` Jim Rees
  2011-11-18 18:23   ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: Jim Rees @ 2011-11-18 17:54 UTC (permalink / raw)
  To: Myklebust, Trond; +Cc: John Hughes, Jeff Layton, linux-nfs, linux-kernel

Myklebust, Trond wrote:

  So what are they supposed to do without tickets? Crash?

No, why would they want to do that?

I feel like I've entered the Twilight Zone here, so please continue without
me while I dig up the relevant background info.


* Re: [PATCH] Don't hang user processes if Kerberos ticket for nfs4 mount expires
  2011-11-18 17:54 ` Jim Rees
@ 2011-11-18 18:23   ` Trond Myklebust
  0 siblings, 0 replies; 16+ messages in thread
From: Trond Myklebust @ 2011-11-18 18:23 UTC (permalink / raw)
  To: Jim Rees; +Cc: John Hughes, Jeff Layton, linux-nfs, linux-kernel

On Fri, 2011-11-18 at 12:54 -0500, Jim Rees wrote: 
> Myklebust, Trond wrote:
> 
>   So what are they supposed to do without tickets? Crash?
> 
> No, why would they want to do that?
> 
> I feel like I've entered the Twilight Zone here, so please continue without
> me while I dig up the relevant background info.

The point is that if the server won't allow them to issue read or write
system calls, then there are 2 options available to them: 

     1. hang until someone renews their ticket 
     2. get an error which means crash, since most (all?) applications
        aren't written according to the non-posix assumption that read
        and write can return EACCES/EKEYEXPIRED errors.
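
[To make option 2 concrete, here is a hedged sketch of the kind of
errno handling an application would need around every failed read or
write in order to survive an expired ticket. The enum and function
names are hypothetical, and the fallback value for EKEYEXPIRED assumes
the Linux definition.]

```c
#include <errno.h>

/* EKEYEXPIRED is Linux-specific; assume the Linux value if it is missing. */
#ifndef EKEYEXPIRED
#define EKEYEXPIRED 127
#endif

/* Hypothetical verdicts an application could reach after a failed I/O call. */
enum io_verdict {
	IO_RETRY_NOW,		/* interrupted: simply issue the call again */
	IO_WAIT_FOR_CREDS,	/* credential problem: retry once the ticket is renewed */
	IO_GIVE_UP		/* anything else: report the error and stop */
};

/* Map the errno value left by a failed read() or write() to a verdict. */
static enum io_verdict classify_io_error(int err)
{
	switch (err) {
	case EINTR:
		return IO_RETRY_NOW;
	case EACCES:
	case EKEYEXPIRED:
		return IO_WAIT_FOR_CREDS;
	default:
		return IO_GIVE_UP;
	}
}
```

[Programs without a branch like IO_WAIT_FOR_CREDS treat these errors as
fatal, which is the "crash" case in option 2.]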



Trond

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


