linux-nfs.vger.kernel.org archive mirror
* [PATCH] rpc.idmapd dies with 'I/O possible'
@ 2011-05-16 13:38 Luca Giuzzi
  2011-05-16 13:56 ` Jeff Layton
  2011-05-16 14:23 ` Steve Dickson
  0 siblings, 2 replies; 5+ messages in thread
From: Luca Giuzzi @ 2011-05-16 13:38 UTC (permalink / raw)
  To: linux-nfs

Dear All,
 we have had problems on some of our machines (all Fedora 14), where
rpc.idmapd used to die with an `I/O possible' message at essentially
random times. An strace run suggests the issue is in nfsopen(), where
the signal type is reset before notification is disabled; a notification
arriving between the two calls might be the cause of the problem; see
https://bugzilla.redhat.com/show_bug.cgi?id=684308
I am now proposing the (trivial) patch to the list as well.

Best Regards,
  luca giuzzi

--- nfs-utils-1.2.3/utils/idmapd/idmapd.c.orig	2011-05-13 16:27:08.000000000 +0200
+++ nfs-utils-1.2.3/utils/idmapd/idmapd.c	2011-05-13 16:27:16.000000000 +0200
@@ -778,8 +778,8 @@ nfsopen(struct idmap_client *ic)
 	} else {
 		event_set(&ic->ic_event, ic->ic_fd, EV_READ, nfscb, ic);
 		event_add(&ic->ic_event, NULL);
-		fcntl(ic->ic_dirfd, F_SETSIG, 0);
 		fcntl(ic->ic_dirfd, F_NOTIFY, 0);
+		fcntl(ic->ic_dirfd, F_SETSIG, 0);
 		if (verbose > 0)
 			xlog_warn("Opened %s", ic->ic_path);
 	}


* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
  2011-05-16 13:38 [PATCH] rpc.idmapd dies with 'I/O possible' Luca Giuzzi
@ 2011-05-16 13:56 ` Jeff Layton
  2011-05-16 14:23 ` Steve Dickson
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2011-05-16 13:56 UTC (permalink / raw)
  To: Luca Giuzzi; +Cc: linux-nfs

Looks correct to me:

Reviewed-by: Jeff Layton <jlayton@redhat.com>


* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
  2011-05-16 13:38 [PATCH] rpc.idmapd dies with 'I/O possible' Luca Giuzzi
  2011-05-16 13:56 ` Jeff Layton
@ 2011-05-16 14:23 ` Steve Dickson
  2011-05-16 14:39   ` J. Bruce Fields
  2011-05-16 14:47   ` Luca Giuzzi
  1 sibling, 2 replies; 5+ messages in thread
From: Steve Dickson @ 2011-05-16 14:23 UTC (permalink / raw)
  To: Luca Giuzzi; +Cc: linux-nfs



I just updated the bz with these questions, but we can have the 
conversation here...

I'm just a bit concerned by the fact that there are
a couple of places that set F_SETSIG and then set F_NOTIFY,
so does that mean those places are potential races as
well?

Also, after you applied this patch, did the problem go away?
Finally, how often did this problem occur?

tia,

steved.



* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
  2011-05-16 14:23 ` Steve Dickson
@ 2011-05-16 14:39   ` J. Bruce Fields
  2011-05-16 14:47   ` Luca Giuzzi
  1 sibling, 0 replies; 5+ messages in thread
From: J. Bruce Fields @ 2011-05-16 14:39 UTC (permalink / raw)
  To: Steve Dickson; +Cc: Luca Giuzzi, linux-nfs

On Mon, May 16, 2011 at 10:23:12AM -0400, Steve Dickson wrote:
> 
> 
> I just updated the bz with these questions, but we can have the 
> conversation here...
> 
> I'm just a bit concerned by the fact that there are
> a couple of places that set F_SETSIG and then set F_NOTIFY,
> so does that mean those places are potential races as
> well?

I wondered about that too; looking:

utils/gssd/gssd_main_loop.c:		fcntl(tdi->fd, F_NOTIFY,
		      	DN_CREATE|DN_DELETE|DN_MODIFY|DN_MULTISHOT);

utils/gssd/gssd_proc.c:	fcntl(clp->dir_fd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT);
utils/idmapd/idmapd.c:		if (fcntl(fd, F_NOTIFY,
			DN_CREATE | DN_DELETE | DN_MODIFY | DN_MULTISHOT) == -1) {
utils/idmapd/idmapd.c:			fcntl(ic->ic_dirfd, F_NOTIFY,
			DN_CREATE | DN_DELETE | DN_MULTISHOT);
utils/idmapd/idmapd.c:		fcntl(ic->ic_dirfd, F_NOTIFY, 0);

So the last is the only one where we're turning notifications *off*.

So I think the patch is OK.

--b.

> 
> Also, after you applied this patch, did the problem go away?
> Finally, how often did this problem occur?
> 
> tia,
> 
> steved.
> 


* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
  2011-05-16 14:23 ` Steve Dickson
  2011-05-16 14:39   ` J. Bruce Fields
@ 2011-05-16 14:47   ` Luca Giuzzi
  1 sibling, 0 replies; 5+ messages in thread
From: Luca Giuzzi @ 2011-05-16 14:47 UTC (permalink / raw)
  To: Steve Dickson; +Cc: linux-nfs

On Mon, May 16, 2011 at 4:23 PM, Steve Dickson <SteveD@redhat.com> wrote:

> I just updated the bz with these questions, but we can have the
> conversation here...
>
> I'm just a bit concerned by the fact that there are
> a couple of places that set F_SETSIG and then set F_NOTIFY,
> so does that mean those places are potential races as
> well?
>
Where does this happen? I have seen these fcntl calls in four places.
The first three are:
1. in idmapd.c, at nfsopen()
2. in gssd_proc.c at process_clnt_dir()
3. in gssd_main_loop.c at topdirs_add_entry()
In all of these cases, though, the pattern is to set SETSIG to something
useful and then enable NOTIFY; thus they should not cause a race.
[There is also a fourth one in idmapd, where the signals are actually
initialized (lines 357-360); however, if that first SETSIG fails, the
daemon does not start.]

> Also, after you applied this patch, did the problem go away?

I honestly don't know: the problem seems to have disappeared, in the
sense that the patched daemon has not died. Before the patch, however,
it occurred in a fairly random pattern (usually once per day per
machine, but it really depended), so I cannot be sure. It might also
have been related to the overall load; unfortunately I do not have the
proper statistics.


best,
 lg

