* [PATCH] rpc.idmapd dies with 'I/O possible'
@ 2011-05-16 13:38 Luca Giuzzi
2011-05-16 13:56 ` Jeff Layton
2011-05-16 14:23 ` Steve Dickson
0 siblings, 2 replies; 5+ messages in thread
From: Luca Giuzzi @ 2011-05-16 13:38 UTC
To: linux-nfs
Dear All,
we have had problems on some of our machines (all Fedora 14), where
rpc.idmapd used to die with an `I/O possible' message at (basically)
random times. An strace suggested that the issue is in nfsopen(), where
the signal type is reset before notification is disabled; a notification
arriving between the two calls would be delivered as the default SIGIO,
which kills the daemon, and this might be the cause of the problem; see
https://bugzilla.redhat.com/show_bug.cgi?id=684308
I am now proposing the (trivial) patch to the list as well.
Best Regards,
luca giuzzi
--- nfs-utils-1.2.3/utils/idmapd/idmapd.c.orig 2011-05-13 16:27:08.000000000 +0200
+++ nfs-utils-1.2.3/utils/idmapd/idmapd.c 2011-05-13 16:27:16.000000000 +0200
@@ -778,8 +778,8 @@ nfsopen(struct idmap_client *ic)
} else {
event_set(&ic->ic_event, ic->ic_fd, EV_READ, nfscb, ic);
event_add(&ic->ic_event, NULL);
- fcntl(ic->ic_dirfd, F_SETSIG, 0);
fcntl(ic->ic_dirfd, F_NOTIFY, 0);
+ fcntl(ic->ic_dirfd, F_SETSIG, 0);
if (verbose > 0)
xlog_warn("Opened %s", ic->ic_path);
}
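The ordering that the patch establishes can be sketched in isolation. This is only an illustration under stated assumptions, not code from nfs-utils: the helper name `dir_watch_disable` is hypothetical, and it assumes a Linux system with dnotify available. It cancels the notification request first and only then restores the default signal, so that no directory event can arrive while the signal is already back to SIGIO.

```c
#define _GNU_SOURCE /* exposes F_NOTIFY and F_SETSIG in <fcntl.h> on Linux */
#include <fcntl.h>

/*
 * Hypothetical helper (not part of nfs-utils): stop dnotify on dirfd.
 * Returns 0 on success, -1 if either fcntl() call fails.
 */
int dir_watch_disable(int dirfd)
{
    /* 1. Cancel the notification request first, so no new event is sent. */
    if (fcntl(dirfd, F_NOTIFY, 0) == -1)
        return -1;

    /* 2. Only now go back to the default signal (0 selects SIGIO). */
    if (fcntl(dirfd, F_SETSIG, 0) == -1)
        return -1;

    return 0;
}
```

A caller would pass the descriptor of the watched directory, e.g. `dir_watch_disable(ic->ic_dirfd)` in the context of the hunk above. With the two calls in the opposite order there is a short window in which notifications are still enabled but would be delivered as SIGIO, whose default action terminates the process.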
* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
From: Jeff Layton @ 2011-05-16 13:56 UTC
To: Luca Giuzzi; +Cc: linux-nfs
On Mon, 16 May 2011 15:38:27 +0200 Luca Giuzzi <luca.giuzzi@gmail.com> wrote:
> - fcntl(ic->ic_dirfd, F_SETSIG, 0);
> fcntl(ic->ic_dirfd, F_NOTIFY, 0);
> + fcntl(ic->ic_dirfd, F_SETSIG, 0);
Looks correct to me:
Reviewed-by: Jeff Layton <jlayton@redhat.com>
* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
From: Steve Dickson @ 2011-05-16 14:23 UTC
To: Luca Giuzzi; +Cc: linux-nfs
On 05/16/2011 09:38 AM, Luca Giuzzi wrote:
> - fcntl(ic->ic_dirfd, F_SETSIG, 0);
> fcntl(ic->ic_dirfd, F_NOTIFY, 0);
> + fcntl(ic->ic_dirfd, F_SETSIG, 0);
I just updated the bz with these questions, but we can have the
conversation here...
I'm just a bit concerned by the fact that there are
a couple of places that sets F_SETSIG and then sets F_NOTIFY,
so does that mean those places are potential race cases as
well?
Also, after you applied this patch, did the problem go away?
Finally, how often did this problem occur?
tia,
steved.
* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
From: J. Bruce Fields @ 2011-05-16 14:39 UTC
To: Steve Dickson; +Cc: Luca Giuzzi, linux-nfs
On Mon, May 16, 2011 at 10:23:12AM -0400, Steve Dickson wrote:
> I'm just a bit concerned by the fact that there are
> a couple of places that sets F_SETSIG and then sets F_NOTIFY,
> so does that mean those places are potential race cases as
> well?
I wondered about that too; looking:
utils/gssd/gssd_main_loop.c: fcntl(tdi->fd, F_NOTIFY, DN_CREATE|DN_DELETE|DN_MODIFY|DN_MULTISHOT);
utils/gssd/gssd_proc.c: fcntl(clp->dir_fd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT);
utils/idmapd/idmapd.c: if (fcntl(fd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MODIFY | DN_MULTISHOT) == -1) {
utils/idmapd/idmapd.c: fcntl(ic->ic_dirfd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT);
utils/idmapd/idmapd.c: fcntl(ic->ic_dirfd, F_NOTIFY, 0);
So the last is the only one where we're turning notifications *off*.
So I think the patch is OK.
--b.
* Re: [PATCH] rpc.idmapd dies with 'I/O possible'
From: Luca Giuzzi @ 2011-05-16 14:47 UTC
To: Steve Dickson; +Cc: linux-nfs
On Mon, May 16, 2011 at 4:23 PM, Steve Dickson <SteveD@redhat.com> wrote:
> I just updated the bz with these questions, but we can have the
> conversation here...
>
> I'm just a bit concerned by the fact that there are
> a couple of places that sets F_SETSIG and then sets F_NOTIFY,
> so does that mean those places are potential race cases as
> well?
Where does this happen? I have seen these fcntl's in four places.
The first three are:
1. in idmapd.c, at nfsopen()
2. in gssd_proc.c, at process_clnt_dir()
3. in gssd_main_loop.c, at topdirs_add_entry()
In all of these cases, though, the pattern is to set SETSIG to
something useful and then enable NOTIFY; thus they should not
cause a race.
[There is also a fourth one in idmapd, where the signals are actually
initialized (lines 357-360); however, in the case of failure of the
first SETSIG the daemon does not start.]
> Also, after you applied this patch, did the problem go away?
I honestly don't know: the problem seems to have disappeared, in the
sense that the patched daemon has not died. However, before the patch
it occurred in quite a random pattern (usually once per day per
machine, but it really depended), so I cannot be sure (it might also
have been related to the overall load; unfortunately I do not have
proper statistics).
best,
lg
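The "set SETSIG to something useful, then enable NOTIFY" pattern described in this reply could look roughly like the sketch below. It is an assumption-laden illustration rather than the nfs-utils code: `dir_watch_enable`, `on_dir_event` and `dir_changed` are hypothetical names, and a plain sigaction() handler stands in for whatever signal handling the daemons actually install. The point is only the order: the handled signal is selected before any notification can be generated.

```c
#define _GNU_SOURCE /* exposes F_NOTIFY, F_SETSIG and DN_* on Linux */
#include <fcntl.h>
#include <signal.h>
#include <string.h>

/* Hypothetical flag set by the handler when the directory changes. */
static volatile sig_atomic_t dir_changed;

static void on_dir_event(int signo)
{
    (void)signo;
    dir_changed = 1;
}

/*
 * Hypothetical helper (not part of nfs-utils): start dnotify on dirfd,
 * delivering events as signo. Returns 0 on success, -1 on failure.
 */
int dir_watch_enable(int dirfd, int signo)
{
    struct sigaction sa;

    /* 1. Make sure the chosen signal is handled before it can arrive. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_dir_event;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, NULL) == -1)
        return -1;

    /* 2. Select that signal for this descriptor... */
    if (fcntl(dirfd, F_SETSIG, signo) == -1)
        return -1;

    /* 3. ...and only then ask for notifications. */
    if (fcntl(dirfd, F_NOTIFY, DN_CREATE | DN_DELETE | DN_MULTISHOT) == -1)
        return -1;

    return 0;
}
```

Because the signal is chosen before F_NOTIFY is armed, an event can never be delivered as an unhandled SIGIO on this path, which matches the observation that only the "turn notifications off" call site needed reordering.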