* [PATCH] gssd: Improve scalability by not waiting for child processes
@ 2015-09-23 21:20 Steve Dickson
2015-09-25 10:53 ` Jeff Layton
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Steve Dickson @ 2015-09-23 21:20 UTC (permalink / raw)
To: Linux NFS Mailing list; +Cc: Systemd Mailing List
Instead of waiting on every fork, which would
become a bottle neck during a mount storm, simply
set a SIGCHLD signal handler to do the wait on
the child process
Signed-off-by: Steve Dickson <steved@redhat.com>
---
utils/gssd/gssd.c | 18 ++++++++++++++++++
utils/gssd/gssd_proc.c | 11 ++---------
2 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c
index e480349..8b778cb 100644
--- a/utils/gssd/gssd.c
+++ b/utils/gssd/gssd.c
@@ -44,11 +44,13 @@
#define _GNU_SOURCE
#endif
+#include <sys/types.h>
#include <sys/param.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/inotify.h>
+#include <sys/wait.h>
#include <rpc/rpc.h>
#include <netinet/in.h>
#include <arpa/inet.h>
@@ -736,6 +738,21 @@ sig_die(int signal)
printerr(1, "exiting on signal %d\n", signal);
exit(0);
}
+static void
+sig_child(int signal)
+{
+ int err;
+ pid_t pid;
+
+ /* Parent: just wait on child to exit and return */
+ do {
+ pid = wait(&err);
+ } while(pid == -1 && errno != -ECHILD);
+
+ if (WIFSIGNALED(err))
+ printerr(0, "WARNING: forked child was killed"
+ "with signal %d\n", WTERMSIG(err));
+}
static void
usage(char *progname)
@@ -902,6 +919,7 @@ main(int argc, char *argv[])
signal(SIGINT, sig_die);
signal(SIGTERM, sig_die);
+ signal(SIGCHLD, sig_child);
signal_set(&sighup_ev, SIGHUP, gssd_scan_cb, NULL);
signal_add(&sighup_ev, NULL);
event_set(&inotify_ev, inotify_fd, EV_READ | EV_PERSIST, gssd_inotify_cb, NULL);
diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c
index 11168b2..8f5ca03 100644
--- a/utils/gssd/gssd_proc.c
+++ b/utils/gssd/gssd_proc.c
@@ -656,16 +656,9 @@ process_krb5_upcall(struct clnt_info *clp, uid_t uid, int fd, char *tgtname,
/* fork() failed! */
printerr(0, "WARNING: unable to fork() to handle"
"upcall: %s\n", strerror(errno));
- return;
+ /* FALLTHROUGH */
default:
- /* Parent: just wait on child to exit and return */
- do {
- pid = wait(&err);
- } while(pid == -1 && errno != -ECHILD);
-
- if (WIFSIGNALED(err))
- printerr(0, "WARNING: forked child was killed"
- "with signal %d\n", WTERMSIG(err));
+ /* Parent: Return and wait for the SIGCHLD */
return;
}
no_fork:
--
2.4.3
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH] gssd: Improve scalability by not waiting for child processes
2015-09-23 21:20 [PATCH] gssd: Improve scalability by not waiting for child processes Steve Dickson
@ 2015-09-25 10:53 ` Jeff Layton
2015-09-26 13:55 ` Steve Dickson
2015-09-25 11:13 ` Jeff Layton
2015-10-04 8:19 ` [systemd-devel] " Florian Weimer
2 siblings, 1 reply; 7+ messages in thread
From: Jeff Layton @ 2015-09-25 10:53 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list, Systemd Mailing List
On Wed, 23 Sep 2015 17:20:50 -0400
Steve Dickson <steved@redhat.com> wrote:
> Instead of waiting on every fork, which would
> become a bottle neck during a mount storm, simply
> set a SIGCHLD signal handler to do the wait on
> the child process
>
> Signed-off-by: Steve Dickson <steved@redhat.com>
> ---
> utils/gssd/gssd.c | 18 ++++++++++++++++++
> utils/gssd/gssd_proc.c | 11 ++---------
> 2 files changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c
> index e480349..8b778cb 100644
> --- a/utils/gssd/gssd.c
> +++ b/utils/gssd/gssd.c
> @@ -44,11 +44,13 @@
> #define _GNU_SOURCE
> #endif
>
> +#include <sys/types.h>
> #include <sys/param.h>
> #include <sys/socket.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <sys/inotify.h>
> +#include <sys/wait.h>
> #include <rpc/rpc.h>
> #include <netinet/in.h>
> #include <arpa/inet.h>
> @@ -736,6 +738,21 @@ sig_die(int signal)
> printerr(1, "exiting on signal %d\n", signal);
> exit(0);
> }
> +static void
> +sig_child(int signal)
> +{
> + int err;
> + pid_t pid;
> +
> + /* Parent: just wait on child to exit and return */
> + do {
> + pid = wait(&err);
> + } while(pid == -1 && errno != -ECHILD);
> +
> + if (WIFSIGNALED(err))
> + printerr(0, "WARNING: forked child was killed"
> + "with signal %d\n", WTERMSIG(err));
> +}
>
> static void
> usage(char *progname)
> @@ -902,6 +919,7 @@ main(int argc, char *argv[])
>
> signal(SIGINT, sig_die);
> signal(SIGTERM, sig_die);
> + signal(SIGCHLD, sig_child);
> signal_set(&sighup_ev, SIGHUP, gssd_scan_cb, NULL);
> signal_add(&sighup_ev, NULL);
> event_set(&inotify_ev, inotify_fd, EV_READ | EV_PERSIST, gssd_inotify_cb, NULL);
> diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c
> index 11168b2..8f5ca03 100644
> --- a/utils/gssd/gssd_proc.c
> +++ b/utils/gssd/gssd_proc.c
> @@ -656,16 +656,9 @@ process_krb5_upcall(struct clnt_info *clp, uid_t uid, int fd, char *tgtname,
> /* fork() failed! */
> printerr(0, "WARNING: unable to fork() to handle"
> "upcall: %s\n", strerror(errno));
> - return;
> + /* FALLTHROUGH */
> default:
> - /* Parent: just wait on child to exit and return */
> - do {
> - pid = wait(&err);
> - } while(pid == -1 && errno != -ECHILD);
> -
> - if (WIFSIGNALED(err))
> - printerr(0, "WARNING: forked child was killed"
> - "with signal %d\n", WTERMSIG(err));
> + /* Parent: Return and wait for the SIGCHLD */
> return;
> }
> no_fork:
I was thinking that there was some reason that we couldn't do this --
that there were data structures that would get wiped if you got another
upcall while the first was being processed. The forking should prevent
that though, so I think this looks reasonable.
Acked-by: Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] gssd: Improve scalability by not waiting for child processes
2015-09-25 10:53 ` Jeff Layton
@ 2015-09-26 13:55 ` Steve Dickson
2015-09-26 13:59 ` Jeff Layton
0 siblings, 1 reply; 7+ messages in thread
From: Steve Dickson @ 2015-09-26 13:55 UTC (permalink / raw)
To: Jeff Layton; +Cc: Linux NFS Mailing list
On 09/25/2015 06:53 AM, Jeff Layton wrote:
> On Wed, 23 Sep 2015 17:20:50 -0400
> Steve Dickson <steved@redhat.com> wrote:
>
>> Instead of waiting on every fork, which would
>> become a bottle neck during a mount storm, simply
>> set a SIGCHLD signal handler to do the wait on
>> the child process
>>
>> Signed-off-by: Steve Dickson <steved@redhat.com>
>> ---
>> utils/gssd/gssd.c | 18 ++++++++++++++++++
>> utils/gssd/gssd_proc.c | 11 ++---------
>> 2 files changed, 20 insertions(+), 9 deletions(-)
>>
>> diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c
>> index e480349..8b778cb 100644
>> --- a/utils/gssd/gssd.c
>> +++ b/utils/gssd/gssd.c
>> @@ -44,11 +44,13 @@
>> #define _GNU_SOURCE
>> #endif
>>
>> +#include <sys/types.h>
>> #include <sys/param.h>
>> #include <sys/socket.h>
>> #include <sys/time.h>
>> #include <sys/resource.h>
>> #include <sys/inotify.h>
>> +#include <sys/wait.h>
>> #include <rpc/rpc.h>
>> #include <netinet/in.h>
>> #include <arpa/inet.h>
>> @@ -736,6 +738,21 @@ sig_die(int signal)
>> printerr(1, "exiting on signal %d\n", signal);
>> exit(0);
>> }
>> +static void
>> +sig_child(int signal)
>> +{
>> + int err;
>> + pid_t pid;
>> +
>> + /* Parent: just wait on child to exit and return */
>> + do {
>> + pid = wait(&err);
>> + } while(pid == -1 && errno != -ECHILD);
>> +
>> + if (WIFSIGNALED(err))
>> + printerr(0, "WARNING: forked child was killed"
>> + "with signal %d\n", WTERMSIG(err));
>> +}
>>
>> static void
>> usage(char *progname)
>> @@ -902,6 +919,7 @@ main(int argc, char *argv[])
>>
>> signal(SIGINT, sig_die);
>> signal(SIGTERM, sig_die);
>> + signal(SIGCHLD, sig_child);
>> signal_set(&sighup_ev, SIGHUP, gssd_scan_cb, NULL);
>> signal_add(&sighup_ev, NULL);
>> event_set(&inotify_ev, inotify_fd, EV_READ | EV_PERSIST, gssd_inotify_cb, NULL);
>> diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c
>> index 11168b2..8f5ca03 100644
>> --- a/utils/gssd/gssd_proc.c
>> +++ b/utils/gssd/gssd_proc.c
>> @@ -656,16 +656,9 @@ process_krb5_upcall(struct clnt_info *clp, uid_t uid, int fd, char *tgtname,
>> /* fork() failed! */
>> printerr(0, "WARNING: unable to fork() to handle"
>> "upcall: %s\n", strerror(errno));
>> - return;
>> + /* FALLTHROUGH */
>> default:
>> - /* Parent: just wait on child to exit and return */
>> - do {
>> - pid = wait(&err);
>> - } while(pid == -1 && errno != -ECHILD);
>> -
>> - if (WIFSIGNALED(err))
>> - printerr(0, "WARNING: forked child was killed"
>> - "with signal %d\n", WTERMSIG(err));
>> + /* Parent: Return and wait for the SIGCHLD */
>> return;
>> }
>> no_fork:
>
> I was thinking that there was some reason that we couldn't do this --
> that there were data structures that would get wiped if you got another
> upcall while the first was being processed. The forking should prevent
> that though, so I think this looks reasonable.
>
> Acked-by: Jeff Layton <jlayton@poochiereds.net>
>
Self Nak... During my testing there was a large number zombie rpc.gssd
process... I'm not sure why but they are there...
steved.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH] gssd: Improve scalability by not waiting for child processes
2015-09-26 13:55 ` Steve Dickson
@ 2015-09-26 13:59 ` Jeff Layton
0 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2015-09-26 13:59 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list
On Sat, 26 Sep 2015 09:55:21 -0400
Steve Dickson <SteveD@redhat.com> wrote:
>
>
> On 09/25/2015 06:53 AM, Jeff Layton wrote:
> > On Wed, 23 Sep 2015 17:20:50 -0400
> > Steve Dickson <steved@redhat.com> wrote:
> >
> >> Instead of waiting on every fork, which would
> >> become a bottle neck during a mount storm, simply
> >> set a SIGCHLD signal handler to do the wait on
> >> the child process
> >>
> >> Signed-off-by: Steve Dickson <steved@redhat.com>
> >> ---
> >> utils/gssd/gssd.c | 18 ++++++++++++++++++
> >> utils/gssd/gssd_proc.c | 11 ++---------
> >> 2 files changed, 20 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c
> >> index e480349..8b778cb 100644
> >> --- a/utils/gssd/gssd.c
> >> +++ b/utils/gssd/gssd.c
> >> @@ -44,11 +44,13 @@
> >> #define _GNU_SOURCE
> >> #endif
> >>
> >> +#include <sys/types.h>
> >> #include <sys/param.h>
> >> #include <sys/socket.h>
> >> #include <sys/time.h>
> >> #include <sys/resource.h>
> >> #include <sys/inotify.h>
> >> +#include <sys/wait.h>
> >> #include <rpc/rpc.h>
> >> #include <netinet/in.h>
> >> #include <arpa/inet.h>
> >> @@ -736,6 +738,21 @@ sig_die(int signal)
> >> printerr(1, "exiting on signal %d\n", signal);
> >> exit(0);
> >> }
> >> +static void
> >> +sig_child(int signal)
> >> +{
> >> + int err;
> >> + pid_t pid;
> >> +
> >> + /* Parent: just wait on child to exit and return */
> >> + do {
> >> + pid = wait(&err);
> >> + } while(pid == -1 && errno != -ECHILD);
> >> +
> >> + if (WIFSIGNALED(err))
> >> + printerr(0, "WARNING: forked child was killed"
> >> + "with signal %d\n", WTERMSIG(err));
> >> +}
> >>
> >> static void
> >> usage(char *progname)
> >> @@ -902,6 +919,7 @@ main(int argc, char *argv[])
> >>
> >> signal(SIGINT, sig_die);
> >> signal(SIGTERM, sig_die);
> >> + signal(SIGCHLD, sig_child);
> >> signal_set(&sighup_ev, SIGHUP, gssd_scan_cb, NULL);
> >> signal_add(&sighup_ev, NULL);
> >> event_set(&inotify_ev, inotify_fd, EV_READ | EV_PERSIST, gssd_inotify_cb, NULL);
> >> diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c
> >> index 11168b2..8f5ca03 100644
> >> --- a/utils/gssd/gssd_proc.c
> >> +++ b/utils/gssd/gssd_proc.c
> >> @@ -656,16 +656,9 @@ process_krb5_upcall(struct clnt_info *clp, uid_t uid, int fd, char *tgtname,
> >> /* fork() failed! */
> >> printerr(0, "WARNING: unable to fork() to handle"
> >> "upcall: %s\n", strerror(errno));
> >> - return;
> >> + /* FALLTHROUGH */
> >> default:
> >> - /* Parent: just wait on child to exit and return */
> >> - do {
> >> - pid = wait(&err);
> >> - } while(pid == -1 && errno != -ECHILD);
> >> -
> >> - if (WIFSIGNALED(err))
> >> - printerr(0, "WARNING: forked child was killed"
> >> - "with signal %d\n", WTERMSIG(err));
> >> + /* Parent: Return and wait for the SIGCHLD */
> >> return;
> >> }
> >> no_fork:
> >
> > I was thinking that there was some reason that we couldn't do this --
> > that there were data structures that would get wiped if you got another
> > upcall while the first was being processed. The forking should prevent
> > that though, so I think this looks reasonable.
> >
> > Acked-by: Jeff Layton <jlayton@poochiereds.net>
> >
> Self Nak... During my testing there was a large number zombie rpc.gssd
> process... I'm not sure why but they are there...
>
> steved.
It's probably what I mentioned in the other mail.
If you get several children exiting in quick succession then you may
only have one SIGCHLD pending. What you really want to do there is to
have your SIGCHLD handler reap as many children as it can in a
non-blocking fashion and then return. Even better would be to utilize
the event loop for handling the signal...
--
Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] gssd: Improve scalability by not waiting for child processes
2015-09-23 21:20 [PATCH] gssd: Improve scalability by not waiting for child processes Steve Dickson
2015-09-25 10:53 ` Jeff Layton
@ 2015-09-25 11:13 ` Jeff Layton
2015-10-04 8:19 ` [systemd-devel] " Florian Weimer
2 siblings, 0 replies; 7+ messages in thread
From: Jeff Layton @ 2015-09-25 11:13 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list, Systemd Mailing List
On Wed, 23 Sep 2015 17:20:50 -0400
Steve Dickson <steved@redhat.com> wrote:
> Instead of waiting on every fork, which would
> become a bottle neck during a mount storm, simply
> set a SIGCHLD signal handler to do the wait on
> the child process
>
> Signed-off-by: Steve Dickson <steved@redhat.com>
> ---
> utils/gssd/gssd.c | 18 ++++++++++++++++++
> utils/gssd/gssd_proc.c | 11 ++---------
> 2 files changed, 20 insertions(+), 9 deletions(-)
>
> diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c
> index e480349..8b778cb 100644
> --- a/utils/gssd/gssd.c
> +++ b/utils/gssd/gssd.c
> @@ -44,11 +44,13 @@
> #define _GNU_SOURCE
> #endif
>
> +#include <sys/types.h>
> #include <sys/param.h>
> #include <sys/socket.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> #include <sys/inotify.h>
> +#include <sys/wait.h>
> #include <rpc/rpc.h>
> #include <netinet/in.h>
> #include <arpa/inet.h>
> @@ -736,6 +738,21 @@ sig_die(int signal)
> printerr(1, "exiting on signal %d\n", signal);
> exit(0);
> }
> +static void
> +sig_child(int signal)
> +{
> + int err;
> + pid_t pid;
> +
> + /* Parent: just wait on child to exit and return */
> + do {
> + pid = wait(&err);
> + } while(pid == -1 && errno != -ECHILD);
> +
> + if (WIFSIGNALED(err))
> + printerr(0, "WARNING: forked child was killed"
> + "with signal %d\n", WTERMSIG(err));
> +}
>
That said, there is a problem here. You're going to get a SIGCHLD for
each child that exits and multiples could exit at nearly the same time,
so some of the SIGCHLDs can get "lost". This probably needs to keep
reaping children (in a non-blocking way) until there aren't any more
that have exited.
Also, given that gssd is based around libevent, you may be better off
adding a libevent signal event handler instead of doing it with a "raw"
signal handler. That way you won't block upcalls that are in the middle
of running just to reap child processes.
> static void
> usage(char *progname)
> @@ -902,6 +919,7 @@ main(int argc, char *argv[])
>
> signal(SIGINT, sig_die);
> signal(SIGTERM, sig_die);
> + signal(SIGCHLD, sig_child);
> signal_set(&sighup_ev, SIGHUP, gssd_scan_cb, NULL);
> signal_add(&sighup_ev, NULL);
> event_set(&inotify_ev, inotify_fd, EV_READ | EV_PERSIST, gssd_inotify_cb, NULL);
> diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c
> index 11168b2..8f5ca03 100644
> --- a/utils/gssd/gssd_proc.c
> +++ b/utils/gssd/gssd_proc.c
> @@ -656,16 +656,9 @@ process_krb5_upcall(struct clnt_info *clp, uid_t uid, int fd, char *tgtname,
> /* fork() failed! */
> printerr(0, "WARNING: unable to fork() to handle"
> "upcall: %s\n", strerror(errno));
> - return;
> + /* FALLTHROUGH */
> default:
> - /* Parent: just wait on child to exit and return */
> - do {
> - pid = wait(&err);
> - } while(pid == -1 && errno != -ECHILD);
> -
> - if (WIFSIGNALED(err))
> - printerr(0, "WARNING: forked child was killed"
> - "with signal %d\n", WTERMSIG(err));
> + /* Parent: Return and wait for the SIGCHLD */
> return;
> }
> no_fork:
--
Jeff Layton <jlayton@poochiereds.net>
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [systemd-devel] [PATCH] gssd: Improve scalability by not waiting for child processes
2015-09-23 21:20 [PATCH] gssd: Improve scalability by not waiting for child processes Steve Dickson
2015-09-25 10:53 ` Jeff Layton
2015-09-25 11:13 ` Jeff Layton
@ 2015-10-04 8:19 ` Florian Weimer
2015-10-05 11:50 ` Steve Dickson
2 siblings, 1 reply; 7+ messages in thread
From: Florian Weimer @ 2015-10-04 8:19 UTC (permalink / raw)
To: Steve Dickson; +Cc: Linux NFS Mailing list, Systemd Mailing List
* Steve Dickson:
> +static void
> +sig_child(int signal)
> +{
> + int err;
> + pid_t pid;
> +
> + /* Parent: just wait on child to exit and return */
> + do {
> + pid = wait(&err);
> + } while(pid == -1 && errno != -ECHILD);
> +
> + if (WIFSIGNALED(err))
> + printerr(0, "WARNING: forked child was killed"
> + "with signal %d\n", WTERMSIG(err));
> +}
prinerr calls vfprintf or vsyslog. Neither is safe to use in signal
handlers, so you need to log this message in some other way.
Florian
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [systemd-devel] [PATCH] gssd: Improve scalability by not waiting for child processes
2015-10-04 8:19 ` [systemd-devel] " Florian Weimer
@ 2015-10-05 11:50 ` Steve Dickson
0 siblings, 0 replies; 7+ messages in thread
From: Steve Dickson @ 2015-10-05 11:50 UTC (permalink / raw)
To: Florian Weimer; +Cc: Linux NFS Mailing list, Systemd Mailing List
On 10/04/2015 04:19 AM, Florian Weimer wrote:
> * Steve Dickson:
>
>> +static void
>> +sig_child(int signal)
>> +{
>> + int err;
>> + pid_t pid;
>> +
>> + /* Parent: just wait on child to exit and return */
>> + do {
>> + pid = wait(&err);
>> + } while(pid == -1 && errno != -ECHILD);
>> +
>> + if (WIFSIGNALED(err))
>> + printerr(0, "WARNING: forked child was killed"
>> + "with signal %d\n", WTERMSIG(err));
>> +}
>
> prinerr calls vfprintf or vsyslog. Neither is safe to use in signal
> handlers, so you need to log this message in some other way.
Good point... but this patch was self NAK-ed due to it leaving
zombie processes during my testing.
steved.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2015-10-05 11:50 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-09-23 21:20 [PATCH] gssd: Improve scalability by not waiting for child processes Steve Dickson
2015-09-25 10:53 ` Jeff Layton
2015-09-26 13:55 ` Steve Dickson
2015-09-26 13:59 ` Jeff Layton
2015-09-25 11:13 ` Jeff Layton
2015-10-04 8:19 ` [systemd-devel] " Florian Weimer
2015-10-05 11:50 ` Steve Dickson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).