* [RFC PATCH] Using pthreads to handle upcalls. @ 2015-11-09 13:05 Steve Dickson 2015-11-09 13:05 ` [PATCH] gssd: use " Steve Dickson 2015-11-09 18:39 ` [RFC PATCH] Using " J. Bruce Fields 0 siblings, 2 replies; 5+ messages in thread From: Steve Dickson @ 2015-11-09 13:05 UTC (permalink / raw) To: Linux NFS Mailing list Recently a bug was found that was causing a TGT fetch for every mount upcall. The bug was caused when forking for every mount was introduce. The global memory for the TGT cache was being freed when the forked process existed. The fix we came up with was to only fork on non-root upcalls, basically mount upcalls would no longer fork. In debugging the patch it became apparent that if the process hung, all NFS mounts on that client would be blocked. So at this point rpc.gssd is a single point of failure. This patch replaces the forking/non-forking with creating pthreads for every upcall which I think is a better solution to the original problem since pthreads can share global data. I was also hoping using pthread would bring more asynchronous to rpc.gssd. I was thinking rpc.gssd could take an upcall, fire off a thread to handle it, the go back and listen for more upcalls. Unfortunately this is not the case. It seems, maybe due to my lack of my pthreads understanding, that after each pthread_create() call, a pthread_join() call, which waits for the created to stop, is needed. Similar to fork/wait.. This means if an upcall pthread gets hung the daemon is also hung... The same single point of failure... I do believe using threads is a better solution than the non-fork solution, but rpc.gssd is still a single point of failure. Plus I'm hoping moving to pthread will allow us to solve that problem. Steve Dickson (1): gssd: use pthreads to handle upcalls aclocal/libpthread.m4 | 13 +++++++++++++ configure.ac | 3 +++ utils/gssd/Makefile.am | 3 ++- utils/gssd/gssd.c | 18 ++++++++++++++++-- utils/gssd/gssd.h | 2 ++ utils/gssd/gssd_proc.c | 30 ------------------------------ utils/gssd/krb5_util.c | 3 +++ 7 files changed, 39 insertions(+), 33 deletions(-) create mode 100644 aclocal/libpthread.m4 -- 2.4.3 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH] gssd: use pthreads to handle upcalls 2015-11-09 13:05 [RFC PATCH] Using pthreads to handle upcalls Steve Dickson @ 2015-11-09 13:05 ` Steve Dickson 2015-11-09 18:39 ` [RFC PATCH] Using " J. Bruce Fields 1 sibling, 0 replies; 5+ messages in thread From: Steve Dickson @ 2015-11-09 13:05 UTC (permalink / raw) To: Linux NFS Mailing list Currently, to persevere global data over multiple mounts, the root process does not fork when handling an upcall. Instead of not-forking, create a pthread to handle the all upcalls since global data can be shared among threads. Signed-off-by: Steve Dickson <steved@redhat.com> --- aclocal/libpthread.m4 | 13 +++++++++++++ configure.ac | 3 +++ utils/gssd/Makefile.am | 3 ++- utils/gssd/gssd.c | 18 ++++++++++++++++-- utils/gssd/gssd.h | 2 ++ utils/gssd/gssd_proc.c | 30 ------------------------------ utils/gssd/krb5_util.c | 3 +++ 7 files changed, 39 insertions(+), 33 deletions(-) create mode 100644 aclocal/libpthread.m4 diff --git a/aclocal/libpthread.m4 b/aclocal/libpthread.m4 new file mode 100644 index 0000000..e87d2a0 --- /dev/null +++ b/aclocal/libpthread.m4 @@ -0,0 +1,13 @@ +dnl Checks for pthreads library and headers +dnl +AC_DEFUN([AC_LIBPTHREAD], [ + + dnl Check for library, but do not add -lpthreads to LIBS + AC_CHECK_LIB([pthread], [pthread_create], [LIBPTHREAD=-lpthread], + [AC_MSG_ERROR([libpthread not found.])]) + AC_SUBST(LIBPTHREAD) + + AC_CHECK_HEADERS([pthread.h], , + [AC_MSG_ERROR([libpthread headers not found.])]) + +])dnl diff --git a/configure.ac b/configure.ac index 25d2ba4..e0ecd61 100644 --- a/configure.ac +++ b/configure.ac @@ -385,6 +385,9 @@ if test "$enable_gss" = yes; then dnl Invoked after AC_KERBEROS_V5; AC_LIBRPCSECGSS needs to have KRBLIBS set AC_LIBRPCSECGSS + dnl Check for pthreads + AC_LIBPTHREAD + dnl librpcsecgss already has a dependency on libgssapi, dnl but we need to make sure we get the right version if test "$enable_gss" = yes; then diff --git a/utils/gssd/Makefile.am b/utils/gssd/Makefile.am index cb040b3..3f5f59a 100644 --- a/utils/gssd/Makefile.am +++ b/utils/gssd/Makefile.am @@ -49,7 +49,8 @@ gssd_LDADD = \ $(RPCSECGSS_LIBS) \ $(KRBLIBS) \ $(GSSAPI_LIBS) \ - $(LIBTIRPC) + $(LIBTIRPC) \ + $(LIBPTHREAD) gssd_LDFLAGS = \ $(KRBLDFLAGS) diff --git a/utils/gssd/gssd.c b/utils/gssd/gssd.c index e7cb07f..3c425ce 100644 --- a/utils/gssd/gssd.c +++ b/utils/gssd/gssd.c @@ -365,16 +365,30 @@ static void gssd_clnt_gssd_cb(int UNUSED(fd), short UNUSED(which), void *data) { struct clnt_info *clp = data; + pthread_t th; + int ret; - handle_gssd_upcall(clp); + ret = pthread_create(&th, NULL, (void *)handle_gssd_upcall, (void *)clp); + if (ret != 0) { + printerr(0, "pthread_create failed: ret %d: %s\n", ret, strerror(errno)); + return; + } + pthread_join(th, NULL); } static void gssd_clnt_krb5_cb(int UNUSED(fd), short UNUSED(which), void *data) { struct clnt_info *clp = data; + pthread_t th; + int ret; + + ret = pthread_create(&th, NULL, (void *)handle_krb5_upcall, (void *)clp); + if (ret != 0) { + printerr(0, "pthread_create failed: ret %d: %s\n", ret, strerror(errno)); + } - handle_krb5_upcall(clp); + pthread_join(th, NULL); } static struct clnt_info * diff --git a/utils/gssd/gssd.h b/utils/gssd/gssd.h index c6937c5..8677329 100644 --- a/utils/gssd/gssd.h +++ b/utils/gssd/gssd.h @@ -36,6 +36,7 @@ #include <gssapi/gssapi.h> #include <event.h> #include <stdbool.h> +#include <pthread.h> #ifndef GSSD_PIPEFS_DIR #define GSSD_PIPEFS_DIR "/var/lib/nfs/rpc_pipefs" @@ -61,6 +62,7 @@ extern int root_uses_machine_creds; extern unsigned int context_timeout; extern unsigned int rpc_timeout; extern char *preferred_realm; +extern pthread_mutex_t ple_lock; struct clnt_info { TAILQ_ENTRY(clnt_info) list; diff --git a/utils/gssd/gssd_proc.c b/utils/gssd/gssd_proc.c index 1ef68d8..616082e 100644 --- a/utils/gssd/gssd_proc.c +++ b/utils/gssd/gssd_proc.c @@ -629,36 +629,6 @@ process_krb5_upcall(struct clnt_info *clp, uid_t uid, int fd, char *tgtname, if (uid != 0 || (uid == 0 && root_uses_machine_creds == 0 && service == NULL)) { - /* already running as uid 0 */ - if (uid == 0) - goto no_fork; - - pid = fork(); - switch(pid) { - case 0: - /* Child: fall through to rest of function */ - childpid = getpid(); - unsetenv("KRB5CCNAME"); - printerr(2, "CHILD forked pid %d \n", childpid); - break; - case -1: - /* fork() failed! */ - printerr(0, "WARNING: unable to fork() to handle" - "upcall: %s\n", strerror(errno)); - return; - default: - /* Parent: just wait on child to exit and return */ - do { - pid = wait(&err); - } while(pid == -1 && errno != -ECHILD); - - if (WIFSIGNALED(err)) - printerr(0, "WARNING: forked child was killed" - "with signal %d\n", WTERMSIG(err)); - return; - } -no_fork: - auth = krb5_not_machine_creds(clp, uid, tgtname, &downcall_err, &err, &rpc_clnt); if (err) diff --git a/utils/gssd/krb5_util.c b/utils/gssd/krb5_util.c index 8ef8184..7354dc7 100644 --- a/utils/gssd/krb5_util.c +++ b/utils/gssd/krb5_util.c @@ -128,6 +128,7 @@ /* Global list of principals/cache file names for machine credentials */ struct gssd_k5_kt_princ *gssd_k5_kt_princ_list = NULL; +pthread_mutex_t ple_lock; #ifdef HAVE_SET_ALLOWABLE_ENCTYPES int limit_to_legacy_enctypes = 0; @@ -586,10 +587,12 @@ get_ple_by_princ(krb5_context context, krb5_principal princ) /* Need to serialize list if we ever become multi-threaded! */ + pthread_mutex_lock(&ple_lock); ple = find_ple_by_princ(context, princ); if (ple == NULL) { ple = new_ple(context, princ); } + pthread_mutex_unlock(&ple_lock); return ple; } -- 2.4.3 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] Using pthreads to handle upcalls. 2015-11-09 13:05 [RFC PATCH] Using pthreads to handle upcalls Steve Dickson 2015-11-09 13:05 ` [PATCH] gssd: use " Steve Dickson @ 2015-11-09 18:39 ` J. Bruce Fields 2015-11-09 20:11 ` Steve Dickson 1 sibling, 1 reply; 5+ messages in thread From: J. Bruce Fields @ 2015-11-09 18:39 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFS Mailing list On Mon, Nov 09, 2015 at 08:05:07AM -0500, Steve Dickson wrote: > Recently a bug was found that was causing a > TGT fetch for every mount upcall. The bug was > caused when forking for every mount was introduce. > The global memory for the TGT cache was being > freed when the forked process existed. > > The fix we came up with was to only fork on non-root > upcalls, basically mount upcalls would no longer fork. > In debugging the patch it became apparent that if the > process hung, all NFS mounts on that client would be blocked. > So at this point rpc.gssd is a single point of failure. > > This patch replaces the forking/non-forking with creating > pthreads for every upcall which I think is a better > solution to the original problem since pthreads can share > global data. I seem to recall the reason for the fork is to allow dropping some privileges while processing the upcall, is that right? But looking at pthreads(7), it looks like those are probably shared (e.g., it says user and group IDs are process-wide). > I was also hoping using pthread would bring more asynchronous > to rpc.gssd. I was thinking rpc.gssd could take an upcall, > fire off a thread to handle it, the go back and listen > for more upcalls. > > Unfortunately this is not the case. It seems, maybe due to > my lack of my pthreads understanding, that after each > pthread_create() call, a pthread_join() call, which waits for > the created to stop, is needed. Similar to fork/wait.. Actually making gssd thread-safe would be a significant effort. > This means if an upcall pthread gets hung the daemon > is also hung... The same single point of failure... > > I do believe using threads is a better solution than > the non-fork solution, but rpc.gssd is still a single > point of failure. Plus I'm hoping moving to pthread will > allow us to solve that problem. So this doesn't actually fix anything right now? --b. > > Steve Dickson (1): > gssd: use pthreads to handle upcalls > > aclocal/libpthread.m4 | 13 +++++++++++++ > configure.ac | 3 +++ > utils/gssd/Makefile.am | 3 ++- > utils/gssd/gssd.c | 18 ++++++++++++++++-- > utils/gssd/gssd.h | 2 ++ > utils/gssd/gssd_proc.c | 30 ------------------------------ > utils/gssd/krb5_util.c | 3 +++ > 7 files changed, 39 insertions(+), 33 deletions(-) > create mode 100644 aclocal/libpthread.m4 > > -- > 2.4.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] Using pthreads to handle upcalls. 2015-11-09 18:39 ` [RFC PATCH] Using " J. Bruce Fields @ 2015-11-09 20:11 ` Steve Dickson 2015-11-09 21:31 ` J. Bruce Fields 0 siblings, 1 reply; 5+ messages in thread From: Steve Dickson @ 2015-11-09 20:11 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Linux NFS Mailing list Hey, On 11/09/2015 01:39 PM, J. Bruce Fields wrote: > On Mon, Nov 09, 2015 at 08:05:07AM -0500, Steve Dickson wrote: >> Recently a bug was found that was causing a >> TGT fetch for every mount upcall. The bug was >> caused when forking for every mount was introduce. >> The global memory for the TGT cache was being >> freed when the forked process existed. >> >> The fix we came up with was to only fork on non-root >> upcalls, basically mount upcalls would no longer fork. >> In debugging the patch it became apparent that if the >> process hung, all NFS mounts on that client would be blocked. >> So at this point rpc.gssd is a single point of failure. >> >> This patch replaces the forking/non-forking with creating >> pthreads for every upcall which I think is a better >> solution to the original problem since pthreads can share >> global data. > > I seem to recall the reason for the fork is to allow dropping some > privileges while processing the upcall, is that right? I don't see this where privileges are being dropped. > But looking at pthreads(7), it looks like those are probably > shared (e.g., it says user and group IDs are process-wide). I think they are and that's also why thread can access the same global data. > >> I was also hoping using pthread would bring more asynchronous >> to rpc.gssd. I was thinking rpc.gssd could take an upcall, >> fire off a thread to handle it, the go back and listen >> for more upcalls. >> >> Unfortunately this is not the case. It seems, maybe due to >> my lack of my pthreads understanding, that after each >> pthread_create() call, a pthread_join() call, which waits for >> the created to stop, is needed. Similar to fork/wait.. > > Actually making gssd thread-safe would be a significant effort. Is it because the MIT libs are not thread-safe? Isn't the gssd_k5_kt_princ_list the only global list? Something to do with the upcalls? > >> This means if an upcall pthread gets hung the daemon >> is also hung... The same single point of failure... >> >> I do believe using threads is a better solution than >> the non-fork solution, but rpc.gssd is still a single >> point of failure. Plus I'm hoping moving to pthread will >> allow us to solve that problem. > > So this doesn't actually fix anything right now? No.. it does not... :-) But I do think its a clearer way handling global lists via multiple threads/process. It also makes the top of process_krb5_upcall easier to read, IMHO... Plus it does introduce pthread to nfs-utils.. So maybe some day we can pull some these daemons into the 21th century by multi-threading them.. or kill them! I'm good either way. 8-) Thanks for the cycles!!! steved. ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH] Using pthreads to handle upcalls. 2015-11-09 20:11 ` Steve Dickson @ 2015-11-09 21:31 ` J. Bruce Fields 0 siblings, 0 replies; 5+ messages in thread From: J. Bruce Fields @ 2015-11-09 21:31 UTC (permalink / raw) To: Steve Dickson; +Cc: Linux NFS Mailing list On Mon, Nov 09, 2015 at 03:11:30PM -0500, Steve Dickson wrote: > Hey, > > On 11/09/2015 01:39 PM, J. Bruce Fields wrote: > > On Mon, Nov 09, 2015 at 08:05:07AM -0500, Steve Dickson wrote: > >> Recently a bug was found that was causing a > >> TGT fetch for every mount upcall. The bug was > >> caused when forking for every mount was introduce. > >> The global memory for the TGT cache was being > >> freed when the forked process existed. > >> > >> The fix we came up with was to only fork on non-root > >> upcalls, basically mount upcalls would no longer fork. > >> In debugging the patch it became apparent that if the > >> process hung, all NFS mounts on that client would be blocked. > >> So at this point rpc.gssd is a single point of failure. > >> > >> This patch replaces the forking/non-forking with creating > >> pthreads for every upcall which I think is a better > >> solution to the original problem since pthreads can share > >> global data. > > > > I seem to recall the reason for the fork is to allow dropping some > > privileges while processing the upcall, is that right? > I don't see this where privileges are being dropped. See 6b53fc9ce38ba6fff2fd5c2f6ed143747067a39d "gssd: do a more thorough change of identity after forking". I think the id's its switching there are global to a process, not specific to a single pthread. > >> I was also hoping using pthread would bring more asynchronous > >> to rpc.gssd. I was thinking rpc.gssd could take an upcall, > >> fire off a thread to handle it, the go back and listen > >> for more upcalls. > >> > >> Unfortunately this is not the case. It seems, maybe due to > >> my lack of my pthreads understanding, that after each > >> pthread_create() call, a pthread_join() call, which waits for > >> the created to stop, is needed. Similar to fork/wait.. > > > > Actually making gssd thread-safe would be a significant effort. > Is it because the MIT libs are not thread-safe? I don't know, it's one of many things we'd need to check. > Isn't the gssd_k5_kt_princ_list the only global list? Something to do > with the upcalls? There's also some global data that keeps track of the state of the rpc pipefs directory--topdir_list, etc. It might be easier just to share that one list? I don't know what would be involved, though. I'm not convinced this is a good way to help with hangs--a lot of the cases that would cause a hang are going to cause further threads to hang too. I can believe the parallelism might be useful in some cases for performance reasons, but I don't know if we've seen such a case yet. --b. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2015-11-09 21:31 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-11-09 13:05 [RFC PATCH] Using pthreads to handle upcalls Steve Dickson 2015-11-09 13:05 ` [PATCH] gssd: use " Steve Dickson 2015-11-09 18:39 ` [RFC PATCH] Using " J. Bruce Fields 2015-11-09 20:11 ` Steve Dickson 2015-11-09 21:31 ` J. Bruce Fields
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.