* main thread pthread_exit/sys_exit bug!
@ 2009-02-01 22:32 Kaz Kylheku
[not found] ` <20090201174159.4a52e15c.akpm@linux-foundation.org>
0 siblings, 1 reply; 20+ messages in thread
From: Kaz Kylheku @ 2009-02-01 22:32 UTC (permalink / raw)
To: linux-kernel
Basically, if you call pthread_exit from the main thread of a process, and keep
other threads running, the behavior is ugly.
I logged this initially as a bug against glibc, but then resolved it
with a kernel patch against linux 2.6.26:
Please see:
http://sources.redhat.com/bugzilla/show_bug.cgi?id=9804
I've known about this for some time, first having reproduced it on 2.6.17;
finally got around to fixing it.
When the main thread of a POSIX threads process calls pthread_exit, the process
should stick around until all the other threads do the same, or until one of
them calls _exit or exit, or until the process terminates abnormally. During
this time, it would be nice if the process behaved normally: if it did not
appear defunct in the process list and if POSIX job control was possible on it.
An easy way to achieve this is to insert a wait into the top of sys_exit, so
that do_exit is not called unless all the other threads have terminated. This
is another special case like do_group_exit. In the group exit, we zap the other
threads. In this case, we must not zap the other threads, but neither should we
fall through do_exit and become defunct!
The patch involves a controversial move: returning -ERESTARTSYS from sys_exit.
This is because the main thread may be stuck in sys_exit and have to respond to
a signal, and then go back to sys_exit. It appears to be working fine.
^ permalink raw reply [flat|nested] 20+ messages in thread[parent not found: <20090201174159.4a52e15c.akpm@linux-foundation.org>]
* Re: main thread pthread_exit/sys_exit bug! [not found] ` <20090201174159.4a52e15c.akpm@linux-foundation.org> @ 2009-02-02 6:45 ` Oleg Nesterov 2009-02-02 7:10 ` Kaz Kylheku 0 siblings, 1 reply; 20+ messages in thread From: Oleg Nesterov @ 2009-02-02 6:45 UTC (permalink / raw) To: Kaz Kylheku, Andrew Morton; +Cc: linux-kernel Kaz Kylheku wrote: > > Basically, if you call pthread_exit from the main thread of a process, and keep > other threads running, the behavior is ugly. Yes, known problem. Please look at [RFC,PATCH 3/3] do_wait: fix waiting for stopped group with dead leader http://marc.info/?t=119713920000003 I'll try to re-do and re-send this patch this week. Oleg. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-02 6:45 ` Oleg Nesterov @ 2009-02-02 7:10 ` Kaz Kylheku 2009-02-02 16:56 ` Oleg Nesterov 0 siblings, 1 reply; 20+ messages in thread From: Kaz Kylheku @ 2009-02-02 7:10 UTC (permalink / raw) To: linux-kernel; +Cc: Andrew Morton, Oleg Nesterov On Sun, Feb 1, 2009 at 10:45 PM, Oleg Nesterov <oleg@redhat.com> wrote: > Kaz Kylheku wrote: >> >> Basically, if you call pthread_exit from the main thread of a process, and keep >> other threads running, the behavior is ugly. > > Yes, known problem. > > Please look at > > [RFC,PATCH 3/3] do_wait: fix waiting for stopped group with dead leader > http://marc.info/?t=119713920000003 > > I'll try to re-do and re-send this patch this week. I believe that my straight-forward fix is pretty much good to go. I checked into my distro, so we will see how it holds up. It's a bad idea to allow the main thread to terminate. It should stick around because it serves as a facade for the process as a whole. If the main thread is allowed to bail all the way through do_exit, who knows what kind of problems may show up because of that. What if one of my developers is working on a server which has called pthread_exit in the main thread, and wants to attach gdb to it? Will that work if the main thread (a.k.a group leader) is a defunct process? I just tried this test case and it worked perfectly with my patch! gdb attached to the process by the pid of teh group leader. It correctly showed as that thread being stopped in __exit_thread. I can see the other threads, etc. bash:~# /projects/sw/kaz/bug-repro-programs/pthread-exit & [1] 2093 bash:~# gdb -p 2093 GNU gdb 6.8 Copyright (C) 2008 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "mips64-linux". Attaching to process 2093 Reading symbols from /projects/sw/kaz/bug-repro-programs/pthread-exit...done. Reading symbols from /lib32/libpthread.so.0...done. [Thread debugging using libthread_db enabled] [New Thread 0x2d46c4b0 (LWP 2098)] [New Thread 0x2cc6c4b0 (LWP 2097)] [New Thread 0x2c46c4b0 (LWP 2096)] [New Thread 0x2bc6c4b0 (LWP 2095)] [New Thread 0x2b46c4b0 (LWP 2094)] Loaded symbols for /lib32/libpthread.so.0 Reading symbols from /lib32/libc.so.6...done. Loaded symbols for /lib32/libc.so.6 Reading symbols from /lib32/ld.so.1...done. Loaded symbols for /lib32/ld.so.1 Reading symbols from /lib32/libgcc_s.so.1...done. Loaded symbols for /lib32/libgcc_s.so.1 0x2abd3da4 in __exit_thread () from /lib32/libc.so.6 (gdb) where #0 0x2abd3da4 in __exit_thread () from /lib32/libc.so.6 #1 0x2ab18ab0 in __libc_start_main (main=0x10000710 <main>, argc=1, ubp_av=0x7fcd7524, init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=<value optimized out>) at libc-start.c:245 #2 0x100005dc in _ftext () If I try this on an unpatched kernel that allows a main thread to bail through do_exit, this is what happens: Attaching to process 14651 ptrace: No such process. /root/14651: No such file or directory. I don't think that this is solved by any patch that allows the group leader to bail through do_exit. It's not just a problem of waiting on a dead group leader. If you want to maintain the illusion that the OS provides a process that contains threads, and the group leader is the representation of that process, then you have to keep that leader alive; the lifetime of that leader cannot be shorter than that of the process illusion. Patch: http://sourceware.org/bugzilla/attachment.cgi?id=3702 ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-02 7:10 ` Kaz Kylheku @ 2009-02-02 16:56 ` Oleg Nesterov 2009-02-02 20:10 ` Kaz Kylheku ` (2 more replies) 0 siblings, 3 replies; 20+ messages in thread From: Oleg Nesterov @ 2009-02-02 16:56 UTC (permalink / raw) To: Kaz Kylheku; +Cc: linux-kernel, Andrew Morton, Roland McGrath, Ulrich Drepper On 02/01, Kaz Kylheku wrote: > > On Sun, Feb 1, 2009 at 10:45 PM, Oleg Nesterov <oleg@redhat.com> wrote: > > Kaz Kylheku wrote: > >> > >> Basically, if you call pthread_exit from the main thread of a process, and keep > >> other threads running, the behavior is ugly. > > > > Yes, known problem. > > > > Please look at > > > > [RFC,PATCH 3/3] do_wait: fix waiting for stopped group with dead leader > > http://marc.info/?t=119713920000003 > > > > I'll try to re-do and re-send this patch this week. > > I believe that my straight-forward fix is pretty much good to go. I > checked into my distro, so we will see how it holds up. (the patch: http://sourceware.org/bugzilla/attachment.cgi?id=3702) Well, perhaps something like your patch makes sense. +/* + * A single thread is exiting, and it is the leader of the group. + * This coresponds to the main thread calling pthread_exit. + * In this case, the persists until all the other + * threads call pthread_exit, or someone calls exit or _exit. + * To implement this case, we park the thread leader in + * a loop which waits until the thread list becomes empty, + * or it receives a fatal signal. ^^^^^^^^^^^^^^ The comment is not exactly right, we return on every signal, not only fatal. +int +do_leader_exit(void) +{ + int ret = 0; + + if (unlikely(!thread_group_empty(current))) { + DECLARE_WAITQUEUE(wait, current); + + add_wait_queue(¤t->signal->wait_chldexit, &wait); + + set_current_state(TASK_INTERRUPTIBLE); + + do { + if (thread_group_empty(current)) + break; + + try_to_freeze(); + schedule(); + if (signal_pending(current)) + ret = -ERESTARTSYS; + } while (ret == 0); + + remove_wait_queue(¤t->signal->wait_chldexit, &wait); + + __set_current_state(TASK_RUNNING); + } + + return ret; +} the above is just the open-coded wait_event_interruptible(¤t->signal->wait_chldexit, thread_group_empty(current)); asmlinkage long sys_exit(int error_code) { + if (thread_group_leader(current)) { + int ret = do_leader_exit(); + if (ret != 0) + return ret; + } do_exit((error_code&0xff)<<8); } afaics, -ERESTARTSYS is not exactly correct. We can dequeue the signal without SA_RESTART. But we never should abort sys_exit(). ERESTARTNOINTR is better. But see below. I am worried this patch can confuse the user-space. Because, when the main thread does sys_exit(), the user-space has all rights to assume it exits ;) But with this patch the main thread will continue to handle the signals until the while group exits, I'm afraid libpthread.so won't be happy. And what if the signal handler does siglongjmp() and aborts sys_exit() ? > It's a bad idea to allow the main thread to terminate. It should stick > around because it serves as a facade for the process as a whole. If > the main thread is allowed to bail all the way through do_exit, who > knows what kind of problems may show up because of that. > > What if one of my developers is working on a server which has called > pthread_exit in the main thread, and wants to attach gdb to it? Will > that work if the main thread (a.k.a group leader) is a defunct > process? And I think gdb is wrong. It can see this process has other live threads, and attach to them. I didn't check gdb, but iirc "strace -f" works correctly in this case. I cced other people. Let's see what they think. Oleg. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-02 16:56 ` Oleg Nesterov @ 2009-02-02 20:10 ` Kaz Kylheku 2009-02-02 20:17 ` Ulrich Drepper 2009-02-05 3:05 ` Roland McGrath 2 siblings, 0 replies; 20+ messages in thread From: Kaz Kylheku @ 2009-02-02 20:10 UTC (permalink / raw) To: linux-kernel; +Cc: Andrew Morton, Roland McGrath, Ulrich Drepper, Oleg Nesterov On Mon, Feb 2, 2009 at 8:56 AM, Oleg Nesterov <oleg@redhat.com> wrote: > Well, perhaps something like your patch makes sense. > > +/* > + * A single thread is exiting, and it is the leader of the group. > + * This coresponds to the main thread calling pthread_exit. > + * In this case, the persists until all the other > + * threads call pthread_exit, or someone calls exit or _exit. > + * To implement this case, we park the thread leader in > + * a loop which waits until the thread list becomes empty, > + * or it receives a fatal signal. > ^^^^^^^^^^^^^^ > The comment is not exactly right, we return on every signal, not only fatal. Yes; that's stale text, referring to some earlier code that had been changed. > the above is just the open-coded > wait_event_interruptible(¤t->signal->wait_chldexit, > thread_group_empty(current)); Terrific! I will make the change locally. Btw, it's ``hand coded''. ``Open coding'' is what both programmers and macros do. Hand conding is the dumb thing I did instead of using the macro to open code it. :) > I am worried this patch can confuse the user-space. Because, when > the main thread does sys_exit(), the user-space has all rights > to assume it exits ;) Only glibc knows about sys_exit. (Or are there run-times for other language implementations that have their own binding to sys_exit, bypassing pthreads?) The POSIX interface used by applications is pthread_exit, and there is no assumption there about it being an exit-like system call. It has a number of standard-defined user-space chores to do, in fact. > But with this patch the main thread will > continue to handle the signals until the while group exits, I'm > afraid libpthread.so won't be happy. > And what if the signal handler does siglongjmp() and aborts sys_exit() ? pthread_exit peforms cleanup unwinding (required by Unix and POSIX) and destruction of thread-specific storage. Glibc does this and then performs its own longjmp to a handler in the startup code above main. At that point it's no longer correct to longjmp to any of the frames that have been aborted by that action. It's still executing user code; sys_exit has not been called yet, and the signal handler can go off. Other than that, there is actually nothing wrong with aborting sys_exit. It hasn't done any cleanup yet through do_exit, so it can be nicely reentered later. I'd say that any programs that are broken by this patch are probably ``fork in the toaster'' category anyway. Note that programs can also abort the exit function with signals, too, and there is nothing that can be done in the kernel about it. >> It's a bad idea to allow the main thread to terminate. It should stick >> around because it serves as a facade for the process as a whole. If >> the main thread is allowed to bail all the way through do_exit, who >> knows what kind of problems may show up because of that. >> >> What if one of my developers is working on a server which has called >> pthread_exit in the main thread, and wants to attach gdb to it? Will >> that work if the main thread (a.k.a group leader) is a defunct >> process? > > And I think gdb is wrong. It can see this process has other live threads, > and attach to them. But if this problem is patched in this very simple way in the kernel, then gdb installations ``Just Work'' without even being recompiled. Probably, the only way the gdb maintainers would accept another Linux-specific hack would be if they didn't know that a kernel patch will fix it. :) ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-02 16:56 ` Oleg Nesterov 2009-02-02 20:10 ` Kaz Kylheku @ 2009-02-02 20:17 ` Ulrich Drepper 2009-02-02 20:39 ` Kaz Kylheku 2009-02-05 3:05 ` Roland McGrath 2 siblings, 1 reply; 20+ messages in thread From: Ulrich Drepper @ 2009-02-02 20:17 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Kaz Kylheku, linux-kernel, Andrew Morton, Roland McGrath -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Oleg Nesterov wrote: > I am worried this patch can confuse the user-space. Because, when > the main thread does sys_exit(), the user-space has all rights > to assume it exits ;) But with this patch the main thread will > continue to handle the signals until the while group exits, I'm > afraid libpthread.so won't be happy. I haven't looked at the patch nor tried it. If the patch changes the behavior that the main thread, after calling sys_exit, still react to signals sent to this thread or to the process as a whole, then the patch is wrong. The userlevel context of the thread is not usable anymore. It will have run all kinds of destructors. The current behavior is AFAIK that the main thread won't react to any signal anymore. That is absolutely required. - -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEARECAAYFAkmHVO8ACgkQ2ijCOnn/RHTJvwCgodxkT+mg0tmrnlhf/IP8hUQc RYIAn0YC7pTjPHHZa7kmvYSyu/Zw5IIT =ehdX -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-02 20:17 ` Ulrich Drepper @ 2009-02-02 20:39 ` Kaz Kylheku 2009-02-03 2:39 ` Kaz Kylheku 0 siblings, 1 reply; 20+ messages in thread From: Kaz Kylheku @ 2009-02-02 20:39 UTC (permalink / raw) To: Ulrich Drepper; +Cc: Oleg Nesterov, linux-kernel, Andrew Morton, Roland McGrath On Mon, Feb 2, 2009 at 12:17 PM, Ulrich Drepper <drepper@redhat.com> wrote: > The userlevel context of the > thread is not usable anymore. It will have run all kinds of > destructors. The current behavior is AFAIK that the main thread won't > react to any signal anymore. That is absolutely required. Hey Ulrich, Thanks for articulating that requirement. I think it can be met by extending the patch a little bit. We can keep the main thread parked inside sys_exit /and/ get it not to react to signals internally, yet have the external behavior that the process reacts in the normal way to certain signals like SIGTSTP, SIGCONT, etc. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-02 20:39 ` Kaz Kylheku @ 2009-02-03 2:39 ` Kaz Kylheku 2009-02-03 13:33 ` Oleg Nesterov 0 siblings, 1 reply; 20+ messages in thread From: Kaz Kylheku @ 2009-02-03 2:39 UTC (permalink / raw) To: linux-kernel; +Cc: Oleg Nesterov, Andrew Morton, Roland McGrath On Mon, Feb 2, 2009 at 12:39 PM, Kaz Kylheku <kkylheku@gmail.com> wrote: > On Mon, Feb 2, 2009 at 12:17 PM, Ulrich Drepper <drepper@redhat.com> wrote: >> The userlevel context of the >> thread is not usable anymore. It will have run all kinds of >> destructors. The current behavior is AFAIK that the main thread won't >> react to any signal anymore. That is absolutely required. > > Hey Ulrich, > > Thanks for articulating that requirement. I think it can be met by > extending the patch a little bit. I've now done that. The exiting thread leader, if there are still other threads alive, gets its own private signal handler array in which every action is set to SIG_IGN, using the ignore_signals function. I experimented with blocking signals, but that approach breaks the test case of being able to attach GDB to the exiting thread. As part of the patch, I found it convenient to extend the incomplete sys_unshare functionality w.r.t. signal handlers, rather than reinvent the wheel. Cheers ... http://sourceware.org/bugzilla/attachment.cgi?id=3702 http://sourceware.org/bugzilla/attachment.cgi?id=3705 ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-03 2:39 ` Kaz Kylheku @ 2009-02-03 13:33 ` Oleg Nesterov 2009-02-03 19:51 ` Kaz Kylheku 0 siblings, 1 reply; 20+ messages in thread From: Oleg Nesterov @ 2009-02-03 13:33 UTC (permalink / raw) To: Kaz Kylheku; +Cc: linux-kernel, Andrew Morton, Roland McGrath On 02/02, Kaz Kylheku wrote: > > On Mon, Feb 2, 2009 at 12:39 PM, Kaz Kylheku <kkylheku@gmail.com> wrote: > > On Mon, Feb 2, 2009 at 12:17 PM, Ulrich Drepper <drepper@redhat.com> wrote: > >> The userlevel context of the > >> thread is not usable anymore. It will have run all kinds of > >> destructors. The current behavior is AFAIK that the main thread won't > >> react to any signal anymore. That is absolutely required. > > > > Hey Ulrich, > > > > Thanks for articulating that requirement. I think it can be met by > > extending the patch a little bit. > > I've now done that. > > The exiting thread leader, if there are still other > threads alive, gets its own private signal handler array in which > every action is set to SIG_IGN, using the ignore_signals > function. > > I experimented with blocking signals, but that approach > breaks the test case of being able to attach GDB to the > exiting thread. > > As part of the patch, I found it convenient to extend the > incomplete sys_unshare functionality w.r.t. signal handlers, > rather than reinvent the wheel. This is wrong, we can not and must not unshare ->sighand. > Cheers ... > > http://sourceware.org/bugzilla/attachment.cgi?id=3702 > http://sourceware.org/bugzilla/attachment.cgi?id=3705 This adds multiple problems. Just for example, fs/proc/ takes leader->sighand->siglock to protect the list of sub-threads. Of course this doesn't work any longer after unsharing. And there are numerous similar problems. ignore_signals() in do_leader_exit() is not right too. This thread group should hangle the group-wide signals even if the main thread exits. atomic_read(&sigh->count) in unshare_sighand() is racy, and in fact bogus. (yes, the whole unshare_sighand() is bogus, it never populates new_sighp). The changing of ->sighand in do_unshare() is very wrong, we can free the sighand_struct which is currently locked/used/etc. Kaz, I don't really understand why you are trying to add these complications to the kernel :( If the thread exits - it should exit. Yes, we have problems with the exited main thread, we should fix them. Yes, gdb refuses to attach to the dead thread (I didn't check this myself, but I think you are right). But there is nothing wrong here, because we can't ptrace this thread. But, gdb _can_ ptrace the process, and it can see it have other threads. OK, if nothing else. Let's suppose your patch is correct. What about robust futexes? How can we delay exit_robust_list() ? I don't think we can. Oleg. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-03 13:33 ` Oleg Nesterov @ 2009-02-03 19:51 ` Kaz Kylheku 2009-02-03 21:32 ` Oleg Nesterov 0 siblings, 1 reply; 20+ messages in thread From: Kaz Kylheku @ 2009-02-03 19:51 UTC (permalink / raw) To: Oleg Nesterov; +Cc: linux-kernel, Andrew Morton, Roland McGrath On Tue, Feb 3, 2009 at 5:33 AM, Oleg Nesterov <oleg@redhat.com> wrote: > On 02/02, Kaz Kylheku wrote: >> >> On Mon, Feb 2, 2009 at 12:39 PM, Kaz Kylheku <kkylheku@gmail.com> wrote: >> > On Mon, Feb 2, 2009 at 12:17 PM, Ulrich Drepper <drepper@redhat.com> wrote: >> >> The userlevel context of the >> >> thread is not usable anymore. It will have run all kinds of >> >> destructors. The current behavior is AFAIK that the main thread won't >> >> react to any signal anymore. That is absolutely required. >> > >> > Hey Ulrich, >> > >> > Thanks for articulating that requirement. I think it can be met by >> > extending the patch a little bit. >> >> I've now done that. >> >> The exiting thread leader, if there are still other >> threads alive, gets its own private signal handler array in which >> every action is set to SIG_IGN, using the ignore_signals >> function. >> >> I experimented with blocking signals, but that approach >> breaks the test case of being able to attach GDB to the >> exiting thread. >> >> As part of the patch, I found it convenient to extend the >> incomplete sys_unshare functionality w.r.t. signal handlers, >> rather than reinvent the wheel. > > This is wrong, we can not and must not unshare ->sighand. You are right; it breaks important invariant conditions which connect the thread group together, like the one about the lock, et cetera. The patch goes too far: rather than simply delaying the finalization (relatively safe), it's messing with the shared state (risky). Well, it doesn't bother me that that has to be thrown out. In fact, I do not agree with the requirement that the thread which calls pthread_exit must not respond to signals; the original patch works for me. I.e. in my embedded GNU/Linux distro, that requirement doesn't exist. And since I can't find it in the Single Unix Specification, so much for that! Nothing in the spec says that once pthread_exit is called, signals are stopped. This function invokes cleanup handling, and thread-specific-storage destruction. During any of those tasks, signals can still be happening. Any of those tasks can easily enter into an indefinite wait. What if a cleanup handler performs a blocking RPC to a remote server? Well, there you are, stuck in pthread_exit, handling signals, and not cleaning up your robust list, etc. I also don't require robust locks to be cleaned up instantly if they are owned by a main thread that has called pthread_exit. My organization is a heavy user of robust mutexes; they protect the integrity of a large, ``real time'' database stored in shared memory. I don't think that this would affect us in any way. The principal concern is that a process dies, for whatever reason, while holding locks. It's more to recover from catastrophic failures, not from mutex locking mistakes. If a thread locks a mutex and doesn't release it due to bad program logic, that is a problem whether or not that thread dies. It's not particularly useful that we can resolve that situation with EOWNERDEAD in the kernel when that thread happens to die, because that's just one case where we are lucky, so to speak. > Yes, gdb refuses to attach to the dead thread (I didn't check > this myself, but I think you are right). But there is nothing > wrong here, because we can't ptrace this thread. But, gdb > _can_ ptrace the process, and it can see it have other threads. Face it, allowing the thread leader to exit is as wrong as doing other stupid things to the leader, like unsharing the signal handler. Either way, you are breaking some little stick which is holding up the illusion that there is a process which has threads. > OK, if nothing else. Let's suppose your patch is correct. Obviously, the second part isn't. I'm very happy to have any excuse to throw this out, thanks! The first part isn't Ulrich-compliant. That is signficant, and duly noted; but it's not the law where I'm sitting. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-03 19:51 ` Kaz Kylheku @ 2009-02-03 21:32 ` Oleg Nesterov 2009-02-03 23:06 ` Kaz Kylheku 0 siblings, 1 reply; 20+ messages in thread From: Oleg Nesterov @ 2009-02-03 21:32 UTC (permalink / raw) To: Kaz Kylheku; +Cc: linux-kernel, Andrew Morton, Roland McGrath On 02/03, Kaz Kylheku wrote: > > Well, it doesn't bother me that that has to be thrown out. > In fact, I do not agree with the requirement that the thread > which calls pthread_exit must not respond to signals; > the original patch works for me. What about other users? We can't know what how much they depend on the current behaviour. > I.e. in my embedded GNU/Linux distro, that requirement > doesn't exist. And since I can't find it in the Single > Unix Specification, so much for that! > > Nothing in the spec says that once pthread_exit is called, > signals are stopped. This function invokes cleanup handling, > and thread-specific-storage destruction. During any of those > tasks, signals can still be happening. Any of those > tasks can easily enter into an indefinite wait. What if > a cleanup handler performs a blocking RPC to a remote > server? Well, there you are, stuck in pthread_exit, > handling signals, and not cleaning up your robust list, etc. > > I also don't require robust locks to be cleaned up > instantly if they are owned by a main thread that has > called pthread_exit. OK, OK. Please forget about signals, futexes, etc. Simple program: pthread_t main_thread; void *tfunc(void *a) { pthread_joni(main_thread, NULL); return NULL; } int main(void) { pthread_t thr; main_thread = pthread_self(); pthread_create(&thr, NULL, tfunc, NULL); pthread_exit(NULL); } I bet this will hang with your patch applied. Because we depend on sys_futex(->clear_child_tid, FUTEX_WAKE, ...). Kaz, you know, it is not easy to say "you patch is wrong in any case, no matter how much it will be improved" ;) But even if the current behaviour is not optimal, we must not change it unless we think it leads to bugs. We can't know which application can suffer. The current behaviour is old. > Face it, allowing the thread leader to exit is as wrong as doing > other stupid things to the leader, like unsharing the signal > handler. Perhaps. That is why I said _something_ like your patch perhaps makes sense. But this is tricky, and I don't see a simple/clean way to improve things. And, otoh, I do not see _real_ problems with the zombie leaders. As for original problem, it should be fixed anyway. wait_task_stopped() should take SIGNAL_STOP_STOPPED into account, not task->state. Unless we are ptracer, of course. Oleg. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-03 21:32 ` Oleg Nesterov @ 2009-02-03 23:06 ` Kaz Kylheku 0 siblings, 0 replies; 20+ messages in thread From: Kaz Kylheku @ 2009-02-03 23:06 UTC (permalink / raw) To: Oleg Nesterov; +Cc: linux-kernel, Andrew Morton, Roland McGrath On Tue, Feb 3, 2009 at 1:32 PM, Oleg Nesterov <oleg@redhat.com> wrote: > On 02/03, Kaz Kylheku wrote: >> >> Well, it doesn't bother me that that has to be thrown out. >> In fact, I do not agree with the requirement that the thread >> which calls pthread_exit must not respond to signals; >> the original patch works for me. > > What about other users? We can't know what how much they > depend on the current behaviour. If they haven't run into this gaping job control issue, they haven't done a whole lot of testing, obviously! Those who have run into it would certainly have to implement a workaround --- like not calling pthread_exit from the main thread! I.e. ``Q: Doctor, it hurts when I do this; A: So don't do that!''. > OK, OK. Please forget about signals, futexes, etc. > Simple program: > > pthread_t main_thread; > > void *tfunc(void *a) > { > pthread_joni(main_thread, NULL); > return NULL; > } > > int main(void) > { > pthread_t thr; > > main_thread = pthread_self(); > pthread_create(&thr, NULL, tfunc, NULL); > pthread_exit(NULL); > } This test case appears to be conforming, so it has to work. The initial thread is considered joinable. For instance a Rationale note in Issue 6 of the SUS claims that one reason for the existence of pthread_detach is so that the initial thread could be detached, which cannot be done through thread creation attributes for that thread. So the intent is clearly that the initial thread is joinable! > I bet this will hang with your patch applied. It almost certainly will, and it does have to do with futexes. The main thread hasn't gone through the step where it clears the TID, so the lll_wait_tid futex wait in pthread_join will block. There is no short-circuit indication in the library to indicate that the main thread has died. This TID trick is analogous to the robust list clean up. It's the same kind of thing: fixing up a value of some registered user-space memory location, signaling. > Kaz, you know, it is not easy to say "you patch is wrong > in any case, no matter how much it will be improved" ;) Sure it is! I will save you from that, because I do not believe in piling hacks on top of hacks to fix something that may be the wrong approach, even in situations where there is a good chance that after some finite number of hacks, it will finally be right. I did that in LinuxThreads once upon a time (mind you, that was so great, FreeBSD had to have it!) I do think there is a clean, non-hacky way to reason about this. If we think about the process-containing-threads model that the kernel is trying to emulate, and what should happen when threads exit, we come to the following reasoning: When a thread exits, there does have to be certain cleanup with respect to that thread. But the process-wide cleanup is not performed until all the threads are gone (thread count hits zero). This is easy to implement under the process-contains-threads model. These actions are not cleanly separated in Linux. There is a do_exit function which handles both the thread-things and the process-things in one swoop, so to speak. The zombie problem occurs because do_exit goes too far, cleaning up things that it shouldn't; things that are necessary in holding up the POSIX-conforming illusion that there is a process that contains threads. My kneejerk approch was: hey wait, let's hold back from doing /anything/ in do_exit; in fact let's not call it at all if we're the initial thread and there are still others. But obviously, it's not just anything in do_exit that causes problems. Maybe some in-between approach can work. The pthread_join test case can be fixed in a clean way, as can robust cleanup. The thread can signal pthread_join by resetting its TID to zero and hitting the futex. It can do the robust list cleanup, etc. If you can identify a good separation about what to do first, and what to do later, maybe you can some decent compromise among the concerns. Breaking up the do_exit logic into do_exit_thread do_exit_process would probably not hurt. You then have to pick whether each action belongs to one or the other and stick it into the appropriate function, with some clear guidelines about what goes where. In sys_exit, the two pieces could be used somehow like this: do_exit_thread(); if (leader and not_empty(group)) { special_logic(); } do_exit_process(); As one rule (for instance), any cleanup that threatens the integrity of the process/thread model goes into do_exit_process. So for instance if you ptrace that process, it still has all of its memory areas intact (you don't have to look for a different PID in the process list in order to find them). ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-02 16:56 ` Oleg Nesterov 2009-02-02 20:10 ` Kaz Kylheku 2009-02-02 20:17 ` Ulrich Drepper @ 2009-02-05 3:05 ` Roland McGrath 2009-02-05 4:55 ` Kaz Kylheku 2 siblings, 1 reply; 20+ messages in thread From: Roland McGrath @ 2009-02-05 3:05 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Kaz Kylheku, linux-kernel, Andrew Morton, Ulrich Drepper I haven't seen the clear explanation of what specific actual problems there are here. But I'm quite sure this is not the right approach to address them. Kaz has said things that seemed to imply that the behavior is erratic or the semantics are somehow ill-defined when the group leader has died with other threads living on. In fact, this case is perfectly well-specified and there is no mystery about it. The group leader dies and becomes a zombie. The zombie group leader is kept from reaping and parent notification by the delayed_group_leader() logic and related code, until the last thread in the group dies. The tgid (leader's tid), aka PID in POSIX terms, remains as the PID for the process as a whole and signals to it work fine, etc. Quite some time ago, there was some /proc bug wherein /proc/pid/task could not be listed when the group leader had died. That prevented strace or gdb from attaching to the process after its initial thread used pthread_exit. I don't recall when that was fixed, but it's been fine for a good while. That is the only problem for debuggability of this case that I recall knowing about. Certainly long ago there were many problems with job control signals in multi-thread groups, and there have been many little corner cases fixed in that over the 2.6.x period. I'm not aware of any such problems remaining. But if there are some, they need to be fixed in the signals code. It's certainly clear how it's supposed to work, and that's no different when the group leader is dead. Thanks, Roland ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-05 3:05 ` Roland McGrath @ 2009-02-05 4:55 ` Kaz Kylheku 2009-02-05 16:15 ` Oleg Nesterov 0 siblings, 1 reply; 20+ messages in thread From: Kaz Kylheku @ 2009-02-05 4:55 UTC (permalink / raw) To: Roland McGrath; +Cc: Oleg Nesterov, linux-kernel, Andrew Morton, Ulrich Drepper On Wed, Feb 4, 2009 at 7:05 PM, Roland McGrath <roland@redhat.com> wrote: > I haven't seen the clear explanation of what specific actual problems there > are here. But I'm quite sure this is not the right approach to address them. > > Kaz has said things that seemed to imply that the behavior is erratic or > the semantics are somehow ill-defined when the group leader has died with > other threads living on. In fact, this case is perfectly well-specified > and there is no mystery about it. I haven't observed anything that could be called erratic. The behavior that occurs, occurs reliably. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-05 4:55 ` Kaz Kylheku @ 2009-02-05 16:15 ` Oleg Nesterov 2009-02-05 21:22 ` Roland McGrath 0 siblings, 1 reply; 20+ messages in thread From: Oleg Nesterov @ 2009-02-05 16:15 UTC (permalink / raw) To: Kaz Kylheku; +Cc: Roland McGrath, linux-kernel, Andrew Morton, Ulrich Drepper On 02/04, Kaz Kylheku wrote: > > On Wed, Feb 4, 2009 at 7:05 PM, Roland McGrath <roland@redhat.com> wrote: > > I haven't seen the clear explanation of what specific actual problems there > > are here. But I'm quite sure this is not the right approach to address them. > > > > Kaz has said things that seemed to imply that the behavior is erratic or > > the semantics are somehow ill-defined when the group leader has died with > > other threads living on. In fact, this case is perfectly well-specified > > and there is no mystery about it. > > I haven't observed anything that could be called erratic. The behavior that > occurs, occurs reliably. Yes we have the bug, and wait_task_stopped() should be fixed. But it is buggy anyway, even if we delay the death of the main thread. But I also think we shouldn't. (and I am sorry, I still can't find the time to redo my old patch, will try to do this asap). Oleg. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-05 16:15 ` Oleg Nesterov @ 2009-02-05 21:22 ` Roland McGrath 2009-02-05 23:22 ` Oleg Nesterov 0 siblings, 1 reply; 20+ messages in thread From: Roland McGrath @ 2009-02-05 21:22 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Kaz Kylheku, linux-kernel, Andrew Morton, Ulrich Drepper > Yes we have the bug, and wait_task_stopped() should be fixed. But it is > buggy anyway, even if we delay the death of the main thread. But I also > think we shouldn't. Sorry, I'd missed the actual bug report among all the tangential verbiage. I wrote this test case for it. Is there any problem other than this one? Thanks, Roland ========== #define _GNU_SOURCE #include <stdio.h> #include <stdlib.h> #include <signal.h> #include <pthread.h> #include <assert.h> static void * thfunc (void *arg) { sleep (2); puts ("stopping"); raise (SIGSTOP); puts ("resumed"); exit (0); } int main (void) { pthread_t th; int rc = pthread_create (&th, NULL, &thfunc, NULL); assert_perror (rc); pthread_exit (0); /*NOTREACHED*/ return 1; } ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-05 21:22 ` Roland McGrath @ 2009-02-05 23:22 ` Oleg Nesterov 2009-02-09 3:33 ` Roland McGrath 0 siblings, 1 reply; 20+ messages in thread From: Oleg Nesterov @ 2009-02-05 23:22 UTC (permalink / raw) To: Roland McGrath; +Cc: Kaz Kylheku, linux-kernel, Andrew Morton, Ulrich Drepper On 02/05, Roland McGrath wrote: > > I wrote this test case for it. Is there any problem other than this one? I don't know about other problems with the zombie leaders. Except, I am worried whether the fix I have in mind is correct ;) It is simple, wait_task_stopped() should do if we tracer: check ->state, eat ->exit_code else: check SIGNAL_STOP_STOPPED, use ->group_exit_code This looks logical, and should fix the problem. But this is the user-visible change. For example, $ sleep 100 ^Z [1]+ Stopped sleep 100 $ strace -p `pidof sleep` Process 11442 attached - interrupt to quit strace hangs in do_wait(). But after the fix strace will happily proceed. I can't know whether this behaviour change is bad or not. Oleg. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-05 23:22 ` Oleg Nesterov @ 2009-02-09 3:33 ` Roland McGrath 2009-02-09 4:52 ` Oleg Nesterov 0 siblings, 1 reply; 20+ messages in thread From: Roland McGrath @ 2009-02-09 3:33 UTC (permalink / raw) To: Oleg Nesterov; +Cc: Kaz Kylheku, linux-kernel, Andrew Morton, Ulrich Drepper > I don't know about other problems with the zombie leaders. Ok, then we can just concentrate on the test case I posted. > Except, I am worried whether the fix I have in mind is correct ;) > It is simple, wait_task_stopped() should do I think the first problem is we'll never even get into wait_task_stopped(). We'll be in wait_consider_task() on the group leader, which is EXIT_ZOMBIE. First we need to adjust this: if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p)) return wait_task_zombie(p, options, infop, stat_addr, ru); to maybe: if (p->exit_state == EXIT_ZOMBIE) { if (delay_group_leader(p)) return wait_task_zombie_leader(p, options, infop, stat_addr, ru); return wait_task_zombie(p, options, infop, stat_addr, ru); } In wait_task_zombie_leader(), it will have to take the siglock and try to figure out if there is a completed group stop to report. > if we tracer: > > check ->state, eat ->exit_code Being the ptracer does not affect the delay_group_leader logic. It just affects individual vs group stop reports. So the existing code path is right for ptrace. > else: > check SIGNAL_STOP_STOPPED, use ->group_exit_code We don't want wait to change group_exit_code. But we need the "reported as stopped" tracking that wait_task_stopped() gets by clearing exit_code. So I think what we need is to get the zombie group_leader->exit_code to be set to ->group_exit_code as it would have been if the leader were alive and had participated in the group stop. > This looks logical, and should fix the problem. But this is > the user-visible change. For example, > > $ sleep 100 > ^Z > [1]+ Stopped sleep 100 > $ strace -p `pidof sleep` > Process 11442 attached - interrupt to quit > > strace hangs in do_wait(). But after the fix strace will happily > proceed. I can't know whether this behaviour change is bad or not. I think this would only happen if the "reported as stopped" bookkeeping I mentioned above were broken. The "Stopped" line means that the shell just did do_wait(WUNTRACED), so wait_task_stopped() cleared ->exit_code when reporting it as stopped. Now strace does PTRACE_ATTACH and then a wait; it can't see a fresh wait result here because ->exit_code is still zero. 100% untested concept patch follows. Thanks, Roland ========== diff --git a/kernel/exit.c b/kernel/exit.c index f80dec3..0000000 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -1437,7 +1437,8 @@ static int wait_task_stopped(int ptrace, exit_code = 0; spin_lock_irq(&p->sighand->siglock); - if (unlikely(!task_is_stopped_or_traced(p))) + if (unlikely(!task_is_stopped_or_traced(p)) && + (ptrace || p->exit_state != EXIT_ZOMBIE || !delay_group_leader(p))) goto unlock_sig; if (!ptrace && p->signal->group_stop_count > 0) @@ -1598,9 +1599,20 @@ static int wait_consider_task(struct tas /* * We don't reap group leaders with subthreads. + * When the group leader is dead, it still serves as + * a moniker for the whole group for stop and continue. + * But for ptrace, stop and continue are reported per-thread. */ - if (p->exit_state == EXIT_ZOMBIE && !delay_group_leader(p)) - return wait_task_zombie(p, options, infop, stat_addr, ru); + if (p->exit_state == EXIT_ZOMBIE) { + if (!delay_group_leader(p)) + return wait_task_zombie(p, options, + infop, stat_addr, ru); + *notask_error = 0; + if (unlikely(ptrace)) + return 0; + return wait_task_stopped(p, options, infop, stat_addr, ru) ?: + wait_task_continued(p, options, infop, stat_addr, ru); + } /* * It's stopped or running now, so it might diff --git a/kernel/signal.c b/kernel/signal.c index b6b3676..0000000 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1653,6 +1653,27 @@ finish_stop(int stop_count) } /* + * Complete group stop bookkeeping after decrementing sig->group_stop_count, + * to new value stop_count. When it reaches zero, mark the process stopped. + * + * If the group leader is already dead, then it did not participate + * normally in the group stop. But its ->exit_code stands for the whole + * group in do_wait() bookkeeping, so we need it to reflect the stop. + */ +static inline void complete_group_stop(struct task_struct *tsk, + struct signal_struct *sig, + int stop_count) +{ + if (stop_count) + return; + + sig->flags = SIGNAL_STOP_STOPPED; + + if (tsk->group_leader->exit_state) + tsk->group_leader->exit_code = sig->group_exit_code; +} + +/* * This performs the stopping for SIGSTOP and other stop signals. * We have to stop all threads in the thread group. * Returns nonzero if we've actually stopped and released the siglock. @@ -1696,8 +1717,7 @@ static int do_signal_stop(int signr) sig->group_stop_count = stop_count; } - if (stop_count == 0) - sig->flags = SIGNAL_STOP_STOPPED; + complete_group_stop(current, sig, stop_count); current->exit_code = sig->group_exit_code; __set_current_state(TASK_STOPPED); @@ -1933,9 +1953,8 @@ void exit_signals(struct task_struct *ts if (!signal_pending(t) && !(t->flags & PF_EXITING)) recalc_sigpending_and_wake(t); - if (unlikely(tsk->signal->group_stop_count) && - !--tsk->signal->group_stop_count) { - tsk->signal->flags = SIGNAL_STOP_STOPPED; + if (unlikely(tsk->signal->group_stop_count)) { + complete_group_stop(tsk, sig, --tsk->signal->group_stop_count); group_stop = 1; } out: ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-09 3:33 ` Roland McGrath @ 2009-02-09 4:52 ` Oleg Nesterov 2009-02-09 5:14 ` Oleg Nesterov 0 siblings, 1 reply; 20+ messages in thread From: Oleg Nesterov @ 2009-02-09 4:52 UTC (permalink / raw) To: Roland McGrath; +Cc: Kaz Kylheku, linux-kernel, Andrew Morton, Ulrich Drepper I am already sleep, will return tomorrow. Just a souple of quick notes... On 02/08, Roland McGrath wrote: > > I think the first problem is we'll never even get into wait_task_stopped(). > We'll be in wait_consider_task() on the group leader, which is EXIT_ZOMBIE. Yes sure. I meant, instead of just checking task_is_stopped_or_traced() in wait_consider_task(), we should do somthing like int wait_is_stopped(p, ptrace) { if (ptrace) // wait_task_stopped() will also check p->exit_code != 0 return task_is_stopped_or_traced(p); else // wait_task_stopped() will also check ->group_exit_code != 0 return !!(signal->flags & SIGNAL_STOP_STOPPED); } > In wait_task_zombie_leader(), it will have to take the siglock and try to > figure out if there is a completed group stop to report. > > > if we tracer: > > > > check ->state, eat ->exit_code > > Being the ptracer does not affect the delay_group_leader logic. > It just affects individual vs group stop reports. So the existing > code path is right for ptrace. > > > else: > > check SIGNAL_STOP_STOPPED, use ->group_exit_code > > We don't want wait to change group_exit_code. But we need the "reported as > stopped" tracking that wait_task_stopped() gets by clearing exit_code. I never understood this. Why do we mix the normal group stop with the ptrace "per-thread" stops? Look. We have the main thread M and the sub-thread T. We stop this process, its parent does do_wait() and clears M->exit_code. Now, ptracer can attach to T (it still has ->exit_code != 0), but not to M. This always looked very strange to me. Or. ptracer attaches to the main thread and (say) does nothing. We send SIGSTOP to another thread. The whole group stops, but its parent can't see this. Why? Then ptracer does PTRACE_DETACH, and the parent still can't (and will never can) see the stop unless the ptracer puts something reasonable into ->exit_code. But even in this case we lost the notofication. > So > I think what we need is to get the zombie group_leader->exit_code to be set > to ->group_exit_code as it would have been if the leader were alive and had > participated in the group stop. Please see below. > > $ sleep 100 > > ^Z > > [1]+ Stopped sleep 100 > > $ strace -p `pidof sleep` > > Process 11442 attached - interrupt to quit > > > > strace hangs in do_wait(). But after the fix strace will happily > > proceed. I can't know whether this behaviour change is bad or not. > > I think this would only happen if the "reported as stopped" bookkeeping I > mentioned above were broken. The "Stopped" line means that the shell just > did do_wait(WUNTRACED), so wait_task_stopped() cleared ->exit_code when > reporting it as stopped. Now strace does PTRACE_ATTACH and then a wait; > it can't see a fresh wait result here because ->exit_code is still zero. Yes. And why ptracer should wait? > 100% untested concept patch follows. it adds more special cases for the delay_group_leader() zombies :( > +static inline void complete_group_stop(struct task_struct *tsk, > + struct signal_struct *sig, > + int stop_count) > +{ > + if (stop_count) > + return; > + > + sig->flags = SIGNAL_STOP_STOPPED; > + > + if (tsk->group_leader->exit_state) > + tsk->group_leader->exit_code = sig->group_exit_code; > +} This doesn't look exactly right. Unless another thread does sys_exit_group() later (they all can exit via sys_exit), wait_task_zombie() may report this ->exit_code == SIGSTOP. Oleg. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: main thread pthread_exit/sys_exit bug! 2009-02-09 4:52 ` Oleg Nesterov @ 2009-02-09 5:14 ` Oleg Nesterov 0 siblings, 0 replies; 20+ messages in thread From: Oleg Nesterov @ 2009-02-09 5:14 UTC (permalink / raw) To: Roland McGrath; +Cc: Kaz Kylheku, linux-kernel, Andrew Morton, Ulrich Drepper On 02/09, Oleg Nesterov wrote: > > Yes sure. I meant, instead of just checking task_is_stopped_or_traced() in > wait_consider_task(), we should do somthing like In short, please see the "patch" below. I doubt it can be compiled, just for the illustration. Oleg. --- a/kernel/exit.c +++ b/kernel/exit.c @@ -1417,6 +1417,19 @@ static int wait_task_zombie(struct task_ return retval; } +static int *wait_xxx(struct task_struct *p, int ptrace) +{ + if (ptrace) { + if (task_is_stopped_or_traced(p)) + return &p->exit_code; + } else { + if (p->signal->flags & SIGNAL_STOPPED_STOPPED) + return &p->signal->group_exit_code; + } + + return NULL; +} + /* * Handle sys_wait4 work for one task in state TASK_STOPPED. We hold * read_lock(&tasklist_lock) on entry. If we return zero, we still hold @@ -1427,7 +1440,7 @@ static int wait_task_stopped(int ptrace, int options, struct siginfo __user *infop, int __user *stat_addr, struct rusage __user *ru) { - int retval, exit_code, why; + int retval, exit_code, *p_code, why; uid_t uid = 0; /* unneeded, required by compiler */ pid_t pid; @@ -1437,22 +1450,16 @@ static int wait_task_stopped(int ptrace, exit_code = 0; spin_lock_irq(&p->sighand->siglock); - if (unlikely(!task_is_stopped_or_traced(p))) - goto unlock_sig; - - if (!ptrace && p->signal->group_stop_count > 0) - /* - * A group stop is in progress and this is the group leader. - * We won't report until all threads have stopped. - */ + p_code = wait_xxx(p, ptrace); + if (unlikely(!p_code)) goto unlock_sig; - exit_code = p->exit_code; + exit_code = *p_code; if (!exit_code) goto unlock_sig; if (!unlikely(options & WNOWAIT)) - p->exit_code = 0; + *p_code = 0; /* don't need the RCU readlock here as we're holding a spinlock */ uid = __task_cred(p)->uid; @@ -1608,7 +1615,7 @@ static int wait_consider_task(struct tas */ *notask_error = 0; - if (task_is_stopped_or_traced(p)) + if (wait_xxx(p, ptrace)) return wait_task_stopped(ptrace, p, options, infop, stat_addr, ru); ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2009-02-09 5:17 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-02-01 22:32 main thread pthread_exit/sys_exit bug! Kaz Kylheku
[not found] ` <20090201174159.4a52e15c.akpm@linux-foundation.org>
2009-02-02 6:45 ` Oleg Nesterov
2009-02-02 7:10 ` Kaz Kylheku
2009-02-02 16:56 ` Oleg Nesterov
2009-02-02 20:10 ` Kaz Kylheku
2009-02-02 20:17 ` Ulrich Drepper
2009-02-02 20:39 ` Kaz Kylheku
2009-02-03 2:39 ` Kaz Kylheku
2009-02-03 13:33 ` Oleg Nesterov
2009-02-03 19:51 ` Kaz Kylheku
2009-02-03 21:32 ` Oleg Nesterov
2009-02-03 23:06 ` Kaz Kylheku
2009-02-05 3:05 ` Roland McGrath
2009-02-05 4:55 ` Kaz Kylheku
2009-02-05 16:15 ` Oleg Nesterov
2009-02-05 21:22 ` Roland McGrath
2009-02-05 23:22 ` Oleg Nesterov
2009-02-09 3:33 ` Roland McGrath
2009-02-09 4:52 ` Oleg Nesterov
2009-02-09 5:14 ` Oleg Nesterov
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox