* Problem with exiting threads under NPTL
@ 2003-12-14 5:25 Petr Vandrovec
2003-12-14 15:02 ` Martin Schlemmer
2003-12-14 19:38 ` [patch] " Ingo Molnar
0 siblings, 2 replies; 23+ messages in thread
From: Petr Vandrovec @ 2003-12-14 5:25 UTC (permalink / raw)
To: linux-kernel
Hi,
several times one of our threads ended up as ZOMBIE and
nobody wants to pick him up - even init ignores it. Inspection
of kernel structures revealed that task's exit code is 0,
exit_signal is -1, ptrace is 0 and state is 8 (ZOMBIE).
Quick look at exit_notify reveals that it should not happen:
if task has exit_signal == -1 and is not ptraced, we should immediate
move to TASK_DEAD and get rid of task as fast as possible.
But problem is that in do_notify_parent we have code which
sets exit_signal to -1 if parent ignores SIGCHLD, and we call
do_notify_parent() for group leader from release_task() when
last member of thread group exits, without taking any care
about eventual changes in exit_signal field.
So if some process ignores SIGCHLD, and spawns child process
which creates additional thread, and main thread in child exits
before child it created, you'll end up with immortal zombie.
Best regards,
Petr Vandrovec
vandrove@vc.cvut.cz
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
5 0 3709 1 15 0 0 0 exit Z tty3 0:00 [test] <defunct>
Creator of immortal zombies:
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
#include <stdio.h>
static void* thread(void* dummy) {
/* Make sure that we exit as second thread */
sleep(1);
return NULL;
}
void main(void) {
int pid;
signal(SIGCHLD, SIG_IGN);
pid = fork();
if (pid == 0) {
pthread_t p;
pthread_create(&p, NULL, thread, NULL);
} else {
/* Sleep some time so we know that child threads exit before us */
sleep(2);
printf("Look for task %d...\n", pid);
}
}
^ permalink raw reply [flat|nested] 23+ messages in thread* Re: Problem with exiting threads under NPTL 2003-12-14 5:25 Problem with exiting threads under NPTL Petr Vandrovec @ 2003-12-14 15:02 ` Martin Schlemmer 2003-12-14 19:38 ` [patch] " Ingo Molnar 1 sibling, 0 replies; 23+ messages in thread From: Martin Schlemmer @ 2003-12-14 15:02 UTC (permalink / raw) To: Petr Vandrovec; +Cc: Linux Kernel Mailing Lists [-- Attachment #1: Type: text/plain, Size: 693 bytes --] On Sun, 2003-12-14 at 07:25, Petr Vandrovec wrote: > Hi, > several times one of our threads ended up as ZOMBIE and > nobody wants to pick him up - even init ignores it. Inspection > of kernel structures revealed that task's exit code is 0, > exit_signal is -1, ptrace is 0 and state is 8 (ZOMBIE). > > So if some process ignores SIGCHLD, and spawns child process > which creates additional thread, and main thread in child exits > before child it created, you'll end up with immortal zombie. > I can confirm this behavior here, although I must admit I do not know if the sample code is legal. Latest glibc from cvs + bk kernel. Cheers, -- Martin Schlemmer [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* [patch] Re: Problem with exiting threads under NPTL 2003-12-14 5:25 Problem with exiting threads under NPTL Petr Vandrovec 2003-12-14 15:02 ` Martin Schlemmer @ 2003-12-14 19:38 ` Ingo Molnar 2003-12-14 20:38 ` Linus Torvalds 1 sibling, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2003-12-14 19:38 UTC (permalink / raw) To: Petr Vandrovec Cc: linux-kernel, Linus Torvalds, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Petr Vandrovec wrote: > several times one of our threads ended up as ZOMBIE and nobody wants > to pick him up - even init ignores it. Inspection of kernel structures > revealed that task's exit code is 0, exit_signal is -1, ptrace is 0 and > state is 8 (ZOMBIE). ok, we only recently started zapping SIGCHLD==ignore processes via the detached-thread method. Your testcase shows one hole in this mechanism: if a process that was created as a 'deatched process' itself creates a detached thread (NPTL thread), and the thread exits before the thread group leader, then nobody will release the leader. The patch below fixes this bug, your testcase now works fine on my box. I've tested both UP and SMP kernels. the code is a bit ugly, but it's necessary - a parent can decide _after_ starting the child that it wants to detach it. (by setting SIGCHLD to SIG_IGN. The testcase doesnt do this.) So the only place where we can detect the detached-ness of a process is in do_notify_parent(). Ingo --- linux/kernel/exit.c.orig +++ linux/kernel/exit.c @@ -50,6 +50,7 @@ static void __unhash_process(struct task void release_task(struct task_struct * p) { task_t *leader; + int zap_leader = 0; struct dentry *proc_dentry; BUG_ON(p->state < TASK_ZOMBIE); @@ -72,8 +73,16 @@ void release_task(struct task_struct * p */ leader = p->group_leader; if (leader != p && thread_group_empty(leader) && - leader->state == TASK_ZOMBIE && leader->exit_signal != -1) + leader->state == TASK_ZOMBIE && leader->exit_signal != -1) { do_notify_parent(leader, leader->exit_signal); + /* + * If we were the last child thread and the leader has + * exited already, and the leader's parent ignores SIGCHLD, + * then we are the one who should release the leader: + */ + if (leader->exit_signal == -1) + zap_leader = 1; + } p->parent->cutime += p->utime + p->cutime; p->parent->cstime += p->stime + p->cstime; @@ -88,6 +97,13 @@ void release_task(struct task_struct * p proc_pid_flush(proc_dentry); release_thread(p); put_task_struct(p); + + /* + * Do this outside the tasklist lock. The reference to the + * leader is safe: + */ + if (zap_leader) + release_task(leader); } /* we are using it only for SMP init */ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 19:38 ` [patch] " Ingo Molnar @ 2003-12-14 20:38 ` Linus Torvalds 2003-12-14 20:45 ` Linus Torvalds ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Linus Torvalds @ 2003-12-14 20:38 UTC (permalink / raw) To: Ingo Molnar Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Ingo Molnar wrote: > > the code is a bit ugly, but it's necessary - a parent can decide _after_ > starting the child that it wants to detach it. (by setting SIGCHLD to > SIG_IGN. The testcase doesnt do this.) So the only place where we can > detect the detached-ness of a process is in do_notify_parent(). Hmm.. What if "leader->exit_signal" was -1 already _before_ we call "do_notify_parent()"? In that case we'd never call "do_notify_parent()" for the leader at all, and we would also not release it outselves, the way you've done the test. Or is that case impossible to trigger? Looks a bit like that. But if it _is_ impossible to trigger (ie exit_signal cannot be -1 for a thread leader), then why does the current code test for "&& leader->exit_signal != -1)" at all? That code looks fragile as hell. I think you fixed a bug and it might be the absolutely proper fix, but I'd be happier about it if it was more obvious what the rules are and _why_ that is the only case that matters.. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 20:38 ` Linus Torvalds @ 2003-12-14 20:45 ` Linus Torvalds 2003-12-14 21:02 ` Ingo Molnar 2003-12-15 15:04 ` Jörn Engel 2003-12-14 21:06 ` Ingo Molnar 2003-12-15 23:06 ` Roland McGrath 2 siblings, 2 replies; 23+ messages in thread From: Linus Torvalds @ 2003-12-14 20:45 UTC (permalink / raw) To: Ingo Molnar Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Linus Torvalds wrote: > > That code looks fragile as hell. I think you fixed a bug and it might be > the absolutely proper fix, but I'd be happier about it if it was more > obvious what the rules are and _why_ that is the only case that matters.. Btw, on another note: to avoid the appearance of recursion, I'd prefer a p = leader; goto top; instead of a "release_task(leader);". I realize that the recursion should be just one deep (the leader of the leader is itself, and that will stop the thing from going further), but it looks trivial to avoid it, and any automated source checking tool would be confused by the apparent recursion. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 20:45 ` Linus Torvalds @ 2003-12-14 21:02 ` Ingo Molnar 2003-12-15 15:04 ` Jörn Engel 1 sibling, 0 replies; 23+ messages in thread From: Ingo Molnar @ 2003-12-14 21:02 UTC (permalink / raw) To: Linus Torvalds Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Linus Torvalds wrote: > Btw, on another note: to avoid the appearance of recursion, I'd prefer a > > p = leader; > goto top; > > instead of a "release_task(leader);". > > I realize that the recursion should be just one deep (the leader of the > leader is itself, and that will stop the thing from going further), but > it looks trivial to avoid it, and any automated source checking tool > would be confused by the apparent recursion. yeah. New (SMP-testbooted) patch attached. I also got rid of the zap_leader flag. (we can signal leader-zapping via the leader pointer.) Ingo --- linux/kernel/exit.c.orig +++ linux/kernel/exit.c @@ -51,7 +51,8 @@ void release_task(struct task_struct * p { task_t *leader; struct dentry *proc_dentry; - + +restart: BUG_ON(p->state < TASK_ZOMBIE); atomic_dec(&p->user->processes); @@ -72,8 +73,20 @@ void release_task(struct task_struct * p */ leader = p->group_leader; if (leader != p && thread_group_empty(leader) && - leader->state == TASK_ZOMBIE && leader->exit_signal != -1) + leader->state == TASK_ZOMBIE && leader->exit_signal != -1) { do_notify_parent(leader, leader->exit_signal); + /* + * If we were the last child thread and the leader has + * exited already, and the leader's parent ignores SIGCHLD, + * then we are the one who should release the leader. + * + * (non-NULL leader triggers the zapping of the leader at + * the end of this function.) + */ + if (leader->exit_signal != -1) + leader = NULL; + } else + leader = NULL; p->parent->cutime += p->utime + p->cutime; p->parent->cstime += p->stime + p->cstime; @@ -88,6 +101,16 @@ void release_task(struct task_struct * p proc_pid_flush(proc_dentry); release_thread(p); put_task_struct(p); + + /* + * Do this outside the tasklist lock. The reference to the + * leader is safe. There's no recursion possible, since + * the leader of the leader is itself: + */ + if (unlikely(leader)) { + p = leader; + goto restart; + } } /* we are using it only for SMP init */ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 20:45 ` Linus Torvalds 2003-12-14 21:02 ` Ingo Molnar @ 2003-12-15 15:04 ` Jörn Engel 1 sibling, 0 replies; 23+ messages in thread From: Jörn Engel @ 2003-12-15 15:04 UTC (permalink / raw) To: Linus Torvalds; +Cc: Kernel Mailing List, Andrew Morton On Sun, 14 December 2003 12:45:34 -0800, Linus Torvalds wrote: > > Btw, on another note: to avoid the appearance of recursion, I'd prefer a > > p = leader; > goto top; > > instead of a "release_task(leader);". > > I realize that the recursion should be just one deep (the leader of the > leader is itself, and that will stop the thing from going further), but it > looks trivial to avoid it, and any automated source checking tool would be > confused by the apparent recursion. Since you mentioned it - how would you prefer the asct (we need a better acronym) to detect recursion depth. Currently, I have those in a seperate file that should come with the kernel, maybe in Documentation/recursions or so. But how about this: /** * RECURSION: 2 * NAME: do_recurse */ void do_recurse(int recurse) { if (recurse) do_recurse(0); } Ok, the format is ugly, feel free to pick anything nicer. But explicitly stating the recursion depth right where it happens makes sense to me, as many human readers would like a similar comment anyway. Any opinion? Jörn -- Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it. -- Perlis's Programming Proverb #58, SIGPLAN Notices, Sept. 1982 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 20:38 ` Linus Torvalds 2003-12-14 20:45 ` Linus Torvalds @ 2003-12-14 21:06 ` Ingo Molnar 2003-12-14 22:10 ` Linus Torvalds 2003-12-15 23:06 ` Roland McGrath 2 siblings, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2003-12-14 21:06 UTC (permalink / raw) To: Linus Torvalds Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Linus Torvalds wrote: > That code looks fragile as hell. I think you fixed a bug and it might be > the absolutely proper fix, but I'd be happier about it if it was more > obvious what the rules are and _why_ that is the only case that > matters.. agreed, and we had a fair number of bugs in this area already. But i dont know whether (or how) it could be made simpler - i think most of the ugliness is a reflexion of the POSIX semantics. (combined with ptrace complexities.) Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 21:06 ` Ingo Molnar @ 2003-12-14 22:10 ` Linus Torvalds 2003-12-14 22:17 ` Ingo Molnar ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Linus Torvalds @ 2003-12-14 22:10 UTC (permalink / raw) To: Ingo Molnar Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath Ingo, I'm being anal for 2.6.0 again and trying to think of all the ramifications of the changes, and I found a serious bug in your patch: Even though the parent ignores SIGCHLD it _can_ be running on another CPU in "wait4()". And since we drop the tasklist lock before we do the "release_task()" on the leader, and since the leader is still marked TASK_ZOMBIE, we may have _two_ different people trying to release it. First the parent, and then the last thread that still remembers the leader. Note that we at this point have not unhashed the thread group leader yet: we just unhashed the last thread. So we're dropping the task list lock, and the leader pointer is _not_ safe to keep that way. We've had that race before, and the fix is to mark the leader TASK_DEAD while still under the tasklist lock - that way the parent won't see the soon-to-be-reaped leader even if it were to happen to be in the middle of a wait4() call and get in while the tasklist lock is released. So how about this patch? It has this bug fixed, and is otherwise a mix of your two patches, along with an added sanity check ("the exit_signal for a thread group leader cannot be -1 until the last thread exited"), which makes the test make a whole lot more sense in my opinion (the previous test for "exit_signal != -1" looks nonsensical, and appears to be a cut-and-paste from somewhere else). Roland, Petr, comments? Can you see any hole in my logic? Linus ---- --- 1.119/kernel/exit.c Mon Oct 27 11:49:59 2003 +++ edited/kernel/exit.c Sun Dec 14 14:00:10 2003 @@ -49,9 +49,11 @@ void release_task(struct task_struct * p) { + int zap_leader; task_t *leader; struct dentry *proc_dentry; - + +repeat: BUG_ON(p->state < TASK_ZOMBIE); atomic_dec(&p->user->processes); @@ -70,10 +72,24 @@ * group, and the leader is zombie, then notify the * group leader's parent process. (if it wants notification.) */ + zap_leader = 0; leader = p->group_leader; - if (leader != p && thread_group_empty(leader) && - leader->state == TASK_ZOMBIE && leader->exit_signal != -1) + if (leader != p && thread_group_empty(leader) && leader->state == TASK_ZOMBIE) { + BUG_ON(leader->exit_signal == -1); do_notify_parent(leader, leader->exit_signal); + /* + * If we were the last child thread and the leader has + * exited already, and the leader's parent ignores SIGCHLD, + * then we are the one who should release the leader. + * + * do_notify_parent() will have marked it self-reaping in + * that case. + */ + if (leader->exit_signal == -1) { + zap_leader = 1; + leader->state = TASK_DEAD; + } + } p->parent->cutime += p->utime + p->cutime; p->parent->cstime += p->stime + p->cstime; @@ -88,6 +104,10 @@ proc_pid_flush(proc_dentry); release_thread(p); put_task_struct(p); + + p = leader; + if (zap_leader) + goto repeat; } /* we are using it only for SMP init */ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 22:10 ` Linus Torvalds @ 2003-12-14 22:17 ` Ingo Molnar 2003-12-14 22:32 ` Linus Torvalds 2003-12-14 22:28 ` Ingo Molnar 2003-12-15 8:54 ` Arjan van de Ven 2 siblings, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2003-12-14 22:17 UTC (permalink / raw) To: Linus Torvalds Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Linus Torvalds wrote: > Even though the parent ignores SIGCHLD it _can_ be running on another > CPU in "wait4()". And since we drop the tasklist lock before we do the > "release_task()" on the leader, and since the leader is still marked > TASK_ZOMBIE, we may have _two_ different people trying to release it. > First the parent, and then the last thread that still remembers the > leader. are you sure this can happen? eligible_child() does this: if (p->exit_signal == -1 && !p->ptrace) return 0; so can the parallel situation really happen? Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 22:17 ` Ingo Molnar @ 2003-12-14 22:32 ` Linus Torvalds 2003-12-15 23:04 ` Roland McGrath 0 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2003-12-14 22:32 UTC (permalink / raw) To: Ingo Molnar Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Ingo Molnar wrote: > > are you sure this can happen? eligible_child() does this: Interesting. That code should have been enough, but we've later on added extra code into wait_task_zombie to check _exactly_ the same case because we saw problems. Hmm.. Maybe that was to protect against concurrent wait4()'ers (which can happen - the wait case only takes the read lock). Threads can wait for each others children, after all. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 22:32 ` Linus Torvalds @ 2003-12-15 23:04 ` Roland McGrath 0 siblings, 0 replies; 23+ messages in thread From: Roland McGrath @ 2003-12-15 23:04 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, Petr Vandrovec, Kernel Mailing List, Andrew Morton > On Sun, 14 Dec 2003, Ingo Molnar wrote: > > > > are you sure this can happen? eligible_child() does this: > > Interesting. That code should have been enough, but we've later on added > extra code into wait_task_zombie to check _exactly_ the same case because > we saw problems. > > Hmm.. Maybe that was to protect against concurrent wait4()'ers (which can > happen - the wait case only takes the read lock). Threads can wait for > each others children, after all. My recollection about the wait_task_* changes was that they were mostly to address races between concurrent wait4 calls. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 22:10 ` Linus Torvalds 2003-12-14 22:17 ` Ingo Molnar @ 2003-12-14 22:28 ` Ingo Molnar 2003-12-14 22:45 ` Linus Torvalds 2003-12-15 8:54 ` Arjan van de Ven 2 siblings, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2003-12-14 22:28 UTC (permalink / raw) To: Linus Torvalds Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath here's yet another variant of the patch. Changes relative to yours: - it sets zap_leader to 0 outside the critical section. - it doesnt set leader to TASK_DEAD, exit_signal == -1 should really protect it from sys_wait4() i believe. - it sets 'p = leader' within the unlikely branch - slightly faster common case. the patch testbooted fine here. Ingo --- linux/kernel/exit.c.orig +++ linux/kernel/exit.c @@ -50,8 +50,11 @@ static void __unhash_process(struct task void release_task(struct task_struct * p) { task_t *leader; + int zap_leader; struct dentry *proc_dentry; - + +repeat: + zap_leader = 0; BUG_ON(p->state < TASK_ZOMBIE); atomic_dec(&p->user->processes); @@ -72,8 +75,16 @@ void release_task(struct task_struct * p */ leader = p->group_leader; if (leader != p && thread_group_empty(leader) && - leader->state == TASK_ZOMBIE && leader->exit_signal != -1) + leader->state == TASK_ZOMBIE && leader->exit_signal != -1) { do_notify_parent(leader, leader->exit_signal); + /* + * If we were the last child thread and the leader has + * exited already, and the leader's parent ignores SIGCHLD, + * then we are the one who should release the leader. + */ + if (leader->exit_signal == -1) + zap_leader = 1; + } p->parent->cutime += p->utime + p->cutime; p->parent->cstime += p->stime + p->cstime; @@ -88,6 +99,16 @@ void release_task(struct task_struct * p proc_pid_flush(proc_dentry); release_thread(p); put_task_struct(p); + + /* + * Do this outside the tasklist lock. The reference to the + * leader is safe. There's no recursion possible, since + * the leader of the leader is itself: + */ + if (unlikely(zap_leader)) { + p = leader; + goto repeat; + } } /* we are using it only for SMP init */ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 22:28 ` Ingo Molnar @ 2003-12-14 22:45 ` Linus Torvalds 2003-12-14 23:08 ` Ingo Molnar 0 siblings, 1 reply; 23+ messages in thread From: Linus Torvalds @ 2003-12-14 22:45 UTC (permalink / raw) To: Ingo Molnar Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Ingo Molnar wrote: > > - it sets zap_leader to 0 outside the critical section. > > - it sets 'p = leader' within the unlikely branch - slightly faster common > case. Note that both of these micro-optimizations are equally likely to _hurt_. In particular, the bigger the liveness region of a variable is, the more it adds to register pressure, and the worse code gcc generates. Initializing variables further away from its use tends to just increase the likelihood that gcc will have to spill that (or another) variable to the stack. So setting variables earlier tends to actually pessimize code (yes, a perfect compiler wouldn't care, and just move the assignment later. But for such a perfect compiler it doesn't matter where you put it anyway). Rule of thumb: don't use variables "early" just to get one instruction out of a critical region. The cost of a critical region is not the variables inside of it, but the locking, and using variables early not only has the potential to screw the compiler, it makes the code less readable. Use the variable where they logically make sense. Similarly, straight-line unconditional code without any memory footprint (ie with only cached reads/writes) tends to be basically "free" on any modern CPU. In contrast, jumping around sure isn't (and gcc tends to optimize for the wrong things here). For this reason, don't do if (x) { a = ... } else { a = ...; } if one of the cases is trivial and 'a' is a local variable. It's usually better to use a = ...; if (x) { a = ... } and avoid the jumps. In fact, from a performance standpoint, it can often be better to use a = ...; tmp = ....; a = x ? tmp : a; because that allows the compiler to generate straight-line code with conditional moves - which it would _not_ be able to do if it decided that one of the subexpressions might cause faults or other unintended consequences. Basically: linearize the code. That's what modern CPU's are good at handling. It's ok to do a small number of extra "normal" instructions if you can avoid the nasty kind (jumps etc). Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 22:45 ` Linus Torvalds @ 2003-12-14 23:08 ` Ingo Molnar 2003-12-15 6:31 ` dan carpenter 2003-12-15 23:15 ` Roland McGrath 0 siblings, 2 replies; 23+ messages in thread From: Ingo Molnar @ 2003-12-14 23:08 UTC (permalink / raw) To: Linus Torvalds Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, Linus Torvalds wrote: > Note that both of these micro-optimizations are equally likely to > _hurt_. agreed. Patch without these changes reverted is attached. (this is pretty close to your last patch.) just one nit wrt. the first micro-optimization: > So setting variables earlier tends to actually pessimize code [...] in this specific case the critical section only sets the variable, and the lifetime of the variable necessiates a stackslot anyway, on x86. So moving the initialization outside the lock shouldnt pessimise the generated code, in fact it could even free up a register, due to the compiler being able to do: movl $1, 0x12(%esp) due to the clear register spill, instead of using up a register due to the initialization being too close to the setting to 1. But i agree, due to worse readability alone it is not worth having this change. And on non-x86 it could even result in a spilled register. (the nonlinear code at the end of the function was clearly a bad idea.) Ingo --- kernel/exit.c.orig +++ kernel/exit.c @@ -50,8 +50,10 @@ static void __unhash_process(struct task void release_task(struct task_struct * p) { task_t *leader; + int zap_leader; struct dentry *proc_dentry; - + +repeat: BUG_ON(p->state < TASK_ZOMBIE); atomic_dec(&p->user->processes); @@ -70,10 +72,22 @@ void release_task(struct task_struct * p * group, and the leader is zombie, then notify the * group leader's parent process. (if it wants notification.) */ + zap_leader = 0; leader = p->group_leader; - if (leader != p && thread_group_empty(leader) && - leader->state == TASK_ZOMBIE && leader->exit_signal != -1) + if (leader != p && thread_group_empty(leader) && leader->state == TASK_ZOMBIE) { + BUG_ON(leader->exit_signal == -1); do_notify_parent(leader, leader->exit_signal); + /* + * If we were the last child thread and the leader has + * exited already, and the leader's parent ignores SIGCHLD, + * then we are the one who should release the leader. + * + * do_notify_parent() will have marked it self-reaping in + * that case. + */ + if (leader->exit_signal == -1) + zap_leader = 1; + } p->parent->cutime += p->utime + p->cutime; p->parent->cstime += p->stime + p->cstime; @@ -88,6 +102,15 @@ void release_task(struct task_struct * p proc_pid_flush(proc_dentry); release_thread(p); put_task_struct(p); + + /* + * Do this outside the tasklist lock. The reference to the + * leader is safe. There's no recursion possible, since + * the leader of the leader is itself: + */ + p = leader; + if (zap_leader) + goto repeat; } /* we are using it only for SMP init */ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 23:08 ` Ingo Molnar @ 2003-12-15 6:31 ` dan carpenter 2003-12-15 11:43 ` Ingo Molnar 2003-12-15 15:11 ` Linus Torvalds 2003-12-15 23:15 ` Roland McGrath 1 sibling, 2 replies; 23+ messages in thread From: dan carpenter @ 2003-12-15 6:31 UTC (permalink / raw) To: Ingo Molnar, Linus Torvalds Cc: Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath I'm running the patch from BK and I still get unkillable zombies. I'm running strace on these processes and I don't know if that makes a difference. Basically, my program is a modified strace that calls the syscalls as well as printing it out... (I'm trying to test the kernel with bogus syscalls). When I hit ALT-SYSREQ-T most of the zombie processes look like this. msgctl09 Z 9FF99D73 0 1503 1465 1505 1500 (L-TLB) c9fc9f68 00000046 c88b49b0 9ff99d73 00000258 c81cea08 c9fc9f68 c012a65b c81ce9b0 00000011 9ff99d73 00000258 c88b49b0 c11e5bc0 000734f9 a150a7ac 00000258 c81ce9b0 cbfa7f38 c81cefc4 cbfe8850 00000000 c81ce9b0 c9fc9f90 Call Trace: [<c012a65b>] exit_notify+0x2eb/0x900 [<c012b08f>] do_exit+0x41f/0x5c0 [<c012b3d7>] do_group_exit+0x107/0x190 [<c010aa8f>] syscall_call+0x7/0xb There are some process that are stuck but not in zombie state that look like this. mknod09 T 00000001 0 1403 1 1420 1394 (NOTLB) c73adf80 00000082 c11e5bc0 00000001 00000001 c2e91c28 cbfa3f38 858434c0 00000256 c35529b0 c11e5bc0 00000ebe 85843808 c11e5bc0 000004c0 858731ea 00000256 c62c39b0 08053000 c73ac000 c73ac000 08053000 c73ac000 c73adfa4 Call Trace: [<c0131209>] ptrace_notify+0x49/0xf6 [<c015c4ac>] sys_brk+0x2c/0x130 [<c0110cf8>] do_syscall_trace+0x48/0x73 [<c010ab01>] syscall_trace_entry+0x11/0x24 The CPU usage is normal. thanks, dan carpenter ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-15 6:31 ` dan carpenter @ 2003-12-15 11:43 ` Ingo Molnar 2003-12-15 13:07 ` dan carpenter 2003-12-15 15:11 ` Linus Torvalds 1 sibling, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2003-12-15 11:43 UTC (permalink / raw) To: dan carpenter Cc: Linus Torvalds, Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, dan carpenter wrote: > [<c012a65b>] exit_notify+0x2eb/0x900 > [<c012b08f>] do_exit+0x41f/0x5c0 > [<c012b3d7>] do_group_exit+0x107/0x190 > [<c010aa8f>] syscall_call+0x7/0xb > > There are some process that are stuck but not in zombie state that look > like this. do they go away if you do a kill -CONT on them? Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-15 11:43 ` Ingo Molnar @ 2003-12-15 13:07 ` dan carpenter 0 siblings, 0 replies; 23+ messages in thread From: dan carpenter @ 2003-12-15 13:07 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Monday 15 December 2003 03:43 am, Ingo Molnar wrote: > > do they go away if you do a kill -CONT on them? > > Ingo Yes, they do. Thanks. regards, dan carpenter ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-15 6:31 ` dan carpenter 2003-12-15 11:43 ` Ingo Molnar @ 2003-12-15 15:11 ` Linus Torvalds 1 sibling, 0 replies; 23+ messages in thread From: Linus Torvalds @ 2003-12-15 15:11 UTC (permalink / raw) To: dan carpenter Cc: Ingo Molnar, Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath On Sun, 14 Dec 2003, dan carpenter wrote: > > I'm running the patch from BK and I still get unkillable zombies. Note that zombies are normal, and yes, they are _always_ unkillable. That's why they are called zombies ;) The only way to get rid of a zombie is to reap it with "wait()", or just have the parent die (at which point init will do so). Your case looks like the parent isn't waiting for the zombie, mostly because the parent is stopped: > mknod09 T 00000001 0 1403 1 1420 1394 (NOTLB) and that is easy to make happen with strace if you don't detach from the thing you are tracing. You can fix it up by just a "killall -CONT mknod09", and once the parent continues it will reap the zombie. Linus ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 23:08 ` Ingo Molnar 2003-12-15 6:31 ` dan carpenter @ 2003-12-15 23:15 ` Roland McGrath 1 sibling, 0 replies; 23+ messages in thread From: Roland McGrath @ 2003-12-15 23:15 UTC (permalink / raw) To: Ingo Molnar Cc: Linus Torvalds, Petr Vandrovec, Kernel Mailing List, Andrew Morton Ingo's final patch looks good to me. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 22:10 ` Linus Torvalds 2003-12-14 22:17 ` Ingo Molnar 2003-12-14 22:28 ` Ingo Molnar @ 2003-12-15 8:54 ` Arjan van de Ven 2003-12-15 22:55 ` Roland McGrath 2 siblings, 1 reply; 23+ messages in thread From: Arjan van de Ven @ 2003-12-15 8:54 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, Petr Vandrovec, Kernel Mailing List, Andrew Morton, Roland McGrath [-- Attachment #1: Type: text/plain, Size: 262 bytes --] On Sun, 2003-12-14 at 23:10, Linus Torvalds wrote: > Even though the parent ignores SIGCHLD it _can_ be running on another CPU > in "wait4()". which fwiw is a case of illegal behavior in the program ... of course the kernel shouldn't die if it happens. [-- Attachment #2: This is a digitally signed message part --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-15 8:54 ` Arjan van de Ven @ 2003-12-15 22:55 ` Roland McGrath 0 siblings, 0 replies; 23+ messages in thread From: Roland McGrath @ 2003-12-15 22:55 UTC (permalink / raw) To: arjanv Cc: Linus Torvalds, Ingo Molnar, Petr Vandrovec, Kernel Mailing List, Andrew Morton > On Sun, 2003-12-14 at 23:10, Linus Torvalds wrote: > > > Even though the parent ignores SIGCHLD it _can_ be running on another CPU > > in "wait4()". > > which fwiw is a case of illegal behavior in the program ... of course > the kernel shouldn't die if it happens. No, it is legal to call wait* when ignoring SIGCHLD--and it is required to return ECHILD for the dead ones. For example, using waitpid with WNOHANG is a valid way to poll for the liveness of a child (though there is no good reason why an application wouldn't just use kill(,0) for that). It's not a method that has anything to recommend it, but it is perfectly valid and the range of permissible results is well-specified. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [patch] Re: Problem with exiting threads under NPTL 2003-12-14 20:38 ` Linus Torvalds 2003-12-14 20:45 ` Linus Torvalds 2003-12-14 21:06 ` Ingo Molnar @ 2003-12-15 23:06 ` Roland McGrath 2 siblings, 0 replies; 23+ messages in thread From: Roland McGrath @ 2003-12-15 23:06 UTC (permalink / raw) To: Linus Torvalds Cc: Ingo Molnar, Petr Vandrovec, Kernel Mailing List, Andrew Morton > Or is that case impossible to trigger? Looks a bit like that. But if it > _is_ impossible to trigger (ie exit_signal cannot be -1 for a thread > leader), then why does the current code test for "&& leader->exit_signal > != -1)" at all? I think that most of the logic testing exit_signal was written when CLONE_DETACHED could be used independent of CLONE_THREAD. We concluded that was unworkable due to some related races and changed clone so that the situation of a (live) thread group leader with exit_signal == -1 is now impossible. ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2003-12-15 23:15 UTC | newest] Thread overview: 23+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2003-12-14 5:25 Problem with exiting threads under NPTL Petr Vandrovec 2003-12-14 15:02 ` Martin Schlemmer 2003-12-14 19:38 ` [patch] " Ingo Molnar 2003-12-14 20:38 ` Linus Torvalds 2003-12-14 20:45 ` Linus Torvalds 2003-12-14 21:02 ` Ingo Molnar 2003-12-15 15:04 ` Jörn Engel 2003-12-14 21:06 ` Ingo Molnar 2003-12-14 22:10 ` Linus Torvalds 2003-12-14 22:17 ` Ingo Molnar 2003-12-14 22:32 ` Linus Torvalds 2003-12-15 23:04 ` Roland McGrath 2003-12-14 22:28 ` Ingo Molnar 2003-12-14 22:45 ` Linus Torvalds 2003-12-14 23:08 ` Ingo Molnar 2003-12-15 6:31 ` dan carpenter 2003-12-15 11:43 ` Ingo Molnar 2003-12-15 13:07 ` dan carpenter 2003-12-15 15:11 ` Linus Torvalds 2003-12-15 23:15 ` Roland McGrath 2003-12-15 8:54 ` Arjan van de Ven 2003-12-15 22:55 ` Roland McGrath 2003-12-15 23:06 ` Roland McGrath
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox