* [PATCH] zap_other_threads() detaches thread group leader @ 2003-08-09 19:17 Matt Wilson 2003-08-12 7:40 ` Roland McGrath 0 siblings, 1 reply; 12+ messages in thread From: Matt Wilson @ 2003-08-09 19:17 UTC (permalink / raw) To: Linus Torvalds Cc: linux-kernel, Ingo Molnar, Roland McGrath, manfred, Andrew Morton Hi. The change to detach the threads in zap_other_threads() broke the case where the non-thread-group-leader is the cause of de_thread(). In this case the group leader will be detached and freed before switch_exec_pids() is complete and invalid data will be used. This is a patch that makes sure that the group leader does not get detached and reaped. Thought that Ingo or Roland submitted this for me, but since it's still not in test3 I figured I would send it myself. Please CC: me on replies. Cheers, Matt msw@redhat.com --- linux-2.6.0-test3/kernel/signal.c.zap 2003-08-09 14:58:50.000000000 -0400 +++ linux-2.6.0-test3/kernel/signal.c 2003-08-09 14:59:42.000000000 -0400 @@ -1011,9 +1011,11 @@ void zap_other_threads(struct task_struc * killed as part of a thread group due to another * thread doing an execve() or similar. So set the * exit signal to -1 to allow immediate reaping of - * the process. + * the process. But don't detach the thread group + * leader. */ - t->exit_signal = -1; + if (t != p->group_leader) + t->exit_signal = -1; sigaddset(&t->pending.signal, SIGKILL); rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending); ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] zap_other_threads() detaches thread group leader 2003-08-09 19:17 [PATCH] zap_other_threads() detaches thread group leader Matt Wilson @ 2003-08-12 7:40 ` Roland McGrath 2003-08-12 7:52 ` [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED Roland McGrath 0 siblings, 1 reply; 12+ messages in thread From: Roland McGrath @ 2003-08-12 7:40 UTC (permalink / raw) To: Matt Wilson Cc: Linus Torvalds, linux-kernel, Ingo Molnar, Andrew Morton, Jeremy Fitzhardinge Because of other sticky issues still unresolved and the lack of apparent real need to support the troublesome case, at last check Linus was leaning towards disallowing CLONE_THREAD&~CLONE_DETACHED in copy_process so that the exit_signal = -1 change in zap_other_threads is no longer apropos and can be taken back out entirely. Unless that is done, the != group_leader check most certainly has to go in. Thanks, Roland ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-12 7:40 ` Roland McGrath @ 2003-08-12 7:52 ` Roland McGrath 2003-08-13 7:11 ` Jeremy Fitzhardinge 2003-08-14 17:32 ` Linus Torvalds 0 siblings, 2 replies; 12+ messages in thread From: Roland McGrath @ 2003-08-12 7:52 UTC (permalink / raw) To: Linus Torvalds, Andrew Morton Cc: Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge Please apply this patch to get us back out of this useless quagmire and disallow the problematic case that noone wants to try to use any more. Thanks, Roland --- linux/kernel/fork.c 18 Jul 2003 16:27:39 -0000 1.142 +++ linux/kernel/fork.c 12 Aug 2003 06:20:05 -0000 @@ -744,7 +749,8 @@ struct task_struct *copy_process(unsigne * Thread groups must share signals as well, and detached threads * can only be started up within the thread group. */ - if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND)) + if ((clone_flags & CLONE_THREAD) && + (clone_flags & (CLONE_SIGHAND|CLONE_DETACHED)) != (CLONE_SIGHAND|CLONE_DETACHED)) return ERR_PTR(-EINVAL); if ((clone_flags & CLONE_DETACHED) && !(clone_flags & CLONE_THREAD)) return ERR_PTR(-EINVAL); --- linux/kernel/signal.c 7 Aug 2003 20:06:12 -0000 1.98 +++ linux/kernel/signal.c 12 Aug 2003 06:18:34 -0000 @@ -1005,16 +1016,6 @@ void zap_other_threads(struct task_struc */ if (t->state & (TASK_ZOMBIE|TASK_DEAD)) continue; - - /* - * We don't want to notify the parent, since we are - * killed as part of a thread group due to another - * thread doing an execve() or similar. So set the - * exit signal to -1 to allow immediate reaping of - * the process. - */ - t->exit_signal = -1; - sigaddset(&t->pending.signal, SIGKILL); rm_from_queue(SIG_KERNEL_STOP_MASK, &t->pending); signal_wake_up(t, 1); ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-12 7:52 ` [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED Roland McGrath @ 2003-08-13 7:11 ` Jeremy Fitzhardinge 2003-08-13 9:37 ` Roland McGrath 2003-08-14 17:32 ` Linus Torvalds 1 sibling, 1 reply; 12+ messages in thread From: Jeremy Fitzhardinge @ 2003-08-13 7:11 UTC (permalink / raw) To: Roland McGrath Cc: Linus Torvalds, Andrew Morton, Matt Wilson, Linux Kernel List, Ingo Molnar On Tue, 2003-08-12 at 00:52, Roland McGrath wrote: > Please apply this patch to get us back out of this useless quagmire and > disallow the problematic case that noone wants to try to use any more. It seems to me there's a couple of problems with this patch: One is that it prevents any single piece of code using clone(CLONE_THREAD) from working on both 2.4 and 2.6. While clone() is mostly hidden under libpthread.so, there are some applications which use it directly for various good reasons. I've implemented two versions of my clone/wait stuff in Valgrind to cope with this, but not everyone will have yet. Perhaps I'm the only person in the world to use CLONE_THREAD, but it seems unlikely. After all, clone() is a public interface. The other reason is that this seems to be covering over an actual conceptual problem rather than fixing it. I can't say I really understand this piece of the kernel (which isn't too surprising that it is very complicated because it represents 30 years of Unix history packed into a dense mass), but one niggling concern I have is that even when you're using CLONE_DETACHED, if you've attached with ptrace(), you effectively become a parent who can wait on the detached clone. If you can wait via ptrace, then doesn't it mean that all the wait-related problems are still visible? I've tried to reproduce some of the problems I've been seeing with non-detached threads in the case where they are detached traced threads, but so far it seems OK. But that may be because I haven't hit it properly yet. I guess my concern is that I'm not convinced we really understand what's going wrong here, and so I'm not convinced that this patch really fixes things. J ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-13 7:11 ` Jeremy Fitzhardinge @ 2003-08-13 9:37 ` Roland McGrath 0 siblings, 0 replies; 12+ messages in thread From: Roland McGrath @ 2003-08-13 9:37 UTC (permalink / raw) To: Jeremy Fitzhardinge Cc: Linus Torvalds, Andrew Morton, Matt Wilson, Linux Kernel List, Ingo Molnar > One is that it prevents any single piece of code using > clone(CLONE_THREAD) from working on both 2.4 and 2.6. I am not aware of what uses have been made of CLONE_THREAD in vanilla 2.4. The semantics are so drastically different between 2.4 and 2.6 that it seems useful to me that anything expecting the behavior you get from CLONE_THREAD in 2.4 have some indication that the world is different now. > Perhaps I'm the only person in the world to use CLONE_THREAD, but it > seems unlikely. After all, clone() is a public interface. Whoever is using it needs to cope with the many differences in the way that CLONE_THREAD threads behave in 2.6 vs 2.4. > I guess my concern is that I'm not convinced we really understand what's > going wrong here, and so I'm not convinced that this patch really fixes > things. Every problem that I have personally seen or thought of, we understand and have addressed in some fashion (the last of them by punting so it's ruled out). I am not making any special claims about the code--there certainly could be more problems. But there is not a failure mode that I have either seen in a reproducible test case or seen postulated in our discussion that we have not elucidated fully. I can certainly accept that you have seen some failures that are not specifically explained, but as I understand it there aren't any that you are seeing now or that you could reproduce that aren't addressed by the changes we've discussed before now. If there are specific failures you can reproduce that we have not already addressed, I would certainly like to see them. When the last potential problem was identified, it seemed sufficiently sticky to address properly that the inclination was to punt. This seemed appropriate given that the only user to complain about the subject (you) had said he'd punted on using the problematical feature combination (CLONE_THREAD without CLONE_DETACHED). I wouldn't claim that the last unresolved problem is intractable (in fact, I already proposed a hacky solution that might be adequate) or that there necessarily are (or aren't) more problems in the whole area of exit handling. I don't know of other problems, and if people in actual fact care about using CLONE_THREAD without CLONE_DETACHED in 2.6, then it is worth someone's time and effort to address the remaining known problem and field any newly discovered ones rather than punting. If there are in fact other mysterious problems with exit handling unrelated to the CLONE_THREAD|SIGCHLD style usage as you seem to be alluding to, certainly we should track them down. I honestly don't know any specific reasons to suspect any such things. Vague concern about ptrace seems like a red herring to me, unless you can cite some specific oddness actually happening to a real kernel. All the exit handling code has cases for ptrace specifically and does things differently. If you are talking specifically about the problem we discussed most recently, de_thread blocking when there are zombies in the thread group: indeed, I think it will block when there are ptrace'd zombies and the exec will not complete until the tracer reaps them all. That could be called a semantic of ptrace, rather than a bug. Otherwise, a debugger wouldn't necessarily get reports for every traced thread and would need to know by other means that an exec had happened and implicitly reaped them (not really hard to deal with, but perhaps an unnecessary complication of the ptrace thread tracing interface). Thanks, Roland ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-12 7:52 ` [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED Roland McGrath 2003-08-13 7:11 ` Jeremy Fitzhardinge @ 2003-08-14 17:32 ` Linus Torvalds 2003-08-14 17:53 ` Jamie Lokier 2003-08-14 20:56 ` Roland McGrath 1 sibling, 2 replies; 12+ messages in thread From: Linus Torvalds @ 2003-08-14 17:32 UTC (permalink / raw) To: Roland McGrath Cc: Andrew Morton, Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge On Tue, 12 Aug 2003, Roland McGrath wrote: > > Please apply this patch to get us back out of this useless quagmire and > disallow the problematic case that noone wants to try to use any more. > > - if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND)) > + if ((clone_flags & CLONE_THREAD) && > + (clone_flags & (CLONE_SIGHAND|CLONE_DETACHED)) != (CLONE_SIGHAND|CLONE_DETACHED)) > return ERR_PTR(-EINVAL); > if ((clone_flags & CLONE_DETACHED) && !(clone_flags & CLONE_THREAD)) Look at the condition that follows: it disallows CLONE_DETACHED for non-threads. Which really means that together with the first change, CLONE_DETACHED is always the same thing as CLONE_THREAD: you can't have one without the other. Which means that they are logically not separate bits any more, and we should just get rid of CLONE_DETACHED altogether, and use CLONE_THREAD in all cases where it is tested for. I'd really prefer not to keep a bit around that has to mean the same thing as another bit - that way just lies madness. So I'll document CLONE_DETACHED as being a no-op, and change the _one_ place that used it to just use CLONE_THREAD instead. Linus ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-14 17:32 ` Linus Torvalds @ 2003-08-14 17:53 ` Jamie Lokier 2003-08-14 18:03 ` Linus Torvalds 2003-08-14 20:56 ` Roland McGrath 1 sibling, 1 reply; 12+ messages in thread From: Jamie Lokier @ 2003-08-14 17:53 UTC (permalink / raw) To: Linus Torvalds Cc: Roland McGrath, Andrew Morton, Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge Linus Torvalds wrote: > I'd really prefer not to keep a bit around that has to mean the same thing > as another bit - that way just lies madness. So I'll document > CLONE_DETACHED as being a no-op, and change the _one_ place that used it > to just use CLONE_THREAD instead. Don't forget to mention that software that may be run on 2.5 kernels needs to set both bits, else won't work as expected. -- Jamie ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-14 17:53 ` Jamie Lokier @ 2003-08-14 18:03 ` Linus Torvalds 2003-08-14 18:27 ` Jamie Lokier 0 siblings, 1 reply; 12+ messages in thread From: Linus Torvalds @ 2003-08-14 18:03 UTC (permalink / raw) To: Jamie Lokier Cc: Roland McGrath, Andrew Morton, Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge On Thu, 14 Aug 2003, Jamie Lokier wrote: > > Don't forget to mention that software that may be run on 2.5 kernels > needs to set both bits, else won't work as expected. Well, the CLONE_DETACHED without CLONE_THREAD case was never legal, and my current patch will actually warn about the newly disallowed case too. I've not gotten any warnings with RH-9, and I don't think anybody else has a new enough glibc to even use CLONE_THREAD at all. But yes, there will be a warning, at least for a time (and eventually we'll just return -EINVAL silently - ie the program will _fail_ the clone(), it won't just act strangely). Linus ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-14 18:03 ` Linus Torvalds @ 2003-08-14 18:27 ` Jamie Lokier 2003-08-14 18:33 ` Linus Torvalds 0 siblings, 1 reply; 12+ messages in thread From: Jamie Lokier @ 2003-08-14 18:27 UTC (permalink / raw) To: Linus Torvalds Cc: Roland McGrath, Andrew Morton, Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge Linus Torvalds wrote: > > Don't forget to mention that software that may be run on 2.5 kernels > > needs to set both bits, else won't work as expected. > > Well, the CLONE_DETACHED without CLONE_THREAD case was never legal, and my > current patch will actually warn about the newly disallowed case too. I've > not gotten any warnings with RH-9, and I don't think anybody else has a > new enough glibc to even use CLONE_THREAD at all. I'm thinking of programs that don't use Glibc, but do use these features. Perhaps I'm the only person who writes such code :) > But yes, there will be a warning, at least for a time (and eventually > we'll just return -EINVAL silently - ie the program will _fail_ the > clone(), it won't just act strangely). -EINVAL would be great. -- Jamie ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-14 18:27 ` Jamie Lokier @ 2003-08-14 18:33 ` Linus Torvalds 2003-08-14 18:38 ` Jamie Lokier 0 siblings, 1 reply; 12+ messages in thread From: Linus Torvalds @ 2003-08-14 18:33 UTC (permalink / raw) To: Jamie Lokier Cc: Roland McGrath, Andrew Morton, Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge On Thu, 14 Aug 2003, Jamie Lokier wrote: > > I'm thinking of programs that don't use Glibc, but do use these features. > Perhaps I'm the only person who writes such code :) Sure, there have always been projects out there that have used the "native" clone() interfaces, but they're fairly rare, and CLONE_THREAD being a new addition makes it less likely to show up as a problem. > > But yes, there will be a warning, at least for a time (and eventually > > we'll just return -EINVAL silently - ie the program will _fail_ the > > clone(), it won't just act strangely). > > -EINVAL would be great. I've pushed the change to the BK trees, and if you're not a BK user you can just test the attached patch directly to see if it affects you.. Linus ---- # This is a BitKeeper generated patch for the following project: # Project Name: Linux kernel tree # This patch format is intended for GNU patch command version 2.5 or higher. # This patch includes the following deltas: # ChangeSet 1.1149 -> 1.1150 # include/linux/sched.h 1.159 -> 1.160 # kernel/fork.c 1.134 -> 1.135 # # The following is the BitKeeper ChangeSet Log # -------------------------------------------- # 03/08/14 torvalds@home.osdl.org 1.1150 # Mark CLONE_DETACHED as being irrelevant: it must match CLONE_THREAD. # # CLONE_THREAD without CLONE_DETACHED will now return -EINVAL, and # for a while we will warn about anything that uses it (there are no # known users, but this will help pinpoint any problems if somebody # used to care about the invalid combination). # -------------------------------------------- # diff -Nru a/include/linux/sched.h b/include/linux/sched.h --- a/include/linux/sched.h Thu Aug 14 11:32:19 2003 +++ b/include/linux/sched.h Thu Aug 14 11:32:19 2003 @@ -49,7 +49,7 @@ #define CLONE_SETTLS 0x00080000 /* create a new TLS for the child */ #define CLONE_PARENT_SETTID 0x00100000 /* set the TID in the parent */ #define CLONE_CHILD_CLEARTID 0x00200000 /* clear the TID in the child */ -#define CLONE_DETACHED 0x00400000 /* parent wants no child-exit signal */ +#define CLONE_DETACHED 0x00400000 /* Not used - CLONE_THREAD implies detached uniquely */ #define CLONE_UNTRACED 0x00800000 /* set if the tracing process can't force CLONE_PTRACE on this clone */ #define CLONE_CHILD_SETTID 0x01000000 /* set the TID in the child */ #define CLONE_STOPPED 0x02000000 /* Start in stopped state */ diff -Nru a/kernel/fork.c b/kernel/fork.c --- a/kernel/fork.c Thu Aug 14 11:32:19 2003 +++ b/kernel/fork.c Thu Aug 14 11:32:19 2003 @@ -746,8 +746,22 @@ */ if ((clone_flags & CLONE_THREAD) && !(clone_flags & CLONE_SIGHAND)) return ERR_PTR(-EINVAL); - if ((clone_flags & CLONE_DETACHED) && !(clone_flags & CLONE_THREAD)) + + /* + * CLONE_DETACHED must match CLONE_THREAD: it's a historical + * thing. + */ + if (!(clone_flags & CLONE_DETACHED) != !(clone_flags & CLONE_THREAD)) { + /* Warn about the old no longer supported case so that we see it */ + if (clone_flags & CLONE_THREAD) { + static int count; + if (count < 5) { + count++; + printk(KERN_WARNING "%s trying to use CLONE_THREAD without CLONE_DETACH\n", current->comm); + } + } return ERR_PTR(-EINVAL); + } retval = security_task_create(clone_flags); if (retval) @@ -877,10 +891,7 @@ p->parent_exec_id = p->self_exec_id; /* ok, now we should be set up.. */ - if (clone_flags & CLONE_DETACHED) - p->exit_signal = -1; - else - p->exit_signal = clone_flags & CSIGNAL; + p->exit_signal = (clone_flags & CLONE_THREAD) ? -1 : (clone_flags & CSIGNAL); p->pdeath_signal = 0; /* ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-14 18:33 ` Linus Torvalds @ 2003-08-14 18:38 ` Jamie Lokier 0 siblings, 0 replies; 12+ messages in thread From: Jamie Lokier @ 2003-08-14 18:38 UTC (permalink / raw) To: Linus Torvalds Cc: Roland McGrath, Andrew Morton, Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge Linus Torvalds wrote: > I've pushed the change to the BK trees, and if you're not a BK user you > can just test the attached patch directly to see if it affects you.. That's cool, I am not affected. Simply the documentation you mentioned, I felt someone who writes code using CLONE_THREAD in future should be aware that if they just use CLONE_THREAD by itself, their code won't work as expected with the later 2.5 kernels. The EINVAL ensures that perfectly. So, :) -- Jamie ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED 2003-08-14 17:32 ` Linus Torvalds 2003-08-14 17:53 ` Jamie Lokier @ 2003-08-14 20:56 ` Roland McGrath 1 sibling, 0 replies; 12+ messages in thread From: Roland McGrath @ 2003-08-14 20:56 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Matt Wilson, linux-kernel, Ingo Molnar, Jeremy Fitzhardinge > Which means that they are logically not separate bits any more, and we > should just get rid of CLONE_DETACHED altogether, and use CLONE_THREAD in > all cases where it is tested for. That is certainly fine by me. Thanks, Roland ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2003-08-14 20:56 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2003-08-09 19:17 [PATCH] zap_other_threads() detaches thread group leader Matt Wilson 2003-08-12 7:40 ` Roland McGrath 2003-08-12 7:52 ` [PATCH] revert zap_other_threads breakage, disallow CLONE_THREAD without CLONE_DETACHED Roland McGrath 2003-08-13 7:11 ` Jeremy Fitzhardinge 2003-08-13 9:37 ` Roland McGrath 2003-08-14 17:32 ` Linus Torvalds 2003-08-14 17:53 ` Jamie Lokier 2003-08-14 18:03 ` Linus Torvalds 2003-08-14 18:27 ` Jamie Lokier 2003-08-14 18:33 ` Linus Torvalds 2003-08-14 18:38 ` Jamie Lokier 2003-08-14 20:56 ` Roland McGrath
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox