From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753502AbZBCVf2 (ORCPT ); Tue, 3 Feb 2009 16:35:28 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751391AbZBCVfT (ORCPT ); Tue, 3 Feb 2009 16:35:19 -0500
Received: from mx2.redhat.com ([66.187.237.31]:52421 "EHLO mx2.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750866AbZBCVfT (ORCPT ); Tue, 3 Feb 2009 16:35:19 -0500
Date: Tue, 3 Feb 2009 22:32:44 +0100
From: Oleg Nesterov
To: Kaz Kylheku
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Roland McGrath
Subject: Re: main thread pthread_exit/sys_exit bug!
Message-ID: <20090203213244.GA29040@redhat.com>
References: <3f43f78b0902011432y354c1b35m8f645640433f7b49@mail.gmail.com>
	<20090201174159.4a52e15c.akpm@linux-foundation.org>
	<20090202064509.GA20237@redhat.com>
	<3f43f78b0902012310p46186417m66873f410b948fd3@mail.gmail.com>
	<20090202165606.GA13346@redhat.com>
	<498754EF.8090604@redhat.com>
	<3f43f78b0902021239s21566f76hf7f59850b2dbf45a@mail.gmail.com>
	<3f43f78b0902021839j1eb1eb04u49be47277c99900d@mail.gmail.com>
	<20090203133313.GA5679@redhat.com>
	<3f43f78b0902031151ta841190i2c7898facc34cb95@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3f43f78b0902031151ta841190i2c7898facc34cb95@mail.gmail.com>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/03, Kaz Kylheku wrote:
>
> Well, it doesn't bother me that that has to be thrown out.
> In fact, I do not agree with the requirement that the thread
> which calls pthread_exit must not respond to signals;
> the original patch works for me.

What about other users? We can't know how much they depend
on the current behaviour.

> I.e. in my embedded GNU/Linux distro, that requirement
> doesn't exist. And since I can't find it in the Single
> Unix Specification, so much for that!
>
> Nothing in the spec says that once pthread_exit is called,
> signals are stopped. This function invokes cleanup handling,
> and thread-specific-storage destruction. During any of those
> tasks, signals can still be happening. Any of those
> tasks can easily enter into an indefinite wait. What if
> a cleanup handler performs a blocking RPC to a remote
> server? Well, there you are, stuck in pthread_exit,
> handling signals, and not cleaning up your robust list, etc.
>
> I also don't require robust locks to be cleaned up
> instantly if they are owned by a main thread that has
> called pthread_exit.

OK, OK. Please forget about signals, futexes, etc. Simple program:

	#include <pthread.h>

	pthread_t main_thread;

	void *tfunc(void *a)
	{
		/* wait for the main thread (the group leader) to exit */
		pthread_join(main_thread, NULL);
		return NULL;
	}

	int main(void)
	{
		pthread_t thr;

		main_thread = pthread_self();
		pthread_create(&thr, NULL, tfunc, NULL);
		pthread_exit(NULL);
	}

I bet this will hang with your patch applied. Because we depend on
sys_futex(->clear_child_tid, FUTEX_WAKE, ...).

Kaz, you know, it is not easy to say "your patch is wrong in any case,
no matter how much it will be improved" ;)

But even if the current behaviour is not optimal, we must not change
it unless we think it leads to bugs. We can't know which application
can suffer. The current behaviour is old.

> Face it, allowing the thread leader to exit is as wrong as doing
> other stupid things to the leader, like unsharing the signal
> handler.

Perhaps. That is why I said _something_ like your patch perhaps makes
sense. But this is tricky, and I don't see a simple/clean way to
improve things. And, otoh, I do not see _real_ problems with the
zombie leaders.

As for the original problem, it should be fixed anyway.
wait_task_stopped() should take SIGNAL_STOP_STOPPED into account,
not task->state. Unless we are the ptracer, of course.

Oleg.