From mboxrd@z Thu Jan 1 00:00:00 1970 Content-Type: multipart/mixed; boundary="===============8615787312279447014==" MIME-Version: 1.0 From: Oleg Nesterov To: lkp@lists.01.org Subject: Re: kthread_stop insanity (Re: [[DEBUG] force] 2642458962: BUG: unable to handle kernel paging request at ffffc90000997f18) Date: Mon, 27 Jun 2016 16:54:44 +0200 Message-ID: <20160627145443.GA17145@redhat.com> In-Reply-To: List-Id: --===============8615787312279447014== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable On 06/26, Andy Lutomirski wrote: > > kthread_stop is *sick*. > > struct kthread self; > > ... > > current->vfork_done =3D &self.exited; > > ... > > do_exit(ret); > > And then some other thread goes and waits for the completion, which is > *on the stack*, which, in any sane world (e.g. with my series > applied), is long gone by then. Yes, I forgot this when we discussed the problems with ti->flags/etc... > But this is broken even without any changes: since when is gcc > guaranteed to preserve the stack contents when a function ends with a > sibling call, let alone with a __noreturn call? I don't know if gcc can actually drop the stack frame in this case, but even if it can this looks fixeable. > Is there seriously no way to directly wait for a struct task_struct to > exit? Could we, say, kmalloc the completion (or maybe even the whole > struct kthread) and (ick!) hang it off ->vfork_done? Sure we can... And yes, I think we need to alloc the whole struct kthread. Just another (unfortunate) complication, the current code is simple. And probably kthread/kthread_stop should switch to task_work_exit(). Oleg. --===============8615787312279447014==-- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751961AbcF0OyK (ORCPT ); Mon, 27 Jun 2016 10:54:10 -0400 Received: from mx1.redhat.com ([209.132.183.28]:51307 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751860AbcF0OyH (ORCPT ); Mon, 27 Jun 2016 10:54:07 -0400 Date: Mon, 27 Jun 2016 16:54:44 +0200 From: Oleg Nesterov To: Andy Lutomirski Cc: Linus Torvalds , Peter Zijlstra , Tejun Heo , LKP , LKML , kernel test robot Subject: Re: kthread_stop insanity (Re: [[DEBUG] force] 2642458962: BUG: unable to handle kernel paging request at ffffc90000997f18) Message-ID: <20160627145443.GA17145@redhat.com> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.18 (2008-05-17) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Mon, 27 Jun 2016 14:54:06 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/26, Andy Lutomirski wrote: > > kthread_stop is *sick*. > > struct kthread self; > > ... > > current->vfork_done = &self.exited; > > ... > > do_exit(ret); > > And then some other thread goes and waits for the completion, which is > *on the stack*, which, in any sane world (e.g. with my series > applied), is long gone by then. Yes, I forgot this when we discussed the problems with ti->flags/etc... > But this is broken even without any changes: since when is gcc > guaranteed to preserve the stack contents when a function ends with a > sibling call, let alone with a __noreturn call? I don't know if gcc can actually drop the stack frame in this case, but even if it can this looks fixeable. > Is there seriously no way to directly wait for a struct task_struct to > exit? Could we, say, kmalloc the completion (or maybe even the whole > struct kthread) and (ick!) hang it off ->vfork_done? Sure we can... And yes, I think we need to alloc the whole struct kthread. Just another (unfortunate) complication, the current code is simple. And probably kthread/kthread_stop should switch to task_work_exit(). Oleg.