From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752501AbcF1UM3 (ORCPT ); Tue, 28 Jun 2016 16:12:29 -0400
Received: from mx1.redhat.com ([209.132.183.28]:36427 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752184AbcF1UMZ (ORCPT ); Tue, 28 Jun 2016 16:12:25 -0400
Date: Tue, 28 Jun 2016 22:12:50 +0200
From: Oleg Nesterov
To: Andy Lutomirski
Cc: Andy Lutomirski, Linus Torvalds, Peter Zijlstra, Tejun Heo, LKP, LKML, kernel test robot
Subject: Re: kthread_stop insanity (Re: [[DEBUG] force] 2642458962: BUG: unable to handle kernel paging request at ffffc90000997f18)
Message-ID: <20160628201249.GA12471@redhat.com>
References: <20160627145443.GA17145@redhat.com> <20160627170010.GA21628@redhat.com> <20160628185853.GA3998@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.18 (2008-05-17)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Tue, 28 Jun 2016 20:12:13 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/28, Andy Lutomirski wrote:
>
> On Tue, Jun 28, 2016 at 11:58 AM, Oleg Nesterov wrote:
> >
> > Then how (say) proc_pid_stack() can work? If it hits the task which is
> > already dead we are (probably) fine, valid_stack_ptr() should fail iiuc.
> >
> > But what if we race with the last schedule() ? "addr = *stack" can read
> > the already vfree'ed memory, no?
> >
> > Looks like print_context_stack/etc need probe_kernel_address or I missed
> > something.
>
> Yuck. I suppose I could add a reference count to protect the stack.
> Would that simplify the kthread code?

Well yes, that is why I asked. So please tell me if you are going to do
this...

But we can fix kthread code without this hack which we do not need in the
long term anyway.
Unfortunately we need to clean up kernel/smpboot.c first. And I was going
to do this long ago, for quite a different reason ;)

So please forget it unless you see another reason for this change.

Oleg.
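
For readers following the thread, the "reference count to protect the
stack" idea quoted above can be illustrated with a small user-space
sketch. This is not kernel code and not a proposed patch: every name in
it (struct demo_task, demo_try_get_stack(), demo_put_stack(), and so on)
is hypothetical, the stack is plain heap memory, and the task structure
itself is assumed to outlive all readers. It only shows the pattern: a
reader may touch the stack only after taking a reference, and taking one
fails once the owner has dropped the last reference.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a task; not a kernel structure. */
struct demo_task {
    atomic_int stack_refs;   /* owner holds one reference while it runs */
    unsigned long *stack;    /* freed when stack_refs drops to zero */
    size_t nwords;
};

static int demo_frees;       /* number of stacks freed so far */

static struct demo_task *demo_task_create(size_t nwords)
{
    struct demo_task *t = malloc(sizeof(*t));

    if (!t)
        return NULL;
    t->stack = calloc(nwords, sizeof(*t->stack));
    if (!t->stack) {
        free(t);
        return NULL;
    }
    t->nwords = nwords;
    atomic_init(&t->stack_refs, 1);
    return t;
}

/* Take a reference only if the count is still non-zero. */
static bool demo_try_get_stack(struct demo_task *t)
{
    int old = atomic_load(&t->stack_refs);

    while (old > 0) {
        /* On failure, old is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&t->stack_refs, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; whoever drops the last one frees the stack. */
static void demo_put_stack(struct demo_task *t)
{
    if (atomic_fetch_sub(&t->stack_refs, 1) == 1) {
        free(t->stack);
        t->stack = NULL;
        demo_frees++;
    }
}

/* Reader side: dereference the stack only while holding a reference. */
static bool demo_read_stack_word(struct demo_task *t, size_t idx,
                                 unsigned long *out)
{
    if (!demo_try_get_stack(t))
        return false;        /* stack already gone, nothing to read */
    bool ok = idx < t->nwords;
    if (ok)
        *out = t->stack[idx];
    demo_put_stack(t);
    return ok;
}

/* Returns 0 if the sketch behaves as described. */
int demo_run(void)
{
    unsigned long word = 0;
    int frees_before = demo_frees;
    struct demo_task *t = demo_task_create(16);

    if (!t)
        return -1;
    t->stack[3] = 42;
    if (!demo_read_stack_word(t, 3, &word) || word != 42)
        return 1;
    demo_put_stack(t);       /* owner exits: last reference, stack freed */
    if (demo_read_stack_word(t, 3, &word))
        return 2;            /* a late reader must be refused */
    free(t);
    return demo_frees == frees_before + 1 ? 0 : 3;
}
```

In this sketch a late reader gets a clean failure instead of reading
freed memory, which is the property the proc_pid_stack() question above
is about. Whether the real fix should look like this is exactly what the
thread leaves open.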