From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from out1-smtp.messagingengine.com ([66.111.4.25]:53693 "EHLO out1-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933529AbeFGS1z (ORCPT ); Thu, 7 Jun 2018 14:27:55 -0400
Date: Thu, 7 Jun 2018 20:27:31 +0200
From: Greg KH
To: Max Asbock
Cc: "stable@vger.kernel.org" , "tytso@mit.edu" , Chris McDermott
Subject: Re: [External] Re: panic at boot time with kernel >= 4.9.98 - uninitialized system_wq in early interrupt
Message-ID: <20180607182731.GA24167@kroah.com>
References: <20180607083714.GA17489@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: stable-owner@vger.kernel.org
List-ID:

On Thu, Jun 07, 2018 at 05:54:56PM +0000, Max Asbock wrote:
>
> ________________________________________
> From: Greg KH [greg@kroah.com]
> Sent: Thursday, June 07, 2018 1:37 AM
> To: Max Asbock
> Cc: stable@vger.kernel.org; tytso@mit.edu; Chris McDermott
> Subject: [External] Re: panic at boot time with kernel >= 4.9.98 - uninitialized system_wq in early interrupt
>
> > Ick :(
>
> > I'm guessing you also see these problems on 4.17?  Can you test there to
> > be sure of that?
>
> We haven't had a chance to test 4.17 on the system where this happens. I am suspecting this won't be a problem on 4.17 as workqueue init has been split up and there is now a workqueue_init_early() in start_kernel():
>         /*
>          * Allow workqueue creation and work item queueing/cancelling
>          * early.  Work item execution depends on kthreads and starts after
>          * workqueue_init().
>          */
>         workqueue_init_early();
>
> So far we have only seen this with 4.9.x. Also, this only happens when lots of memory is installed (10TB). i am guessing the large memory size changes the timing of the initialization steps and brings out the problem.
> When we get access to the system again we can attempt to boot the latest main-line kernel to verify that workqueue_init_early() indeed fixes the issue there.

Ugh, I forgot about the workqueue rewrite.

How about 4.14, does that work for you?  If so, just use that, you
shouldn't be using the 4.9 kernel tree on x86-based hardware unless you
are somehow forced to due to horrible closed source kernel drivers.  You
want and need the fixes and speedups that are on 4.14.y, it's a
measurable difference.

thanks,

greg k-h