From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755494AbbIAI4d (ORCPT ); Tue, 1 Sep 2015 04:56:33 -0400
Received: from mail-wi0-f177.google.com ([209.85.212.177]:38688
        "EHLO mail-wi0-f177.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1755302AbbIAI4b (ORCPT );
        Tue, 1 Sep 2015 04:56:31 -0400
Date: Tue, 1 Sep 2015 10:56:27 +0200
From: Ingo Molnar
To: Markus Trippelsdorf
Cc: Linus Torvalds , linux-kernel@vger.kernel.org,
        Peter Zijlstra , Thomas Gleixner , Andrew Morton , Mike Galbraith
Subject: Re: [GIT PULL] scheduler changes for v4.3
Message-ID: <20150901085627.GF6315@gmail.com>
References: <20150831172453.GA5429@gmail.com>
        <20150901070856.GA430@x4>
        <20150901072741.GB20383@gmail.com>
        <20150901074449.GB430@x4>
        <20150901083856.GD25398@gmail.com>
        <20150901084444.GB421@x4>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150901084444.GB421@x4>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org


* Markus Trippelsdorf wrote:

> On 2015.09.01 at 10:38 +0200, Ingo Molnar wrote:
> >
> > * Markus Trippelsdorf wrote:
> >
> > > Well, git show a1d8561172f369ba56d636df49a6b4d6d77e2123 :
> > >
> > > commit a1d8561172f369ba56d636df49a6b4d6d77e2123
> > > Merge: 3959df1dfb95 ff277d4250fe
> > > Author: Linus Torvalds
> > > Date: Mon Aug 31 20:26:22 2015 -0700
> >
> > > diff --cc kernel/cpu.c
> > > index 3c91a3fdfce5,664ce5299334..82cf9dff4295
> > > --- a/kernel/cpu.c
> > > +++ b/kernel/cpu.c
> > > @@@ -394,15 -392,10 +394,15 @@@ static int _cpu_down(unsigned int cpu,
> > >       smpboot_park_threads(cpu);
> > >
> > >       /*
> > > -      * So now all preempt/rcu users must observe !cpu_active().
> > > +      * Prevent irq alloc/free while the dying cpu reorganizes the
> > > +      * interrupt affinities.
> > >        */
> > > +     irq_lock_sparse();
> > >
> > > +     /*
> > > +      * So now all preempt/rcu users must observe !cpu_active().
> > > +      */
> > > -     err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
> > > +     err = stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
> > >       if (err) {
> > >               /* CPU didn't die: tell everyone. Can't complain. */
> > >               cpu_notify_nofail(CPU_DOWN_FAILED | mod, hcpu);
> >
> > So the irq_lock_sparse() change is from a commit that got merged in the last merge
> > window, which is part of v4.2:
> >
> >   ce0d3c0a6fb1 ("genirq: Revert sparse irq locking around __cpu_up() and move it to x86 for now")
> >
> > Could you please post the patch against Linus's latest that you have tested on
> > your system to make it boot fine?
> >
> > The one you posted cannot possibly build, because access to __stop_machine() is
> > gone from cpu.c:
>
> As I wrote in my other reply. The boot failure is nondeterministic (boot
> succeeds roughly every sixth time). So the bisection and the patch is
> just bogus (,but the boot failure is real).
>
> Sorry.

No problem. Please let us know if any of these commits does turn out to be the
culprit. (Which is always a possibility.)

Thanks,

        Ingo