Date: Tue, 12 Jul 2016 16:19:55 +0200
From: Peter Zijlstra
To: Frederic Weisbecker
Cc: "Paul E. McKenney", tglx@linutronix.de, linux-kernel@vger.kernel.org, rgkernel@gmail.com
Subject: Re: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline
Message-ID: <20160712141955.GS30909@twins.programming.kicks-ass.net>
References: <20160630175845.GA10269@linux.vnet.ibm.com> <20160630232957.GB32568@lerouge>
In-Reply-To: <20160630232957.GB32568@lerouge>

On Fri, Jul 01, 2016 at 01:29:59AM +0200, Frederic Weisbecker wrote:
> >  void wake_up_nohz_cpu(int cpu)
> >  {
> > -	if (!wake_up_full_nohz_cpu(cpu))
> > +	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
>
> So at this point, as we passed CPU_DYING, I believe the CPU isn't visible in the domains
> anymore (correct me if I'm wrong),

So rebuilding the domains is an utter trainwreck atm. But I suspect
that's wrong. Esp. with cpusets enabled we rebuild the domains very late
from a workqueue.

That is why the scheduler has cpu_active_mask to constrain the domains
during hotplug.

Now I need to go sort through that trainwreck because deadline needs it,
but I've not had the opportunity :/

> therefore get_nohz_timer_target() can't return it,
> unless smp_processor_id() is the only alternative.

With the below that should be true I think.

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6c0cdb5a73f8..b35cacbe9b9e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -626,7 +626,7 @@ int get_nohz_timer_target(void)
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
+		for_each_cpu_and(i, sched_domain_span(sd), cpu_active_mask) {
 			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;
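
For illustration only, here is a userspace sketch of the effect of intersecting the domain span with the active mask. It is not kernel code. It assumes a cpumask fits in one 64-bit word; `mask_t`, `pick_timer_target()` and `self_check()` are hypothetical names, and the housekeeping check is left out to keep the model small.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: one 64-bit word, bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/*
 * Hypothetical stand-in for the inner loop of get_nohz_timer_target():
 * return the lowest-numbered CPU that is in the domain span, still
 * active, and not idle. Fall back to this_cpu if no CPU qualifies.
 */
static int pick_timer_target(int this_cpu, mask_t span, mask_t active, mask_t idle)
{
	/* Models for_each_cpu_and(): only CPUs present in both masks are visited. */
	mask_t candidates = span & active;

	for (int i = 0; i < 64; i++) {
		mask_t bit = (mask_t)1 << i;

		if (!(candidates & bit))
			continue;	/* not in the span, or no longer active */
		if (!(idle & bit))
			return i;	/* busy, active CPU: use it as the target */
	}
	return this_cpu;
}

/* Small checks; CPU 2 plays the role of the CPU going offline. */
static void self_check(void)
{
	mask_t span = 0xF;	/* CPUs 0-3 are in the domain */
	mask_t idle = 0x3;	/* CPUs 0 and 1 are idle */

	/* Unconstrained walk (the old for_each_cpu()): CPU 2 is picked. */
	assert(pick_timer_target(0, span, 0xF, idle) == 2);
	/* CPU 2 cleared from the active mask: it is skipped, CPU 3 is picked. */
	assert(pick_timer_target(0, span, 0xB, idle) == 3);
	/* Every candidate is idle: fall back to the calling CPU. */
	assert(pick_timer_target(0, span, 0xB, 0xF) == 0);
}
```

In this model, a CPU that has been cleared from the active mask can only be returned through the fallback, that is, when it is the calling CPU itself. That matches the quoted "unless smp_processor_id() is the only alternative".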