From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S964958AbbJHQDT (ORCPT ); Thu, 8 Oct 2015 12:03:19 -0400 Received: from mx1.redhat.com ([209.132.183.28]:39729 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S964853AbbJHQDL (ORCPT ); Thu, 8 Oct 2015 12:03:11 -0400 Date: Thu, 8 Oct 2015 17:59:53 +0200 From: Oleg Nesterov To: Peter Zijlstra Cc: heiko.carstens@de.ibm.com, Tejun Heo , Ingo Molnar , Rik van Riel , Thomas Gleixner , linux-kernel@vger.kernel.org Subject: Re: [PATCH 3/3] stop_machine: change cpu_stop_queue_two_works() to rely on stopper->enabled Message-ID: <20151008155953.GA20508@redhat.com> References: <20151008145059.GA17916@redhat.com> <20151008145136.GA18149@redhat.com> <20151008150431.GO3816@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20151008150431.GO3816@twins.programming.kicks-ass.net> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/08, Peter Zijlstra wrote: > > On Thu, Oct 08, 2015 at 04:51:36PM +0200, Oleg Nesterov wrote: > > @@ -261,12 +276,8 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void * > > set_state(&msdata, MULTI_STOP_PREPARE); > > > > /* > > + * We do not want to migrate to inactive CPU. FIXME: move this > > + * into the caller. > > */ > > if (!cpu_active(cpu1) || !cpu_active(cpu2)) { > > preempt_enable(); > > So we cannot move that into the caller.. Why? > because this function sleeps > with wait_for_completion(). > > Or rather, it would force the caller to use get_online_cpus(), which we > worked really hard to avoid. Aaah wait. Sorry for confusion! I meant "move this into the callback, migrate_swap_stop()". > Also, I think we still want the patch I proposed which ensures the > stopper thread is active 'early', because the load balancer pretty much Perhaps. Although I do not really understand why it is important. I mean, either way we unpark it at CPU_ONLINE stage, just sched_cpu_active() has a higher priority. But this is off-topic in a sense that the main point of this patch is that stop_two_cpus() no longer needs to abuse cpu_active() checks to avoid the race with cpu_up/down, we can simply rely on ->enabled. And again, we need to take both locks to remove "lglock stop_cpus_lock". So I think your change can be applied after this series too. Or I missed something? Oleg.