From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752924AbbF2EEU (ORCPT ); Mon, 29 Jun 2015 00:04:20 -0400 Received: from mx1.redhat.com ([209.132.183.28]:57179 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751080AbbF2EEO (ORCPT ); Mon, 29 Jun 2015 00:04:14 -0400 Date: Mon, 29 Jun 2015 06:02:51 +0200 From: Oleg Nesterov To: Peter Zijlstra Cc: paulmck@linux.vnet.ibm.com, tj@kernel.org, mingo@redhat.com, der.herr@hofr.at, dave@stgolabs.net, riel@redhat.com, viro@ZenIV.linux.org.uk, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH 0/6] stop_machine: kill stop_cpus_mutex and stop_cpus_lock Message-ID: <20150629040251.GA14558@redhat.com> References: <20150626021455.GA5675@redhat.com> <20150626122330.GY19282@twins.programming.kicks-ass.net> <20150626204612.GA14573@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20150626204612.GA14573@redhat.com> User-Agent: Mutt/1.5.18 (2008-05-17) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/26, Oleg Nesterov wrote: > > 2 cmpxchg()'s vs 2 spin_lock()'s. Plus wake_up(), but we can check > waitqueue_active(). > > Do you think thi will be noticeably slower? > > Of course, if it races with another stop_two_cpus/stop_cpus it will > sleep, but in this case we need to wait anyway. > > > And I don't think that percpu-rwsem instead of stop_cpu_mutex makes > sense. at least I don't understand how can it help. OK, stop_two_cpus() > can use percpu_down_read() to avoid the deadlock with stop_cpus(), but > you still need double-lock... So I don't think this will make it faster, > this will just penalize stop_cpus(). Or I misunderstood. > > So I am still not convinced... But probably I am too biased ;) Yes... I'll probably try to make v2, this version is overcomplicated and buggy. > Btw. I can't understand the cpu_active() checks in stop_two_cpus(). > Do we really need them? Ah, please ignore. Yes, we can't rely on stopper->enabled check in cpu_stop_queue_work(), cpu_stop_signal_done() does not update multi_stop_data->num_threads / ->thread_ack. So we need to ensure that cpu_online() == T for both CPUS or multi_cpu_stop() can hang. But we can't use cpu_online() instead, take_cpu_down() can be already queued. So this relies on the fact that CPU_DOWN_PREPARE (which removes CPU from cpu_active_mask) is called before stop_machine(take_cpu_down) and we do not care that cpu_active() is not stable; if we see cpu_active() cpu_online() can't change unders us because take_cpu_down() was not queued. If we change stop_two_cpus() to use stop_work_alloc_one() it can use cpu_online(), int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *arg) { struct cpu_stop_work *work1, *work2; struct cpu_stop_done done; struct multi_stop_data msdata = { .fn = fn, .data = arg, .num_threads = 2, .active_cpus = cpumask_of(cpu1), }; set_state(&msdata, MULTI_STOP_PREPARE); cpu_stop_init_done(&done, 2); if (cpu1 > cpu2) swap(cpu1, cpu2); work1 = stop_work_alloc_one(cpu1, true); work2 = stop_work_alloc_one(cpu2, true); /* stop_machine() is blocked, cpu can't go away */ if (cpu_online(cpu1) && cpu_online(cpu2)) { work1->fn = work2->fn = multi_cpu_stop; work1->arg = work2->arg = &msdata; work1->done = work2->done = &done; preempt_disable(); cpu_stop_queue_work(cpu1, work1); cpu_stop_queue_work(cpu2, work2); preempt_enable(); wait_for_completion(&done.completion); } stop_work_free_one(cpu1); stop_work_free_one(cpu2); stop_work_wake_up(); return done.executed ? done.ret : -ENOENT; } Oleg.