Date: Thu, 8 Oct 2015 16:51:36 +0200
From: Oleg Nesterov
To: Peter Zijlstra
Cc: heiko.carstens@de.ibm.com, Tejun Heo, Ingo Molnar, Rik van Riel, Thomas Gleixner, linux-kernel@vger.kernel.org
Subject: [PATCH 3/3] stop_machine: change cpu_stop_queue_two_works() to rely on stopper->enabled
Message-ID: <20151008145136.GA18149@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151008145059.GA17916@redhat.com>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Change cpu_stop_queue_two_works() to ensure that both CPUs have
stopper->enabled == T or fail otherwise.

This way stop_two_cpus() no longer needs to check cpu_active() to
avoid the deadlock. This patch doesn't remove these checks; we will
do this later.

Note: we need to take both stopper->lock's at the same time, but this
will also help to remove lglock from stop_machine.c, so I hope this
is fine.

Signed-off-by: Oleg Nesterov
---
 kernel/stop_machine.c |   29 ++++++++++++++++++++---------
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 688d6b3..6d85d27 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -219,12 +219,27 @@ static int multi_cpu_stop(void *data)
 static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
 				    int cpu2, struct cpu_stop_work *work2)
 {
+	struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
+	struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
+	int err;
+
 	lg_double_lock(&stop_cpus_lock, cpu1, cpu2);
-	cpu_stop_queue_work(cpu1, work1);
-	cpu_stop_queue_work(cpu2, work2);
+	spin_lock_irq(&stopper1->lock);
+	spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);
+
+	err = -ENOENT;
+	if (!stopper1->enabled || !stopper2->enabled)
+		goto unlock;
+
+	err = 0;
+	__cpu_stop_queue_work(stopper1, work1);
+	__cpu_stop_queue_work(stopper2, work2);
+unlock:
+	spin_unlock(&stopper2->lock);
+	spin_unlock_irq(&stopper1->lock);
 	lg_double_unlock(&stop_cpus_lock, cpu1, cpu2);
 
-	return 0;
+	return err;
 }
 /**
  * stop_two_cpus - stops two cpus
@@ -261,12 +276,8 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *
 	set_state(&msdata, MULTI_STOP_PREPARE);
 
 	/*
-	 * If we observe both CPUs active we know _cpu_down() cannot yet have
-	 * queued its stop_machine works and therefore ours will get executed
-	 * first. Or its not either one of our CPUs that's getting unplugged,
-	 * in which case we don't care.
-	 *
-	 * This relies on the stopper workqueues to be FIFO.
+	 * We do not want to migrate to inactive CPU. FIXME: move this
+	 * into the caller.
 	 */
 	if (!cpu_active(cpu1) || !cpu_active(cpu2)) {
 		preempt_enable();
-- 
1.5.5.1
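
As a side note for readers less familiar with the locking pattern above: the new code takes both per-CPU stopper->lock's, checks both ->enabled flags under the locks, and only then queues the two works, so the pair is queued atomically or not at all. The short userspace sketch below illustrates that pattern only; it is not kernel code, the names (struct stopper, queue_two_works(), stoppers[]) are invented for the illustration, and pthread mutexes stand in for the per-CPU spinlocks and the SINGLE_DEPTH_NESTING lockdep annotation. The sketch takes the two locks in a fixed lower-index-first order to avoid deadlock between concurrent callers; in the patch itself that ordering guarantee still comes from the outer stop_cpus_lock.

/*
 * Illustrative userspace analogue only, not kernel code: all names here
 * (struct stopper, queue_two_works, stoppers[]) are invented for the
 * example, and pthread mutexes stand in for the per-CPU stopper->lock
 * spinlocks and the lockdep nesting annotation.
 */
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct stopper {
	pthread_mutex_t	lock;
	bool		enabled;
	int		pending;	/* stands in for the kernel's work list */
};

static struct stopper stoppers[2] = {
	{ PTHREAD_MUTEX_INITIALIZER, true,  0 },
	{ PTHREAD_MUTEX_INITIALIZER, false, 0 },
};

/*
 * Queue one unit of "work" on each of two stoppers atomically.  Assumes
 * cpu1 != cpu2.  Both locks are taken in a fixed (lower index first)
 * order so concurrent callers cannot deadlock against each other.
 */
static int queue_two_works(int cpu1, int cpu2)
{
	struct stopper *s1 = &stoppers[cpu1 < cpu2 ? cpu1 : cpu2];
	struct stopper *s2 = &stoppers[cpu1 < cpu2 ? cpu2 : cpu1];
	int err = -ENOENT;

	pthread_mutex_lock(&s1->lock);
	pthread_mutex_lock(&s2->lock);

	/* Queue only if both sides are enabled, checked under both locks. */
	if (s1->enabled && s2->enabled) {
		s1->pending++;
		s2->pending++;
		err = 0;
	}

	pthread_mutex_unlock(&s2->lock);
	pthread_mutex_unlock(&s1->lock);
	return err;
}

int main(void)
{
	/* stoppers[1] starts out disabled, so the first attempt must fail. */
	printf("queue(0, 1) -> %d\n", queue_two_works(0, 1));	/* -ENOENT */

	stoppers[1].enabled = true;
	printf("queue(0, 1) -> %d\n", queue_two_works(0, 1));	/* 0 */
	return 0;
}

The sketch builds with something like cc -pthread queue_two.c (the file name is arbitrary); the first call fails with -ENOENT because one side is still disabled, the second succeeds once both sides are enabled, mirroring the -ENOENT that stop_two_cpus() will now get back when one of the two CPUs has no enabled stopper.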