From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933615AbbFWQVs (ORCPT <rfc822;w@1wt.eu>);
	Tue, 23 Jun 2015 12:21:48 -0400
Received: from mx1.redhat.com ([209.132.183.28]:45537 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933549AbbFWQVl (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Tue, 23 Jun 2015 12:21:41 -0400
Date: Tue, 23 Jun 2015 18:20:24 +0200
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: paulmck@linux.vnet.ibm.com, tj@kernel.org, mingo@redhat.com,
        linux-kernel@vger.kernel.org, der.herr@hofr.at, dave@stgolabs.net,
        riel@redhat.com, viro@ZenIV.linux.org.uk,
        torvalds@linux-foundation.org
Subject: Re: [RFC][PATCH 12/13] stop_machine: Remove lglock
Message-ID: <20150623162024.GA23714@redhat.com>
References: <20150622121623.291363374@infradead.org> <20150622122256.765619039@infradead.org> <20150622222152.GA4460@redhat.com> <20150623100932.GB3644@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150623100932.GB3644@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/23, Peter Zijlstra wrote:
>
> On Tue, Jun 23, 2015 at 12:21:52AM +0200, Oleg Nesterov wrote:
>
> > Suppose that stop_two_cpus(cpu1 => 0, cpu2 => 1) races with stop_machine().
> >
> > 	- stop_machine takes the lock on CPU 0, adds the work
> > 	  and drops the lock
> >
> > 	- cpu_stop_queue_work() queues both works
>
> cpu_stop_queue_work() only ever queues _1_ work.
>
> > 	- stop_machine takes the lock on CPU 1, etc
> >
> > In this case both CPU 0 and 1 will run multi_cpu_stop() but they will
> > use different multi_stop_data's, so they will wait for each other
> > forever?
>
> So what you're saying is:
>
> 	queue_stop_cpus_work()		stop_two_cpus()
>
> 	cpu_stop_queue_work(0,..);
> 					spin_lock(0);
> 					spin_lock(1);
>
> 					__cpu_stop_queue_work(0,..);
> 					__cpu_stop_queue_work(1,..);
>
> 					spin_unlock(1);
> 					spin_unlock(0);
> 	cpu_stop_queue_work(1,..);

Yes, sorry for the confusion.
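
To spell the interleaving out, here is a small user-space model of the
per-CPU queue order. This is not kernel code: race_queue,
race_queue_work() and the two work ids are made up for illustration, and
it assumes each stopper runs its works strictly in FIFO order. With the
interleaving above, CPU 0 starts with stop_machine's multi_stop_data while
CPU 1 starts with the one from stop_two_cpus(), so each waits for the
other.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RACE_NR_CPUS   2
#define RACE_MAX_WORKS 4

/* Ids naming which caller's multi_stop_data a queued work refers to. */
enum { WORK_STOP_MACHINE = 1, WORK_STOP_TWO_CPUS = 2 };

/* Hypothetical stand-in for one stopper's FIFO work list. */
struct race_queue {
	int    ids[RACE_MAX_WORKS];
	size_t len;
};

static void race_queue_work(struct race_queue *q, int id)
{
	assert(q->len < RACE_MAX_WORKS);
	q->ids[q->len++] = id;
}

/* true if CPU 0 and CPU 1 would start on different multi_stop_data */
static bool race_heads_disagree(const struct race_queue *q)
{
	return q[0].len && q[1].len && q[0].ids[0] != q[1].ids[0];
}

/* Per-CPU locking only: stop_two_cpus() slips in between the two queueings. */
static bool race_per_cpu_locking(void)
{
	struct race_queue q[RACE_NR_CPUS] = { 0 };

	race_queue_work(&q[0], WORK_STOP_MACHINE);	/* cpu_stop_queue_work(0,..) */
	race_queue_work(&q[0], WORK_STOP_TWO_CPUS);	/* __cpu_stop_queue_work(0,..) */
	race_queue_work(&q[1], WORK_STOP_TWO_CPUS);	/* __cpu_stop_queue_work(1,..) */
	race_queue_work(&q[1], WORK_STOP_MACHINE);	/* cpu_stop_queue_work(1,..) */

	return race_heads_disagree(q);
}

/* For contrast: the order if a caller's works cannot be split up. */
static bool race_works_queued_as_unit(void)
{
	struct race_queue q[RACE_NR_CPUS] = { 0 };

	race_queue_work(&q[0], WORK_STOP_MACHINE);
	race_queue_work(&q[1], WORK_STOP_MACHINE);
	race_queue_work(&q[0], WORK_STOP_TWO_CPUS);
	race_queue_work(&q[1], WORK_STOP_TWO_CPUS);

	return race_heads_disagree(q);
}
```

race_per_cpu_locking() reports the disagreement that leads to the hang;
race_works_queued_as_unit() does not.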

> We can of course slap a percpu-rwsem in, but I wonder if there's
> anything smarter we can do here.

I am wondering too if we can make this multi_cpu_stop() more clever.
Or at least add some deadlock detection...

Until then you can probably just uglify queue_stop_cpus_work() and
avoid the race by holding every stopper->lock in the cpumask while
the works are queued:

	static void queue_stop_cpus_work(const struct cpumask *cpumask,
					 cpu_stop_fn_t fn, void *arg,
					 struct cpu_stop_done *done)
	{
		struct cpu_stopper *stopper;
		struct cpu_stop_work *work;
		unsigned long flags;
		unsigned int cpu;

		local_irq_save(flags);
		/* take every stopper->lock first, in cpumask order */
		for_each_cpu(cpu, cpumask) {
			stopper = &per_cpu(cpu_stopper, cpu);
			spin_lock(&stopper->lock);

			work = &per_cpu(stop_cpus_work, cpu);
			work->fn = fn;
			work->arg = arg;
			work->done = done;
		}

		/* all locks held, stop_two_cpus() can't queue in between */
		for_each_cpu(cpu, cpumask)
			__cpu_stop_queue_work(cpu, &per_cpu(stop_cpus_work, cpu));

		/* only now drop the locks */
		for_each_cpu(cpu, cpumask) {
			stopper = &per_cpu(cpu_stopper, cpu);
			spin_unlock(&stopper->lock);
		}
		local_irq_restore(flags);
	}

ignoring lockdep problems (this nests many locks of the same class).
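
The same lock-everything-then-queue pattern as a user-space sketch,
assuming pthread mutexes as stand-ins for stopper->lock. All model_* names
are hypothetical, a plain bitmask replaces the cpumask, and there is no
irq or lockdep handling here. Every caller takes the locks in ascending
cpu order and holds all of them while queueing, so its works land on all
queues as one unit.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MODEL_NR_CPUS   4
#define MODEL_MAX_WORKS 8

/* Hypothetical stand-in for struct cpu_stopper: a lock plus a FIFO of ids. */
struct model_stopper {
	pthread_mutex_t lock;
	int             works[MODEL_MAX_WORKS];
	size_t          nr;
};

static struct model_stopper model_stoppers[MODEL_NR_CPUS] = {
	{ .lock = PTHREAD_MUTEX_INITIALIZER },
	{ .lock = PTHREAD_MUTEX_INITIALIZER },
	{ .lock = PTHREAD_MUTEX_INITIALIZER },
	{ .lock = PTHREAD_MUTEX_INITIALIZER },
};

/* Queue work @id on every cpu set in @mask with all of their locks held. */
static void model_queue_on_cpus(unsigned int mask, int id)
{
	int cpu;

	/* Same ascending order for every caller, so no ABBA between callers. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			pthread_mutex_lock(&model_stoppers[cpu].lock);

	/* All locks held: nobody else can queue in between these. */
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct model_stopper *s = &model_stoppers[cpu];

		if (!(mask & (1u << cpu)))
			continue;
		assert(s->nr < MODEL_MAX_WORKS);
		s->works[s->nr++] = id;
	}

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			pthread_mutex_unlock(&model_stoppers[cpu].lock);
}

/* First queued id on @cpu, or -1 if nothing is queued there. */
static int model_first_work(int cpu)
{
	return model_stoppers[cpu].nr ? model_stoppers[cpu].works[0] : -1;
}
```

For example, model_queue_on_cpus(0xf, 1) followed by
model_queue_on_cpus(0x3, 2) leaves work 1 at the head of every queue, and
two threads making those calls concurrently would end up with one of the
two consistent orders rather than a mix.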

It would be nice to remove stop_cpus_mutex; what it actually protects
is stop_cpus_work. Then stop_two_cpus() could probably just use
stop_cpus(). We could simply make stop_cpus_mutex per-cpu too,
but that doesn't look nice.

Oleg.