From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753768AbbHCPAl (ORCPT <rfc822;w@1wt.eu>);
	Mon, 3 Aug 2015 11:00:41 -0400
Received: from mx1.redhat.com ([209.132.183.28]:60973 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753701AbbHCPAj (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Mon, 3 Aug 2015 11:00:39 -0400
Date: Mon, 3 Aug 2015 16:58:36 +0200
From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>, Rik van Riel <riel@redhat.com>,
        Tejun Heo <tj@kernel.org>, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 6/6] stop_machine: kill stop_cpus_lock and
	lg_double_lock/unlock()
Message-ID: <20150803145836.GA13173@redhat.com>
References: <20150721192219.GA31150@redhat.com> <20150721192247.GA31191@redhat.com> <20150730215527.GQ25159@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150730215527.GQ25159@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/30, Peter Zijlstra wrote:
>
> On Tue, Jul 21, 2015 at 09:22:47PM +0200, Oleg Nesterov wrote:
>
> > +	err = -EDEADLK;
> > +	if (stop_work_pending(stopper1) != stop_work_pending(stopper2))
> > +		goto unlock;
>
> You could DoS/false positive this by running stop_one_cpu() in a loop,
> and thereby 'always' having work pending on one but not the other.

as we already discussed this is not a problem.

> > +	if (unlikely(err == -EDEADLK)) {
> > +		cond_resched();
> > +		goto retry;
>
> And this just gives me -rt nightmares.

Why?

> As it is, -rt does horrible things to stop_machine, and I would very
> much like to make it such that we don't need to do that.
>
> Now, obviously, stop_cpus() is _BAD_ for -rt, and we try real hard to
> make sure that doesn't happen,

Yes. stop_cpus() is already bad so I am not sure I understand why this
change make the things really worse.

stop_two_cpus() needs to spin/retry if it races with the main loop in
queue_stop_cpus_work(),

        preempt_disable();
        for_each_cpu(cpu, cpumask) {
                work = &per_cpu(cpu_stopper.stop_work, cpu);
                work->fn = fn;
                work->arg = arg;
                work->done = done;
                cpu_stop_queue_work(cpu, work);
        }
        preempt_enable();

and iirc preempt_disable() means "disable preemption" even in -rt, but
I am not sure. So "goto retry" should be really, really unlikely.

Besides, whatever we do stop_two_cpus(X, Y) will wait anyway if ->stop_work
was queued on X or Y anyway. And with your patch in the next email it will
spin too (yes, yes, -rt differs).


Another case when stop_two_cpus(X, Y) needs to retry is when ->stop_work
was already dequeued on CPU X but not on CPU Y (and this is why it needs
cond_resched() for CONFIG_PREEMPT=n, it can run on CPU Y). This does not
look really bad too, the migration/Y thread is already activated and it
has the highest priority.


So I still think that at least correctness wise this patch is fine. Am I
missed something else?


> Paul's RCU branch already kills try_stop_cpus() dead, so that wart is
> also gone. But we're still stuck with stop_machine_from_inactive_cpu()
> which does a spin-wait for exclusive state. So I suppose we'll have to
> keep stop_cpus_mutex :/

Yes, we still need stop_cpus_mutex. Even if we remove try_stop_cpus() and
stop_machine_from_inactive_cpu(). But this is another issue.

Oleg.