From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S932371AbdDEPDa (ORCPT <rfc822;w@1wt.eu>);
        Wed, 5 Apr 2017 11:03:30 -0400
Received: from bombadil.infradead.org ([65.50.211.133]:48387 "EHLO
        bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1753339AbdDEPC1 (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 5 Apr 2017 11:02:27 -0400
Date: Wed, 5 Apr 2017 08:02:17 -0700
From: Darren Hart <dvhart@infradead.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: tglx@linutronix.de, mingo@kernel.org, juri.lelli@arm.com,
        rostedt@goodmis.org, xlpang@redhat.com, bigeasy@linutronix.de,
        linux-kernel@vger.kernel.org, mathieu.desnoyers@efficios.com,
        jdesfossez@efficios.com, bristot@redhat.com
Subject: Re: [PATCH -v6 04/13] futex,rt_mutex: Provide futex specific
 rt_mutex API
Message-ID: <20170405150217.GA27833@fury>
References: <20170322103547.756091212@infradead.org>
 <20170322104151.702962446@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20170322104151.702962446@infradead.org>
User-Agent: Mutt/1.7.1 (2016-10-04)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 22, 2017 at 11:35:51AM +0100, Peter Zijlstra wrote:
> Part of what makes futex_unlock_pi() intricate is that
> rt_mutex_futex_unlock() -> rt_mutex_slowunlock() can drop
> rt_mutex::wait_lock.
> 
> This means we cannot rely on the atomicy of wait_lock, which we would
> like to do in order to not rely on hb->lock so much.
> 
> The reason rt_mutex_slowunlock() needs to drop wait_lock is because it
> can race with the rt_mutex fastpath, however futexes have their own
> fast path.
> 
> Since futexes already have a bunch of separate rt_mutex accessors,
> complete that set and implement a rt_mutex variant without fastpath
> for them.

Premise makes sense, I'm tripping over some detail - wondering if it is all
related...

> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  kernel/futex.c                  |   30 ++++++++++-----------
>  kernel/locking/rtmutex.c        |   55 +++++++++++++++++++++++++++++-----------
>  kernel/locking/rtmutex_common.h |    9 +++++-
>  3 files changed, 62 insertions(+), 32 deletions(-)
> 
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -916,7 +916,7 @@ void exit_pi_state_list(struct task_stru
>  		pi_state->owner = NULL;
>  		raw_spin_unlock_irq(&curr->pi_lock);
>  
> -		rt_mutex_unlock(&pi_state->pi_mutex);
> +		rt_mutex_futex_unlock(&pi_state->pi_mutex);
>  
>  		spin_unlock(&hb->lock);
>  
> @@ -1364,20 +1364,18 @@ static int wake_futex_pi(u32 __user *uad
>  	pi_state->owner = new_owner;
>  	raw_spin_unlock(&new_owner->pi_lock);
>  
> -	raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
> -
> -	deboost = rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
> -
>  	/*
> -	 * First unlock HB so the waiter does not spin on it once he got woken
> -	 * up. Second wake up the waiter before the priority is adjusted. If we
> -	 * deboost first (and lose our higher priority), then the task might get
> -	 * scheduled away before the wake up can take place.
> +	 * We've updated the uservalue, this unlock cannot fail.

It isn't clear to me what I should understand from this new comment. How does
the value of the uval affect whether or not the pi_state->pi_mutex can be
unlocked or not? Or are you noting that we've set FUTEX_WAITIERS so any valid
userspace operations will be forced intot he kernel and can't race with us since
we hold the hb->lock? With futexes, I think it's important that we be very
explicit in our comment blocks.

>  	 */
> +	deboost = __rt_mutex_futex_unlock(&pi_state->pi_mutex, &wake_q);
> +
> +	raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
>  	spin_unlock(&hb->lock);
> -	wake_up_q(&wake_q);
> -	if (deboost)
> +
> +	if (deboost) {
> +		wake_up_q(&wake_q);

Is moving wake_up_q under deboost related to this change or is it just an
optimization since there is no need to wake unless we are deboosting ourselves -
which was true before as well?

If this is due to the rt_mutex_futex* API, I haven't made the connection.

-- 
Darren Hart
VMware Open Source Technology Center