From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752871AbcJJOG2 (ORCPT ); Mon, 10 Oct 2016 10:06:28 -0400 Received: from bombadil.infradead.org ([198.137.202.9]:42649 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751702AbcJJOG1 (ORCPT ); Mon, 10 Oct 2016 10:06:27 -0400 Date: Mon, 10 Oct 2016 16:06:06 +0200 From: Peter Zijlstra To: Thomas Gleixner Cc: mingo@kernel.org, juri.lelli@arm.com, rostedt@goodmis.org, xlpang@redhat.com, bigeasy@linutronix.de, linux-kernel@vger.kernel.org, mathieu.desnoyers@efficios.com, jdesfossez@efficios.com, bristot@redhat.com Subject: Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK_PI Message-ID: <20161010140606.GR3568@worktop.programming.kicks-ass.net> References: <20161003091234.879763059@infradead.org> <20161003091847.704255067@infradead.org> <20161007112143.GJ3117@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.22.1 (2013-10-16) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Oct 09, 2016 at 01:17:50PM +0200, Thomas Gleixner wrote: > On Fri, 7 Oct 2016, Peter Zijlstra wrote: > > top_waiter = futex_top_waiter(hb, &key); > > if (top_waiter) { > > - ret = wake_futex_pi(uaddr, uval, top_waiter, hb); > > + struct futex_pi_state *pi_state = top_waiter->pi_state; > > + > > + ret = -EINVAL; > > + if (!pi_state) > > + goto out_unlock; > > + > > + /* > > + * If current does not own the pi_state then the futex is > > + * inconsistent and user space fiddled with the futex value. > > + */ > > + if (pi_state->owner != current) > > + goto out_unlock; > > + > > + /* > > + * Grab a reference on the pi_state and drop hb->lock. > > + * > > + * The reference ensures pi_state lives, dropping the hb->lock > > + * is tricky.. wake_futex_pi() will take rt_mutex::wait_lock to > > + * close the races against futex_lock_pi(), but in case of > > + * _any_ fail we'll abort and retry the whole deal. > > + */ > > + WARN_ON_ONCE(!atomic_inc_not_zero(&pi_state->refcount)); > > + spin_unlock(&hb->lock); > > + > > + ret = wake_futex_pi(uaddr, uval, pi_state); > > + > > + put_pi_state(pi_state); > > put_pi_state() requires hb->lock protection AFAICT. > > CPU0 CPU1 > > wake_futex_pi() attach_to_pi_state() > put_pi_state() > refcount--; > if (!refcount) > free_state(); > WARN_ON(!pi_state->refcount); > > we might not see the warning, but in any case the following access to > pi_state on cpu1 is borked. Not sure this can happen, we do all attach_to_pi_state() with hb->lock held, and the only way to get there is through futex_q->pi_state. And as long as that link is stable, pi_state is too. That is, the only way for wake_futex_pi() to drop the last reference is if there are no futex_q's referencing it anymore, but that also means attach_to_pi_state() cannot happen (!top_waiter).