From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753820AbcKYJYE (ORCPT ); Fri, 25 Nov 2016 04:24:04 -0500
Received: from merlin.infradead.org ([205.233.59.134]:53614 "EHLO
	merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753535AbcKYJXn (ORCPT );
	Fri, 25 Nov 2016 04:23:43 -0500
Date: Fri, 25 Nov 2016 10:23:26 +0100
From: Peter Zijlstra
To: Thomas Gleixner
Cc: mingo@kernel.org, juri.lelli@arm.com, rostedt@goodmis.org,
	xlpang@redhat.com, bigeasy@linutronix.de,
	linux-kernel@vger.kernel.org, mathieu.desnoyers@efficios.com,
	jdesfossez@efficios.com, bristot@redhat.com
Subject: Re: [RFC][PATCH 4/4] futex: Rewrite FUTEX_UNLOCK_PI
Message-ID: <20161125092326.GG3174@twins.programming.kicks-ass.net>
References: <20161007112143.GJ3117@twins.programming.kicks-ass.net>
	<20161008165540.GI3568@worktop.programming.kicks-ass.net>
	<20161021122735.GA3117@twins.programming.kicks-ass.net>
	<20161123192005.GA3107@twins.programming.kicks-ass.net>
	<20161124165241.GF3174@twins.programming.kicks-ass.net>
	<20161124185807.GI3092@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161124185807.GI3092@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Nov 24, 2016 at 07:58:07PM +0100, Peter Zijlstra wrote:
> OK, so clearly I'm confused. So let me try again.
>
> LOCK_PI, does in one function: lookup_pi_state, and fixup_owner. If
> fixup_owner fails with -EAGAIN, we can redo the pi_state lookup.
>
> The requeue stuff, otoh, has one each. REQUEUE_WAIT has fixup_owner(),
> CMP_REQUEUE has lookup_pi_state. Therefore, fixup_owner failing with
> -EAGAIN leaves us dead in the water. There's nothing to go back to to
> retry.
>
> So far, so 'good', right?
>
> Now, as far as I understand this requeue stuff, we have 2 futexes, an
> inner futex and an outer futex. The inner futex is always 'locked' and
> serves as a collection pool for waiting threads.
>
> The requeue crap picks one (or more) waiters from the inner futex and
> sticks them on the outer futex, which gives them a chance to run.
>
> So WAIT_REQUEUE blocks on the inner futex, but knows that if it ever
> gets woken, it will be on the outer futex, and hence needs to
> fixup_owner if the futex and rt_mutex state got out of sync.
>
> CMP_REQUEUE picks the one (or more) waiters of the inner futex and
> sticks them on the outer futex.
>
> So far, so 'good' ?
>
> The thing I'm not entirely sure on is what happens with the outer futex,
> do we first LOCK_PI it before doing CMP_REQUEUE, giving us waiters, and
> then UNLOCK_PI to let them rip? Or do we just CMP_REQUEUE and then let
> whoever wins finish with UNLOCK_PI?
>
>
> In any case, I don't think it matters much, either way we can race
> between the 'last' UNLOCK_PI and getting rt_mutex waiters and then hit
> the &init_task funny state, such that WAIT_REQUEUE waking hits EAGAIN
> and we're 'stuck'.
>
> Now, if we always CMP_REQUEUE to a locked outer futex, then we cannot
> know, at CMP_REQUEUE time, who will win and cannot fix up.

OTOH, if we always first LOCK_PI before doing CMP_REQUEUE, I don't think
we can hit the funny state; LOCK_PI will have fixed it up for us.

So the question is, do we mandate LOCK_PI before CMP_REQUEUE?
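
To make the retry asymmetry concrete, here is a toy userspace model of
the argument above. It is a sketch, not kernel code: struct toy_futex,
its races_left field and every toy_* function are hypothetical names
invented for illustration; they only borrow the names lookup_pi_state
and fixup_owner from the thread. The model assumes that each race makes
fixup_owner fail with -EAGAIN exactly once, and that a LOCK_PI done
first absorbs that race, which is the conjecture in question and not
something the model proves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy model only; hypothetical stand-in for the futex/rt_mutex state. */
struct toy_futex {
	int lookups;     /* number of pi_state lookups performed */
	int races_left;  /* races that will still make fixup see funny state */
};

/* Stand-in for lookup_pi_state(); always succeeds in this model. */
static int toy_lookup_pi_state(struct toy_futex *f)
{
	f->lookups++;
	return 0;
}

/* Stand-in for fixup_owner(); -EAGAIN once per pending race. */
static int toy_fixup_owner(struct toy_futex *f)
{
	if (f->races_left > 0) {
		f->races_left--;
		return -EAGAIN;
	}
	return 0;
}

/* LOCK_PI: lookup and fixup live in one function, so -EAGAIN can retry. */
int toy_lock_pi(struct toy_futex *f)
{
	int ret;

	assert(f != NULL);
	do {
		ret = toy_lookup_pi_state(f);
		if (ret)
			return ret;
		ret = toy_fixup_owner(f);
	} while (ret == -EAGAIN);

	return ret;
}

/* CMP_REQUEUE side: only the lookup half. */
int toy_cmp_requeue(struct toy_futex *f)
{
	assert(f != NULL);
	return toy_lookup_pi_state(f);
}

/* WAIT_REQUEUE side: only the fixup half; -EAGAIN has nowhere to go. */
int toy_wait_requeue(struct toy_futex *f)
{
	assert(f != NULL);
	return toy_fixup_owner(f);
}
```

In this model a bare toy_cmp_requeue() followed by toy_wait_requeue()
returns -EAGAIN when a race is pending, while calling toy_lock_pi()
first lets the same sequence succeed, because the retry loop has
already consumed the race.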