From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1750772AbWFMIh3 (ORCPT ); Tue, 13 Jun 2006 04:37:29 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1750776AbWFMIh3 (ORCPT ); Tue, 13 Jun 2006 04:37:29 -0400
Received: from ecfrec.frec.bull.fr ([129.183.4.8]:17029 "EHLO ecfrec.frec.bull.fr")
	by vger.kernel.org with ESMTP id S1750772AbWFMIh2 (ORCPT );
	Tue, 13 Jun 2006 04:37:28 -0400
Message-ID: <448E79DA.8050704@bull.net>
Date: Tue, 13 Jun 2006 10:39:54 +0200
From: Pierre Peiffer
User-Agent: Thunderbird 1.5.0.2 (X11/20060501)
MIME-Version: 1.0
To: Sébastien Dugué
Cc: Jakub Jelinek, Arjan van de Ven, Ingo Molnar, Atsushi Nemoto,
	linux-kernel@vger.kernel.org
Subject: Re: NPTL mutex and the scheduling priority
References: <20060612.171035.108739746.nemoto@toshiba-tops.co.jp>
	<1150115008.3131.106.camel@laptopd505.fenrus.org>
	<20060612124406.GZ3115@devserv.devel.redhat.com>
	<1150125869.3835.12.camel@frecb000686>
In-Reply-To: <1150125869.3835.12.camel@frecb000686>
X-MIMETrack: Itemize by SMTP Server on ECN002/FR/BULL(Release 5.0.12 |February 13, 2003) at
	13/06/2006 10:41:10,
	Serialize by Router on ECN002/FR/BULL(Release 5.0.12 |February 13, 2003) at
	13/06/2006 10:41:11,
	Serialize complete at 13/06/2006 10:41:11
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Sébastien Dugué wrote:
> But maybe a better solution for condvars would be to implement
> something like a futex_requeue_pi() to handle the broadcast and
> only use PI futexes all along in glibc.
>
> Any ideas?

I'm currently thinking about it and, as far as I can see, it should be
technically feasible, but it is not straightforward.

In fact, PI-futex puts an rt-mutex behind each futex when there are waiters.
Each waiter is then queued twice: once in the chain list of the hash-bucket,
and once in the (ordered) wait_list of the rt-mutex.

What we want from a futex_requeue_pi is to requeue some tasks from
(futex1, rt_mutex1) to (futex2, rt_mutex2) while respecting the order of
rt_mutex1.wait_list.
=> This needs something like an rt_mutex_requeue and, given an element of
rt_mutex1.wait_list, we need to retrieve its futex_q in order to requeue it
to the second hash-bucket chain (that of futex2).

Moreover, we must handle the case where futex2 is not yet locked (i.e. has
no owner): there is not yet a pi_state or an rt_mutex associated with
futex2 ...

And throughout all of this, we must take care of race conditions in
several places.

I'll continue my investigation, but I really wonder whether futex_requeue_pi
will still be the "optimization" it should be.

So comments from the experts are welcome ;-)

-- 
Pierre
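
To make the double queuing and the requeue order described above concrete, here is a minimal userspace sketch in C. It is only a toy model under stated assumptions: every name in it (toy_pi_futex, toy_futex_q, toy_rt_waiter, toy_wait, toy_requeue_pi) is hypothetical and is not kernel code or a kernel API. It also assumes the target futex already has an rt-mutex, so it does not cover the "futex2 has no owner yet" case, and it ignores locking and races entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all names here are hypothetical, not kernel API. */

struct toy_futex_q;

/* Entry in the (priority-ordered) wait_list of the rt-mutex. */
struct toy_rt_waiter {
    int prio;                    /* lower value = higher priority */
    struct toy_rt_waiter *next;  /* link in the wait_list */
    struct toy_futex_q *q;       /* back-pointer to the owning futex_q (assumption) */
};

/* Entry in the hash-bucket chain; embeds its rt-mutex waiter. */
struct toy_futex_q {
    struct toy_futex_q *next;    /* link in the hash-bucket chain (FIFO) */
    struct toy_rt_waiter w;
};

/* One futex together with the rt-mutex behind it. */
struct toy_pi_futex {
    struct toy_futex_q *bucket;      /* hash-bucket chain */
    struct toy_rt_waiter *wait_list; /* rt-mutex wait_list, sorted by prio */
};

static void bucket_append(struct toy_futex_q **head, struct toy_futex_q *q)
{
    q->next = NULL;
    while (*head)
        head = &(*head)->next;
    *head = q;
}

static void bucket_remove(struct toy_futex_q **head, struct toy_futex_q *q)
{
    while (*head && *head != q)
        head = &(*head)->next;
    assert(*head == q);          /* q must be queued on this chain */
    *head = q->next;
    q->next = NULL;
}

/* Sorted insert; waiters of equal priority stay in FIFO order. */
static void wait_list_insert(struct toy_rt_waiter **head, struct toy_rt_waiter *w)
{
    while (*head && (*head)->prio <= w->prio)
        head = &(*head)->next;
    w->next = *head;
    *head = w;
}

/* Queue a waiter twice: on the bucket chain and on the wait_list. */
void toy_wait(struct toy_pi_futex *f, struct toy_futex_q *q, int prio)
{
    q->w.prio = prio;
    q->w.q = q;
    bucket_append(&f->bucket, q);
    wait_list_insert(&f->wait_list, &q->w);
}

/*
 * Move up to nr waiters from one futex to another, taking them in
 * wait_list order and using the back-pointer to find each futex_q.
 * Returns the number of waiters actually moved.
 */
int toy_requeue_pi(struct toy_pi_futex *from, struct toy_pi_futex *to, int nr)
{
    int moved = 0;

    while (moved < nr && from->wait_list) {
        struct toy_rt_waiter *w = from->wait_list; /* highest-priority waiter */

        from->wait_list = w->next;
        bucket_remove(&from->bucket, w->q);
        bucket_append(&to->bucket, w->q);
        wait_list_insert(&to->wait_list, w);
        moved++;
    }
    return moved;
}
```

For example, with three waiters queued on the first futex at priorities 30, 10 and 20 (in that arrival order), requeuing two of them moves the priority-10 and priority-20 waiters, in that order, and leaves the priority-30 waiter behind on both of its original lists. The back-pointer from the wait_list element to its futex_q is just one way to do the lookup; an embedded-structure offset calculation would work as well.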