Message-ID: <41123DDD.5040607@nortelnetworks.com>
Date: Thu, 05 Aug 2004 10:02:05 -0400
From: Chris Friesen
To: Ulrich Drepper
CC: Andrew Morton, inaky.perez-gonzalez@intel.com, linux-kernel@vger.kernel.org, robustmutexes@lists.osdl.org, rusty@rustcorp.com.au, mingo@elte.hu, jamie@shareable.org
Subject: Re: [RFC/PATCH] FUSYN Realtime & robust mutexes for Linux, v2.3.1
References: <20040804232123.3906dab6.akpm@osdl.org> <4111DC8C.7050504@redhat.com> <20040805001737.78afb0d6.akpm@osdl.org> <4111E3B5.1070608@redhat.com>
In-Reply-To: <4111E3B5.1070608@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Ulrich Drepper wrote:
> Andrew Morton wrote:
>
>> How large is the slowdown, and on what workloads?
>
> The fast path for all locking primitives etc in nptl today is entirely
> at userlevel.  Normally just a single atomic operation with a dozen
> other instructions.  With the fusyn stuff each and every locking
> operation needs a system call to register/unregister the thread as it
> locks/unlocks mutex/rwlocks/etc.  Go figure how well this works.  We are
> talking about making the fast path of the locking primitives
> two/three/four orders of magnitude more expensive.
> And this for absolutely no benefit for 99.999% of all the code which
> uses threads.

Just a small clarification.  (Rusty already touched on this briefly, but I
think he made a mistake.)

If the arch has atomic compare-and-exchange, then the non-contended case is
entirely userspace and no syscall is needed.  I don't think that the cmpxchg
need be 64-bit.

From the OLS 2004 talk:

int vfulock_lock (&vfulock, flags, pid, &timeout)
{
	unsigned old = VFULOCK_UNLOCKED;
	if (cmpxchg(vfulock,old,pid) != old)
		return 0;
	return SYSCALL(ufulock_lock,3,vfulock,flags,to);
}

That looks like a 32-bit cmpxchg to me.

Also, Inaky reported general operation about 10% slower than NPTL, but said
that he wanted to fix that if possible.

Chris
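
P.S. To make the fast-path argument concrete, here is a minimal sketch in C11 of a lock whose uncontended case is a single 32-bit compare-and-exchange in userspace. It is not the fusyn code. The value of VFULOCK_UNLOCKED, the names vfulock_lock_sketch and slow_path_lock, and the SLOW_PATH_TAKEN return value are all assumptions made for illustration, and the slow path is a stub standing in for the ufulock_lock syscall. Note that the snippet as quoted above returns 0 when the cmpxchg result differs from `old`, which looks inverted relative to the description; the sketch assumes the intent is to return 0 when the exchange succeeds.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Assumed value: the thread does not say what the real constant is. */
#define VFULOCK_UNLOCKED 0u
/* Hypothetical marker returned by the stub below so callers can see it ran. */
#define SLOW_PATH_TAKEN 1

/* Counts how often the (stubbed) kernel path was entered. */
static int slow_path_calls;

/* Hypothetical stand-in for the ufulock_lock syscall; it does not block. */
static int slow_path_lock(_Atomic uint32_t *vfulock)
{
    (void)vfulock;
    slow_path_calls++;
    return SLOW_PATH_TAKEN;
}

/* Try to take the lock by swapping our pid into a 32-bit word. */
static int vfulock_lock_sketch(_Atomic uint32_t *vfulock, uint32_t pid)
{
    uint32_t expected = VFULOCK_UNLOCKED;

    /* Uncontended case: one 32-bit atomic operation, no kernel entry. */
    if (atomic_compare_exchange_strong(vfulock, &expected, pid))
        return 0;

    /* Contended case: someone else owns the word, hand off to the kernel. */
    return slow_path_lock(vfulock);
}
```

Calling vfulock_lock_sketch on an unlocked word stores the caller's pid and never touches the stub; a second call on the same word leaves the owner unchanged and falls through to the stub. The lock word is a uint32_t throughout, which is the point about not needing a 64-bit cmpxchg.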