From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andi Kleen
Subject: Re: [PATCH 0/4] i386 - pte update optimizations
Date: 13 Apr 2007 14:27:09 +0200
Message-ID:
References: <461EE9E5.6060403@vmware.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
In-Reply-To:
Sender: linux-kernel-owner@vger.kernel.org
To: Keir Fraser
Cc: Zachary Amsden, "H. Peter Anvin", Andrew Morton, Virtualization Mailing List, Chris Wright, David Rientjes, Hugh Dickins, Linux Kernel Mailing List
List-Id: virtualization@lists.linuxfoundation.org

Keir Fraser writes:

> On 13/4/07 03:24, "Zachary Amsden" wrote:
>
> >> You do know that P6 and higher don't do locked bus references as long
> >> as the value is in the cache, right?
> >
> > Yes. Even then, last time I clocked instructions, xchg was still slower
> > than read / write, although I could be misremembering. And it's not
> > totally clear that they will always be in cached state, however, and for
> > SMP, we still want to drop the implicit lock in cases where the
> > processor might not know they are cached exclusive, but we know there
> > are no other racing users. And there are plenty of old processors out
> > there to still make it worthwhile.
>
> LOCKed instruction suck really badly on the netburst microarchitecture (like
> factor of 10x, or not far off). I think it's probably because of their side
> effect of serialising memory accesses, causing horrible pipeline stalls.

Unfortunately they tend to be HyperThreaded usually (except for early ones
and Celerons) and need the LOCK anyways.

-Andi
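
For illustration only, the trade-off discussed in this thread (an implicitly locked xchg versus a separate read and write) can be sketched in userspace C11. This is not the code from the patch series: demo_pte_t and both function names are hypothetical, and the sketch assumes a compiler that lowers atomic_exchange to xchg on x86, which GCC and Clang typically do.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a page table entry; not the kernel's pte_t. */
typedef _Atomic uint32_t demo_pte_t;

/*
 * Atomic read-and-clear. On x86 this typically compiles to xchg with a
 * memory operand, which carries an implicit LOCK whether or not any
 * other CPU can actually race on the entry.
 */
static uint32_t demo_pte_get_and_clear_atomic(demo_pte_t *p)
{
    return atomic_exchange(p, 0);
}

/*
 * Separate read, then write. The pair is not atomic as a whole: an
 * update landing between the load and the store would be lost. It is
 * only correct when the caller knows there are no other racing users,
 * which is the case the thread wants to optimize.
 */
static uint32_t demo_pte_get_and_clear_nonatomic(demo_pte_t *p)
{
    uint32_t old = atomic_load_explicit(p, memory_order_relaxed);
    atomic_store_explicit(p, 0, memory_order_relaxed);
    return old;
}
```

Both functions return the same value when nothing else touches the entry; the difference is only in cost and in what happens under concurrent access, which is why the non-atomic variant is unsuitable where a sibling hardware thread can race and the LOCK is still needed.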