Date: Mon, 21 Mar 2016 19:59:50 -0400
From: "Emilio G. Cota"
Message-ID: <20160321235950.GA9356@flamenco>
References: <1458317932-1875-1-git-send-email-alex.bennee@linaro.org>
 <1458317932-1875-2-git-send-email-alex.bennee@linaro.org>
 <20160321215039.GA2466@flamenco>
Subject: Re: [Qemu-devel] [RFC v1 01/11] tcg: move tb_find_fast outside the tb_lock critical section
To: Peter Maydell
Cc: mttcg@listserver.greensocs.com, Peter Crosthwaite, Mark Burton,
 Alvise Rigo, QEMU Developers, Sergey Fedorov, Paolo Bonzini,
 KONRAD Frédéric, Alex Bennée, Andreas Färber, Richard Henderson

On Mon, Mar 21, 2016 at 22:08:06 +0000, Peter Maydell wrote:
> It is not _necessary_, but it is a performance optimization to
> speed up the "missed in the TLB" case. (A TLB flush will wipe
> the tb_jmp_cache table.)
> From the thread where the move-to-front-of-list
> behaviour was added in 2010, benefits cited:

(snip)

> I think what's happening here is that for guest CPUs where TLB
> invalidation happens fairly frequently (notably ARM, because
> we don't model ASIDs in the QEMU TLB and thus have to flush
> the TLB on any context switch) the case of "we didn't hit in
> the TLB but we do have this TB and it was used really recently"
> happens often enough to make it worthwhile for the
> tb_find_physical() code to keep its hash buckets in LRU order.
>
> Obviously that's all five year old data now, so a pinch of
> salt may be indicated, but I'd rather we didn't just remove
> the optimisation without some benchmarking to check that it's
> not significant.

A 2x difference is huge. Good point.

Most of my tests have been on x86-on-x86, and the difference there
(for many CPU-intensive benchmarks such as SPEC) was negligible.

Just tested the current master booting Alex's Debian ARM image,
without LRU, and I see a 20% increase in boot time. I'll add
per-bucket locks to keep the same behaviour without hurting
scalability.

Thanks,

		Emilio
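For illustration, here is a rough sketch of what I have in mind: move-to-front
kept, but each hash bucket guarded by its own lock so that lookups hashing to
different buckets no longer serialize on tb_lock. The name tb_find_physical()
comes from the discussion above; everything else (struct TB, TBBucket,
tb_phys_hash, the bucket count) is a hypothetical simplification, not QEMU's
actual code:

```c
/* Sketch only: per-bucket locking with LRU move-to-front.
 * struct TB, TBBucket and tb_phys_hash are made-up simplifications
 * of QEMU's real TB hashing structures. */
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define TB_HASH_SIZE 256    /* power of two so we can mask */

struct TB {
    uintptr_t phys_pc;              /* key: physical PC of the block */
    struct TB *phys_hash_next;      /* singly-linked bucket chain */
};

struct TBBucket {
    pthread_mutex_t lock;           /* per-bucket, not a global tb_lock */
    struct TB *head;
};

static struct TBBucket tb_phys_hash[TB_HASH_SIZE];

static inline struct TBBucket *tb_bucket(uintptr_t phys_pc)
{
    return &tb_phys_hash[phys_pc & (TB_HASH_SIZE - 1)];
}

void tb_hash_init(void)
{
    for (size_t i = 0; i < TB_HASH_SIZE; i++) {
        pthread_mutex_init(&tb_phys_hash[i].lock, NULL);
        tb_phys_hash[i].head = NULL;
    }
}

void tb_insert(struct TB *tb)
{
    struct TBBucket *b = tb_bucket(tb->phys_pc);

    pthread_mutex_lock(&b->lock);
    tb->phys_hash_next = b->head;
    b->head = tb;
    pthread_mutex_unlock(&b->lock);
}

/* Look up a TB by physical PC. On a hit, unlink it and reinsert it at
 * the head of its bucket, preserving the LRU ordering that the ARM
 * boot-time numbers above depend on. Only this bucket's lock is held. */
struct TB *tb_find_physical(uintptr_t phys_pc)
{
    struct TBBucket *b = tb_bucket(phys_pc);
    struct TB **pprev, *tb;

    pthread_mutex_lock(&b->lock);
    for (pprev = &b->head; (tb = *pprev) != NULL;
         pprev = &tb->phys_hash_next) {
        if (tb->phys_pc == phys_pc) {
            *pprev = tb->phys_hash_next;    /* unlink */
            tb->phys_hash_next = b->head;   /* reinsert at head */
            b->head = tb;
            break;
        }
    }
    pthread_mutex_unlock(&b->lock);
    return tb;
}
```

The point is that the critical section shrinks to a single bucket, so two
vCPUs missing in tb_jmp_cache at the same time only contend when their PCs
hash to the same bucket.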