From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52676)
    by lists.gnu.org with esmtp (Exim 4.71)
    id 1c4nkd-0001Xv-4K
    for qemu-devel@nongnu.org; Thu, 10 Nov 2016 06:42:39 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    id 1c4nkY-0001l3-92
    for qemu-devel@nongnu.org; Thu, 10 Nov 2016 06:42:39 -0500
Received: from mail-wm0-x231.google.com ([2a00:1450:400c:c09::231]:36189)
    by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71)
    id 1c4nkX-0001kE-Nq
    for qemu-devel@nongnu.org; Thu, 10 Nov 2016 06:42:34 -0500
Received: by mail-wm0-x231.google.com with SMTP id g23so35484461wme.1;
    Thu, 10 Nov 2016 03:42:33 -0800 (PST)
References: <1473847013-20191-1-git-send-email-pbonzini@redhat.com>
From: Alex Bennée
In-reply-to: <1473847013-20191-1-git-send-email-pbonzini@redhat.com>
Date: Thu, 10 Nov 2016 11:42:30 +0000
Message-ID: <87k2cb8st5.fsf@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH 0/3] target-arm: cache tbflags in CPUARMState
To: Paolo Bonzini
Cc: qemu-devel@nongnu.org

Paolo Bonzini writes:

> Computing TranslationBlock flags is pretty expensive on ARM, especially
> 32-bit. Because tbflags are computed on every tb lookup, it is not
> unlikely to see cpu_get_tb_cpu_state close to the top of the profile
> now that QHT makes the hash table much more efficient.
>
> However, most tbflags only change when the EL is switched or after
> MSR instructions. Based on this observation, this series caches these
> tbflags in CPUARMState, resulting in a 10-15% speedup on 32-bit code.

Hi,

I'm starting to clear out my review queue but I notice these no longer
apply cleanly to master. Were you going to re-issue the series once
you'd addressed Peter's concerns?

In general I think this is a good idea, but my concern is ensuring that
state changes get picked up, so that we don't end up with inconsistent
state between the real and cached values. I still have the scars from my
last attempt to rationalise cpu.h pstate, aarch64, uncached_cpsr and spsr!

Cheers,

--
Alex Bennée