From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4AE6B691.9090602@aurel32.net>
Date: Tue, 27 Oct 2009 10:00:01 +0100
From: Aurelien Jarno
MIME-Version: 1.0
Subject: Re: [Qemu-devel] [PATCH v2 05/10] target-arm: optimize arm load/store multiple ops
References: <1256386749-85299-1-git-send-email-juha.riihimaki@nokia.com>
 <1256386749-85299-6-git-send-email-juha.riihimaki@nokia.com>
 <20091027083942.GA4399@hall.aurel32.net>
 <4B83D3FD-D222-4483-9EA2-870D131BB01E@nokia.com>
In-Reply-To: <4B83D3FD-D222-4483-9EA2-870D131BB01E@nokia.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
List-Id: qemu-devel.nongnu.org
To: Juha.Riihimaki@nokia.com
Cc: qemu-devel@nongnu.org

Juha.Riihimaki@nokia.com wrote:
> On Oct 27, 2009, at 10:39, ext Aurelien Jarno wrote:
>
>> On Sat, Oct 24, 2009 at 03:19:04PM +0300, juha.riihimaki@nokia.com
>> wrote:
>>> From: Juha Riihimäki
>>>
>>> ARM load/store multiple instructions can be slightly optimized by
>>> loading the register offset constant into a variable outside the
>>> register loop and using the preloaded variable inside the loop,
>>> instead of reloading the offset value into a temporary variable on
>>> each loop iteration.
>>> This causes fewer TCG ops to be generated for an ARM load/store
>>> multiple instruction if more than one register is accessed;
>>> otherwise the number of generated TCG ops is the same.
>>>
>>> Signed-off-by: Juha Riihimäki
>>> Acked-by: Laurent Desnogues
>> This patch breaks the boot of an ARM kernel, as tmp2 is used
>> elsewhere within this code path.
>
> True, I just noticed that as well. This is because the resource leak
> patch was refactored to utilize tmp2 inside the loop as well. I just
> sent a new revision of this patch that uses tmp3 for the constant
> value.
>
>> OTOH, while it reduces the number of TCG ops, that should not impact
>> the generated host asm code, as most (all?) targets are able to add
>> a small constant value to a register in one instruction.
>
> This is true, but I still think it provides a small speed gain as
> there are fewer TCG ops to be processed when generating host code...?

It means fewer TCG ops but one more temp variable, so if there is a
gain, I don't think it is even measurable. OTOH, it makes the code a
bit more complex to read.

I am not really opposed to this patch (and the other patches of the
same kind), but I will need some more arguments to convince me.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
aurelien@aurel32.net                 http://www.aurel32.net