In-Reply-To: <20150803091716.GF30591@aurel32.net>
References: <55B870A9.4090008@gmx.net> <20150729150147.GO11361@aurel32.net>
 <55B99F95.8010603@gmx.net> <20150730075252.GT11361@aurel32.net>
 <55B9DD60.8020801@gmx.net> <20150730085500.GV11361@aurel32.net>
 <20150730155003.GE30591@aurel32.net> <20150731154323.GD23508@aurel32.net>
 <20150803091716.GF30591@aurel32.net>
From: Artyom Tarasenko
Date: Tue, 18 Aug 2015 11:24:30 +0200
Subject: Re: [Qemu-devel] Debian 7.8.0 SPARC64 on qemu - anything i can do to speedup the emulation?
To: Aurelien Jarno
Cc: qemu-devel, Dennis Luehring, Richard Henderson

On Mon, Aug 3, 2015 at 11:17 AM, Aurelien Jarno wrote:
> On 2015-08-03 10:31, Artyom Tarasenko wrote:
>> Hi Aurelien,
>>
>> On Fri, Jul 31, 2015 at 5:43 PM, Aurelien Jarno wrote:
>>
>> >> > It uses a lot of integer functions
>> >> > based on CPU flags, so most of the time is spent computing them in
>> >> > helper_compute_psr.
>> >>
>> >> I wonder if this can be optimized. I guess most RISC CPUs would have a
>> >> similar problem. Unlike x86, the compilers usually optimize
>> >> instructions on flag usage.
>> >> If there is an instruction modifying flags
>> >> in the code, the flags will be used for sure, so it probably makes
>> >> little sense to postpone the flag computation?
>> >
>> > Indeed. ARM and SH4 use one TCG temp per flag, and they can be computed
>> > one by one using setcond. The optimizer and the liveness analysis then
>> > get rid of the unused computation. However while it allows intra-TB
>> > optimization, it prevents any other flags optimization. Therefore the
>> > only way to know if it is a good idea or not is to implement it and
>> > benchmark that, but using a bit more than a single biased benchmark like
>> > the one from sysbench.
>> >
>> > Also note that the current implementation predates the introduction of
>> > setcond, which is necessary to be able to compute the flags using TCG
>> > code.
>>
>> Thanks for explaining it, the problem is much clearer now.
>> Moving to setcond is definitely worth a shot. I'd like to play with it.
>> What would be the minimal entity to change without reworking the complete TCG:
>> a) one flag for one instruction,
>> b) all flags for one instruction,
>> c) one flag for all instructions,
>> or d) all flags for all instructions (gradually moving to setcond is
>> not possible) ?
>
> You should go with the c) option. You can look at how I did this for SH4,
> starting with commit 5ed9a259c164bb9fd2a6fe8a363a4bda2e4a5461.

FWIW I tried this for the Z and N flags, but the resulting code was
slower than the current implementation.

Actually the current implementation is already very well optimized
intra-TB: for the case where a conditional branch/move follows a
compare operation, no external helpers are called.
The unoptimized case is a sequence of multiple cmp and branch
operations (likely created by a "case" statement in the original
source code), especially where cmp is in a delay slot of a branch
instruction.

I wonder whether we always have to finish a TB on a conditional jump.
Maybe it would make sense to translate further if a destination of a
jump is not too far from dc->pc?
The definition of "not too far" is indeed tricky.

Artyom

-- 
Regards,
Artyom Tarasenko

SPARC and PPC PReP under qemu blog: http://tyom.blogspot.com/search/label/qemu
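
For readers following the thread, the two flag strategies discussed above can be sketched in plain C. This is only a toy model under stated assumptions, not QEMU code: the names `LazyCC`, `Flags`, `lazy_subcc`, `lazy_compute` and `eager_subcc` are hypothetical. The lazy variant records the operands and leaves the flag computation to a later helper call (roughly the role helper_compute_psr plays), while the eager variant computes each flag as its own value, which is what lets an optimizer drop the flags nobody reads inside a TB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of the two approaches; not taken from QEMU. */

typedef struct {
    bool n, z, v, c;            /* negative, zero, overflow, carry (borrow) */
} Flags;

/* Lazy scheme: a flag-setting op only remembers its inputs and result. */
typedef struct {
    uint32_t src1, src2, dst;
} LazyCC;

static uint32_t lazy_subcc(LazyCC *cc, uint32_t a, uint32_t b)
{
    cc->src1 = a;
    cc->src2 = b;
    cc->dst = a - b;            /* no flag work is done here */
    return cc->dst;
}

/* Called only when a consumer actually needs the flags. */
static Flags lazy_compute(const LazyCC *cc)
{
    Flags f;
    f.n = (cc->dst >> 31) & 1;
    f.z = (cc->dst == 0);
    f.v = (((cc->src1 ^ cc->src2) & (cc->src1 ^ cc->dst)) >> 31) & 1;
    f.c = (cc->src1 < cc->src2);    /* borrow out of the subtraction */
    return f;
}

/* Eager scheme: one independent value per flag, each a single compare
 * or shift, so unused ones are trivially dead. */
static uint32_t eager_subcc(Flags *f, uint32_t a, uint32_t b)
{
    uint32_t r = a - b;
    f->n = (r >> 31) & 1;
    f->z = (r == 0);
    f->v = (((a ^ b) & (a ^ r)) >> 31) & 1;
    f->c = (a < b);
    return r;
}

/* Both schemes must agree on every flag for the same inputs. */
static bool schemes_agree(uint32_t a, uint32_t b)
{
    LazyCC cc;
    Flags e, l;
    lazy_subcc(&cc, a, b);
    eager_subcc(&e, a, b);
    l = lazy_compute(&cc);
    return e.n == l.n && e.z == l.z && e.v == l.v && e.c == l.c;
}
```

In this model the cost trade-off is visible directly: `lazy_subcc` is cheap for every flag-setting instruction but pays for all four flags in `lazy_compute`, whereas `eager_subcc` pays a little on every instruction and relies on dead-code elimination. Which one wins in practice is a benchmarking question, as noted in the thread.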