From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([140.186.70.92]:48472)
  by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
  id 1QLbyL-0005v1-6y for qemu-devel@nongnu.org;
  Sun, 15 May 2011 10:07:07 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
  (envelope-from ) id 1QLbyK-0006jY-1M for qemu-devel@nongnu.org;
  Sun, 15 May 2011 10:07:05 -0400
Received: from mail-qy0-f180.google.com ([209.85.216.180]:58401)
  by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
  id 1QLbyJ-0006jM-NK for qemu-devel@nongnu.org;
  Sun, 15 May 2011 10:07:03 -0400
Received: by qyk10 with SMTP id 10so2285219qyk.4 for ;
  Sun, 15 May 2011 07:07:03 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20110515140352.GQ30615@hall.aurel32.net>
References: <20110514220447.GA30615@hall.aurel32.net>
  <20110515111428.GH30615@hall.aurel32.net>
  <20110515130241.GO30615@hall.aurel32.net>
  <20110515140352.GQ30615@hall.aurel32.net>
From: Blue Swirl
Date: Sun, 15 May 2011 17:06:43 +0300
Content-Type: text/plain; charset=UTF-8
Subject: Re: [Qemu-devel] [PATCH RFC 00/11] AREG0 elimination
To: Aurelien Jarno
Cc: Laurent Desnogues , qemu-devel

On Sun, May 15, 2011 at 5:03 PM, Aurelien Jarno wrote:
> On Sun, May 15, 2011 at 04:42:05PM +0300, Blue Swirl wrote:
>> On Sun, May 15, 2011 at 4:02 PM, Aurelien Jarno wrote:
>> > On Sun, May 15, 2011 at 03:37:00PM +0300, Blue Swirl wrote:
>> >> On Sun, May 15, 2011 at 3:14 PM, Laurent Desnogues
>> >> wrote:
>> >> > On Sun, May 15, 2011 at 1:33 PM, Blue Swirl wrote:
>> >> > [...]
>> >> >>> x86_64 uses r14 as TCG_AREG0. Despite the instructions being quite
>> >> >>> simple (only 2 movi_i32), the resulting code makes 2 accesses to env to
>> >> >>> save the two registers. Having to reload the env pointer each time to a
>> >> >>> register would clearly increase the size of this TB.
>> >> >>
>> >> >> I don't think TCG would be that simple, instead the pointer would be
>> >> >> loaded only once in this case.
>> >> >
>> >> > Assuming TCG was able to allocate a register for that,
>> >> > it would be live at most for one TB, so you'd have to
>> >> > load it at least once per TB, and with block chaining
>> >> > that wouldn't be efficient as you'd keep on reloading it.
>> >>
>> >> Yes, but if there are better uses, the register can be flushed. Now
>> >> this is not possible since the register is always unavailable.
>> >>
>> >
>> > What are the better uses that justify flushing a register that is going
>> > to be used three or four host asm instructions later?
>>
>> It would obviously replace something else determined by TCG.
>
> The register will be free only for a few host instructions. Could you
> please give a more concrete example of such a usage?
>
>> > In the current generated code, roughly one in every four instructions
>> > references TCG_AREG0, so this register is really needed very often.
>> >
>> > If you think TCG will be faster by having one more register in between,
>> > I suggest you first optimize tcg_reg_alloc(), which simply spills
>> > a random register, even if there are some allocated registers that won't
>> > be used until the end of the TB. You should also check how often
>> > TCG spills a register (in which case it would have benefited from one more
>> > register). It happens less than 2000 times when booting an emulated mips
>> > system on x86_64, while more than 160000 TBs are generated.
>>
>> Right, on a modern CPU with lots of registers, one additional register
>> won't be helpful, but on i386 the situation should be very different,
>> there are very few registers.
>>
>
> On i386, I indeed get a lot more spilled registers, that is 340000. Still
> that number is not that high, it's less than two times per TB.
> If we consider that these register spills are pure loss (which is not
> always the case, sometimes the spilled register is actually never used
> later, so it's just an anticipated save), that's 4 loads/stores per TB.
>
> It means that, to compensate, the env register should not be loaded more
> than 4 times in a TB, which looks quite difficult to achieve given how
> often this register is used.
>
> Please also note that spilling globals currently needs access to the env
> pointer, which might not be loaded, so another register spill is needed to
> load it. This will make the code a lot more complex than now to avoid a
> deadlock (probably by spilling local temps first).

OK, this doesn't look so attractive after all.
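
The break-even reasoning in the quoted paragraphs can be sketched as a small
calculation. This is only an illustration under the thread's own assumptions:
at most two spills per TB on an i386 host, each spill treated as pure loss
costing one store plus one reload, and roughly one host instruction in four
referencing env. The function names are hypothetical and are not QEMU or TCG
APIs.

```python
def spill_accesses_per_tb(spills_per_tb: float) -> float:
    """Memory accesses per TB caused by spills.

    Assumption from the thread: every spill is pure loss and costs
    one store plus one later reload, i.e. two accesses.
    """
    return 2.0 * spills_per_tb


def env_references(host_insns: int) -> int:
    """Rough count of host instructions that reference env in a TB,
    using the thread's estimate of one reference per four instructions."""
    return host_insns // 4


def freeing_env_register_pays_off(spills_per_tb: float, env_loads_per_tb: int) -> bool:
    """True if reloading the env pointer costs no more than the spill
    traffic that one extra free register could remove at best."""
    return env_loads_per_tb <= spill_accesses_per_tb(spills_per_tb)


if __name__ == "__main__":
    spills = 2.0  # upper bound quoted for an i386 host
    print("budget:", spill_accesses_per_tb(spills), "memory accesses per TB")
    for loads in range(1, 9):
        verdict = "break-even or better" if freeing_env_register_pays_off(spills, loads) else "net loss"
        print(loads, "env loads per TB:", verdict)
```

Under these assumptions the budget is four env loads per TB, while a TB of,
say, 40 host instructions would reference env about ten times, so the loads
would have to be shared across many uses for the change to pay off.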