From: Laurent Desnogues
Date: Sun, 15 May 2011 11:27:17 +0200
Subject: Re: [Qemu-devel] [PATCH RFC 00/11] AREG0 elimination
To: Blue Swirl
Cc: qemu-devel, Aurelien Jarno
References: <20110514211616.GA13600@volta.aurel32.net> <20110514220447.GA30615@hall.aurel32.net>

On Sun, May 15, 2011 at 9:15 AM, Blue Swirl wrote:
> On Sun, May 15, 2011 at 1:04 AM, Aurelien Jarno wrote:
>> On Sun, May 15, 2011 at 12:52:35AM +0300, Blue Swirl wrote:
>>> On Sun, May 15, 2011 at 12:16 AM, Aurelien Jarno wrote:
>>> > On Sat, May 14, 2011 at 10:35:20PM +0300, Blue Swirl wrote:
[...]
>>> > The env register is used very often (basically for every load/store, but
>>> > also a lot of helpers), so it makes sense to reserve a register for it.
>>> >
>>> > For what I understand from your patch series, you prefer to pass this
>>> > register explicitly to TCG functions. This basically means this TCG
>>> > global will be loaded to host register as soon as it is used, but also
>>> > regularly, as globals are saved back to their canonical location before
>>> > an helper or a load/store.
>>> >
>>> > So it seems that this patch series will just allowing the "env register"
>>> > to change over time, though it will not spare one more register for the
>>> > TCG code, and it will emit longer TCG code to regularly reload the env
>>> > global into a host register.
>>>
>>> But there will be one more register available in some cases. In other
>>
>> Inside the TCG code, it will basically happens very rarely, given
>> load/store are really the most used instructions, and they need to load
>> the env register.
>
> Not exactly, from a sample run with -d op_opt:
> $ egrep -v -e '^$' -v -e 'OP after' -v -e ' end' -v -e 'Search PC' /tmp/qemu.log | awk '{print $1}' | sort | uniq -c|sort -rn
> 1673966 movi_i32
>  653931 ld_i32
>  607432 mov_i32
>  428684 st_i32
>  326878 movi_i64
>  308626 add_i32
>  283186 call
>  256817 exit_tb
>  207232 nopn
>  189388 goto_tb
>  122398 and_i32
>  117997 shr_i32
>   89107 qemu_ld32
>   82926 set_label
>   82713 brcond_i32
>   67169 qemu_st32
>   55109 or_i32
>   46536 ext32u_i64
>   44288 xor_i32
>   38103 sub_i32
>   26361 shl_i32
>   23218 shl_i64
>   23218 qemu_st64
>   23218 or_i64
>   20474 shr_i64
>   20445 qemu_ld64
>   11161 qemu_ld8u
>   10409 qemu_st8
>    5013 qemu_ld16u
>    3795 qemu_st16
>    2776 qemu_ld8s
>    1915 sar_i32
>    1414 qemu_ld16s
>     839 not_i32
>     579 setcond_i32
>     213 br
>      42 ext32s_i64
>      30 mul_i64

Unless I missed something, this doesn't show the usage of ld/st per TB,
which is what Aurélien was looking for, if I understood correctly. All I
can say is that you had at most 256817 TBs (one per exit_tb) and 234507
qemu_ld/st ops in total, so about one per TB.

Anyway I must be thick, because I fail to see how generated code could
access guest CPU registers without a pointer to the CPU env :-)

IIUC the SPARC translator uses ld_i32/st_i32 mainly for accessing the
guest CPU registers, which due to register windows are reached through a
dedicated global temp.
Is that correct? If so, this is kind of hiding accesses to the CPU env;
all other targets read/write registers through the CPU env (by using
global temps in most cases).

So I think most (if not almost all) TBs will need a pointer to the CPU
env, which is why I think Aurélien's proposal to keep a dedicated
register, loaded in the prologue, is the only way to avoid degrading the
performance of the generated code. (I'd add that this dedicated register
should be the one defined by the ABI as holding the first parameter
value, if that's possible; I'm afraid this is not necessarily a good
idea.)

Laurent
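
The "about one per TB" estimate above can be re-derived from the quoted
histogram. This is only a minimal sketch: it assumes, as the reply does,
that the exit_tb count is an upper bound on the number of TBs, and the
function names (parse_counts, guest_mem_ops_per_tb) are made up for
illustration. It gives an average, not the per-TB distribution.

```python
# Relevant rows copied from the "-d op_opt" histogram quoted in this thread.
HISTOGRAM = """
256817 exit_tb
89107 qemu_ld32
67169 qemu_st32
23218 qemu_st64
20445 qemu_ld64
11161 qemu_ld8u
10409 qemu_st8
5013 qemu_ld16u
3795 qemu_st16
2776 qemu_ld8s
1414 qemu_ld16s
"""


def parse_counts(text):
    """Turn 'count opcode' lines (the output of uniq -c) into a dict."""
    counts = {}
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) == 2:
            counts[fields[1]] = int(fields[0])
    return counts


def guest_mem_ops_per_tb(counts):
    """Return (total qemu_ld*/qemu_st* ops, average per TB).

    Assumption: exit_tb is used as an upper bound on the TB count,
    so the average is a lower bound on the real per-TB figure.
    """
    mem_ops = sum(
        n for op, n in counts.items()
        if op.startswith(("qemu_ld", "qemu_st"))
    )
    return mem_ops, mem_ops / counts["exit_tb"]


counts = parse_counts(HISTOGRAM)
total, ratio = guest_mem_ops_per_tb(counts)
print(total, round(ratio, 2))  # prints: 234507 0.91
```

Note that ld_i32/st_i32 are deliberately left out of the sum: those are
host-side accesses relative to a base pointer, not guest memory accesses.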