From: Alex Bennée
To: "Emilio G. Cota"
Cc: qemu-trivial@nongnu.org, Stefan Weil, qemu-devel@nongnu.org, Richard Henderson
Subject: Re: [Qemu-devel] [PATCH] tcg: optimise memory layout of TCGTemp
Date: Fri, 27 Mar 2015 09:55:03 +0000
Message-ID: <87y4mibw94.fsf@linaro.org>
In-reply-to: <1427313048-26772-1-git-send-email-cota@braap.org>
References: <5510B8C4.1050302@twiddle.net> <1427313048-26772-1-git-send-email-cota@braap.org>

Emilio G. Cota writes:

> This brings down the size of the struct from 56 to 32 bytes on 64-bit,
> and to 16 bytes on 32-bit.

Have you been able to measure any performance improvement with these new
structures? In theory, if aligned with cache lines, performance should
improve but real numbers would be nice.

>
> The appended adds macros to prevent us from mistakenly overflowing
> the bitfields when more elements are added to the corresponding
> enums/macros.

I can see the defines but I can't see any checks. Should we be able to
do a compile time check if TCG_TYPE_COUNT doesn't fit into
TCG_TYPE_NR_BITS?

>
> Note that reg/mem_reg need only 6 bits (for ia64) but for performance
> is probably better to align them to a byte address.
>
> Given that TCGTemp is used in large arrays this leads to a few KBs
> of savings. However, unpacking the bits takes additional code, so
> the net effect depends on the target (host is x86_64):
>
> Before:
> $ find . -name 'tcg.o' | xargs size
>    text    data     bss     dec     hex filename
>   41131   29800      88   71019   1156b ./aarch64-softmmu/tcg/tcg.o
>   37969   29416      96   67481   10799 ./x86_64-linux-user/tcg/tcg.o
>   39354   28816      96   68266   10aaa ./arm-linux-user/tcg/tcg.o
>   40802   29096      88   69986   11162 ./arm-softmmu/tcg/tcg.o
>   39417   29672      88   69177   10e39 ./x86_64-softmmu/tcg/tcg.o
>
> After:
> $ find . -name 'tcg.o' | xargs size
>    text    data     bss     dec     hex filename
>   41187   29800      88   71075   115a3 ./aarch64-softmmu/tcg/tcg.o
>   37777   29416      96   67289   106d9 ./x86_64-linux-user/tcg/tcg.o
>   39162   28816      96   68074   109ea ./arm-linux-user/tcg/tcg.o
>   40858   29096      88   70042   1119a ./arm-softmmu/tcg/tcg.o
>   39473   29672      88   69233   10e71 ./x86_64-softmmu/tcg/tcg.o
>
> Suggested-by: Stefan Weil
> Suggested-by: Richard Henderson
> Signed-off-by: Emilio G. Cota
> ---
>  tcg/tcg.h | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index add7f75..71ae7b2 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -193,7 +193,7 @@ typedef struct TCGPool {
>  typedef enum TCGType {
>      TCG_TYPE_I32,
>      TCG_TYPE_I64,
> -    TCG_TYPE_COUNT, /* number of different types */
> +    TCG_TYPE_COUNT, /* number of different types, see TCG_TYPE_NR_BITS */
>
>      /* An alias for the size of the host register. */
>  #if TCG_TARGET_REG_BITS == 32
> @@ -217,6 +217,9 @@ typedef enum TCGType {
>  #endif
>  } TCGType;
>
> +/* used for bitfield packing to save space */
> +#define TCG_TYPE_NR_BITS 1
> +
>  /* Constants for qemu_ld and qemu_st for the Memory Operation field. */
>  typedef enum TCGMemOp {
>      MO_8 = 0,
> @@ -421,16 +424,14 @@ static inline TCGCond tcg_high_cond(TCGCond c)
>  #define TEMP_VAL_REG 1
>  #define TEMP_VAL_MEM 2
>  #define TEMP_VAL_CONST 3
> +#define TEMP_VAL_NR_BITS 2

A similar compile time check could be added here.
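
To make the suggestion concrete, here is a minimal sketch of the kind of
check I have in mind, assuming a C11 compiler. The enum and the *_NR_BITS
values are copied from the quoted hunks; BITFIELD_MAX and fits_in_bits
are hypothetical names used only for illustration and are not part of
the patch. In the tree itself I believe QEMU_BUILD_BUG_ON would be the
more natural spelling, but plain static_assert keeps the example
standalone.

```c
#include <assert.h> /* provides static_assert in C11 */

/* Mirrored from the quoted patch. */
typedef enum TCGType {
    TCG_TYPE_I32,
    TCG_TYPE_I64,
    TCG_TYPE_COUNT, /* number of different types, see TCG_TYPE_NR_BITS */
} TCGType;

#define TCG_TYPE_NR_BITS 1

#define TEMP_VAL_REG     1
#define TEMP_VAL_MEM     2
#define TEMP_VAL_CONST   3
#define TEMP_VAL_NR_BITS 2

/* Hypothetical helper: largest value an unsigned bitfield of 'bits'
 * bits can hold. Only valid for bits < 32. */
#define BITFIELD_MAX(bits) ((1u << (bits)) - 1u)

/* The build fails here if someone adds an enum value or a TEMP_VAL_*
 * constant without widening the corresponding bitfield. */
static_assert(TCG_TYPE_COUNT - 1 <= BITFIELD_MAX(TCG_TYPE_NR_BITS),
              "TCG_TYPE_NR_BITS too small for TCGType");
static_assert(TEMP_VAL_CONST <= BITFIELD_MAX(TEMP_VAL_NR_BITS),
              "TEMP_VAL_NR_BITS too small for TEMP_VAL_*");

/* Hypothetical runtime counterpart of the same test, handy in unit tests. */
static inline int fits_in_bits(unsigned max_value, unsigned bits)
{
    return max_value <= BITFIELD_MAX(bits);
}
```

The TEMP_VAL_* check assumes TEMP_VAL_CONST stays the largest value; a
TEMP_VAL_COUNT style sentinel would make that less fragile.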
>
> -/* XXX: optimize memory layout */
>  typedef struct TCGTemp {
> -    TCGType base_type;
> -    TCGType type;
> -    int val_type;
> -    int reg;
> -    tcg_target_long val;
> -    int mem_reg;
> -    intptr_t mem_offset;
> +    unsigned int reg:8;
> +    unsigned int mem_reg:8;
> +    unsigned int val_type:TEMP_VAL_NR_BITS;
> +    unsigned int base_type:TCG_TYPE_NR_BITS;
> +    unsigned int type:TCG_TYPE_NR_BITS;
>      unsigned int fixed_reg:1;
>      unsigned int mem_coherent:1;
>      unsigned int mem_allocated:1;
> @@ -438,6 +439,9 @@ typedef struct TCGTemp {
>                                    basic blocks. Otherwise, it is not
>                                    preserved across basic blocks. */
>      unsigned int temp_allocated:1; /* never used for code gen */
> +
> +    tcg_target_long val;
> +    intptr_t mem_offset;
>      const char *name;
>  } TCGTemp;

-- 
Alex Bennée