From: "Alex Bennée" <alex.bennee@linaro.org>
To: "Emilio G. Cota" <cota@braap.org>
Cc: qemu-trivial@nongnu.org, Stefan Weil <sw@weilnetz.de>,
	qemu-devel@nongnu.org, Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-trivial] [Qemu-devel] [PATCH] tcg: optimise memory layout of TCGTemp
Date: Fri, 27 Mar 2015 09:55:03 +0000
Message-ID: <87y4mibw94.fsf@linaro.org>
In-Reply-To: <1427313048-26772-1-git-send-email-cota@braap.org>


Emilio G. Cota <cota@braap.org> writes:

> This brings down the size of the struct from 56 to 32 bytes on 64-bit,
> and to 16 bytes on 32-bit.

Have you been able to measure any performance improvement with the new
structure layout? In theory, if the structs align with cache lines,
performance should improve, but real numbers would be nice.
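
As a quick way to reproduce the quoted sizes, here is a minimal sketch
that rebuilds both field orders from the diff below and prints their
sizes. TCGTempOld, TCGTempNew, elided_flag and print_temp_sizes() are
hypothetical names used only here; the typedef of tcg_target_long to
intptr_t is an assumption (register-sized on the host), and elided_flag
stands in for the one bit-field the hunk does not show.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the real enum; only its size matters here. */
typedef enum TCGType { TCG_TYPE_I32, TCG_TYPE_I64, TCG_TYPE_COUNT } TCGType;

/* Assumption: tcg_target_long is register-sized on the host. */
typedef intptr_t tcg_target_long;

/* Field order before the patch (hypothetical name). */
typedef struct TCGTempOld {
    TCGType base_type;
    TCGType type;
    int val_type;
    int reg;
    tcg_target_long val;
    int mem_reg;
    intptr_t mem_offset;
    unsigned int fixed_reg:1;
    unsigned int mem_coherent:1;
    unsigned int mem_allocated:1;
    unsigned int elided_flag:1;    /* placeholder for the field the hunk skips */
    unsigned int temp_allocated:1;
    const char *name;
} TCGTempOld;

/* Field order after the patch (hypothetical name). */
typedef struct TCGTempNew {
    unsigned int reg:8;
    unsigned int mem_reg:8;
    unsigned int val_type:2;       /* TEMP_VAL_NR_BITS in the patch */
    unsigned int base_type:1;      /* TCG_TYPE_NR_BITS in the patch */
    unsigned int type:1;
    unsigned int fixed_reg:1;
    unsigned int mem_coherent:1;
    unsigned int mem_allocated:1;
    unsigned int elided_flag:1;    /* placeholder, as above */
    unsigned int temp_allocated:1;

    tcg_target_long val;
    intptr_t mem_offset;
    const char *name;
} TCGTempNew;

/* Print both sizes so they can be compared on the build host. */
static inline void print_temp_sizes(void)
{
    /* The packed layout should never be larger than the original one. */
    assert(sizeof(TCGTempNew) <= sizeof(TCGTempOld));
    printf("old: %zu bytes, new: %zu bytes\n",
           sizeof(TCGTempOld), sizeof(TCGTempNew));
}
```

On a typical LP64 host the two sizes should match the 56 and 32 bytes
quoted above. This only confirms the footprint; whether it translates
into a speed-up still needs a benchmark.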

>
> The appended patch adds macros to prevent us from mistakenly overflowing
> the bitfields when more elements are added to the corresponding
> enums/macros.

I can see the defines but I can't see any checks. Could we add a
compile-time check that fails if TCG_TYPE_COUNT doesn't fit into
TCG_TYPE_NR_BITS?
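
Something along these lines is what I have in mind. It is only a
sketch: it uses C11 _Static_assert so that it stands alone, whereas in
tree I believe QEMU_BUILD_BUG_ON would be the more natural spelling.
The helper type_count_fits() is a hypothetical name added to show the
condition at run time.

```c
#include <assert.h>

typedef enum TCGType {
    TCG_TYPE_I32,
    TCG_TYPE_I64,
    TCG_TYPE_COUNT   /* number of different types, see TCG_TYPE_NR_BITS */
} TCGType;

/* used for bitfield packing to save space */
#define TCG_TYPE_NR_BITS 1

/*
 * The bit-field has to hold the values 0 .. TCG_TYPE_COUNT - 1, so the
 * count itself may be at most 2^TCG_TYPE_NR_BITS. The build fails here
 * as soon as a new type is added without widening the field.
 */
_Static_assert(TCG_TYPE_COUNT <= (1 << TCG_TYPE_NR_BITS),
               "TCG_TYPE_NR_BITS is too small for TCGType");

/* Hypothetical run-time mirror of the same condition, for illustration. */
static inline int type_count_fits(unsigned int count, unsigned int nr_bits)
{
    assert(nr_bits < 32);   /* keep the shift below well defined */
    return count <= (1u << nr_bits);
}
```

With the values in the patch (two types, one bit) the assertion holds;
adding a third type without bumping TCG_TYPE_NR_BITS would stop the
build instead of silently truncating.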

>
> Note that reg/mem_reg need only 6 bits (for ia64) but for performance
> it is probably better to align them to a byte address.
>
> Given that TCGTemp is used in large arrays this leads to a few KBs
> of savings. However, unpacking the bits takes additional code, so
> the net effect depends on the target (host is x86_64):
>
> Before:
> $ find . -name 'tcg.o' | xargs size
>    text    data     bss     dec     hex filename
>   41131   29800      88   71019   1156b ./aarch64-softmmu/tcg/tcg.o
>   37969   29416      96   67481   10799 ./x86_64-linux-user/tcg/tcg.o
>   39354   28816      96   68266   10aaa ./arm-linux-user/tcg/tcg.o
>   40802   29096      88   69986   11162 ./arm-softmmu/tcg/tcg.o
>   39417   29672      88   69177   10e39 ./x86_64-softmmu/tcg/tcg.o
>
> After:
> $ find . -name 'tcg.o' | xargs size
>    text    data     bss     dec     hex filename
>   41187   29800      88   71075   115a3 ./aarch64-softmmu/tcg/tcg.o
>   37777   29416      96   67289   106d9 ./x86_64-linux-user/tcg/tcg.o
>   39162   28816      96   68074   109ea ./arm-linux-user/tcg/tcg.o
>   40858   29096      88   70042   1119a ./arm-softmmu/tcg/tcg.o
>   39473   29672      88   69233   10e71 ./x86_64-softmmu/tcg/tcg.o
>
> Suggested-by: Stefan Weil <sw@weilnetz.de>
> Suggested-by: Richard Henderson <rth@twiddle.net>
> Signed-off-by: Emilio G. Cota <cota@braap.org>
> ---
>  tcg/tcg.h | 22 +++++++++++++---------
>  1 file changed, 13 insertions(+), 9 deletions(-)
>
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index add7f75..71ae7b2 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -193,7 +193,7 @@ typedef struct TCGPool {
>  typedef enum TCGType {
>      TCG_TYPE_I32,
>      TCG_TYPE_I64,
> -    TCG_TYPE_COUNT, /* number of different types */
> +    TCG_TYPE_COUNT, /* number of different types, see TCG_TYPE_NR_BITS */
>  
>      /* An alias for the size of the host register.  */
>  #if TCG_TARGET_REG_BITS == 32
> @@ -217,6 +217,9 @@ typedef enum TCGType {
>  #endif
>  } TCGType;
>  
> +/* used for bitfield packing to save space */
> +#define TCG_TYPE_NR_BITS 1
> +
>  /* Constants for qemu_ld and qemu_st for the Memory Operation field.  */
>  typedef enum TCGMemOp {
>      MO_8     = 0,
> @@ -421,16 +424,14 @@ static inline TCGCond tcg_high_cond(TCGCond c)
>  #define TEMP_VAL_REG   1
>  #define TEMP_VAL_MEM   2
>  #define TEMP_VAL_CONST 3
> +#define TEMP_VAL_NR_BITS 2

A similar compile-time check could be added here.
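
Because the TEMP_VAL_* values are plain macros rather than an enum
with a COUNT member, the check needs something to compare against. The
sketch below assumes a hypothetical TEMP_VAL_MAX macro for that. The
helper names and the demo struct are also made up for illustration.
The unchecked helper shows why the check matters: an out-of-range
store into an unsigned bit-field wraps silently.

```c
#include <assert.h>

#define TEMP_VAL_REG   1
#define TEMP_VAL_MEM   2
#define TEMP_VAL_CONST 3
#define TEMP_VAL_NR_BITS 2

/* Hypothetical: the largest TEMP_VAL_* value currently defined. */
#define TEMP_VAL_MAX TEMP_VAL_CONST

/* Fail the build if the largest value no longer fits the bit-field. */
_Static_assert(TEMP_VAL_MAX < (1 << TEMP_VAL_NR_BITS),
               "TEMP_VAL_NR_BITS is too small for the TEMP_VAL_* values");

/* Minimal stand-in for the val_type member of TCGTemp. */
struct val_type_demo {
    unsigned int val_type:TEMP_VAL_NR_BITS;
};

/* Store v and read it back; out-of-range values wrap modulo 2^NR_BITS. */
static inline unsigned int store_val_type_unchecked(unsigned int v)
{
    struct val_type_demo t;
    t.val_type = v;
    return t.val_type;
}

/* Same store, but trap at run time if v would be truncated. */
static inline unsigned int store_val_type_checked(unsigned int v)
{
    assert(v < (1u << TEMP_VAL_NR_BITS));
    return store_val_type_unchecked(v);
}
```

Storing a value of 4 through the unchecked helper reads back as 0,
which is exactly the kind of quiet breakage the static assertion is
meant to catch when a new TEMP_VAL_* is added.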

>  
> -/* XXX: optimize memory layout */
>  typedef struct TCGTemp {
> -    TCGType base_type;
> -    TCGType type;
> -    int val_type;
> -    int reg;
> -    tcg_target_long val;
> -    int mem_reg;
> -    intptr_t mem_offset;
> +    unsigned int reg:8;
> +    unsigned int mem_reg:8;
> +    unsigned int val_type:TEMP_VAL_NR_BITS;
> +    unsigned int base_type:TCG_TYPE_NR_BITS;
> +    unsigned int type:TCG_TYPE_NR_BITS;
>      unsigned int fixed_reg:1;
>      unsigned int mem_coherent:1;
>      unsigned int mem_allocated:1;
> @@ -438,6 +439,9 @@ typedef struct TCGTemp {
>                                    basic blocks. Otherwise, it is not
>                                    preserved across basic blocks. */
>      unsigned int temp_allocated:1; /* never used for code gen */
> +
> +    tcg_target_long val;
> +    intptr_t mem_offset;
>      const char *name;
>  } TCGTemp;

-- 
Alex Bennée

