* [PATCH] target/hppa: Move iaoq registers and thus reduce generated code size
@ 2023-07-30 16:30 Helge Deller
2023-07-30 17:50 ` Richard Henderson
2023-07-31 9:55 ` Philippe Mathieu-Daudé
0 siblings, 2 replies; 3+ messages in thread
From: Helge Deller @ 2023-07-30 16:30 UTC (permalink / raw)
To: Laurent Vivier, Richard Henderson, qemu-devel
On hppa, the Instruction Address Offset Queue (IAOQ) registers specify
the addresses of the next instructions to be executed. Each generated TB
writes those registers at least once, so they are used heavily in
generated code.
Looking at the generated assembly for an x86-64 host, the following code
is generated to write the address 0x7ffe826f into iaoq_f (and 0x7ffe8273
into iaoq_b):
0x7f73e8000184: c7 85 d4 01 00 00 6f 82 movl $0x7ffe826f, 0x1d4(%rbp)
0x7f73e800018c: fe 7f
0x7f73e800018e: c7 85 d8 01 00 00 73 82 movl $0x7ffe8273, 0x1d8(%rbp)
0x7f73e8000196: fe 7f
The trivial change of moving the fields iaoq_f and iaoq_b to the top of
struct CPUArchState reduces their offset from %rbp (from 0x1d4 to 0).
The x86-64 TCG backend can then use an 8-bit displacement instead of a
32-bit one, which saves 3 bytes of generated code per move instruction:
0x7fc1e800018c: c7 45 00 6f 82 fe 7f movl $0x7ffe826f, (%rbp)
0x7fc1e8000193: c7 45 04 73 82 fe 7f movl $0x7ffe8273, 4(%rbp)
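
The byte counts in the two listings can be reproduced with a small sketch. It assumes only the standard x86-64 encoding of "movl $imm32, disp(%rbp)": one opcode byte, one ModRM byte, a 1-byte or 4-byte displacement, and a 4-byte immediate. The function names are hypothetical and are not part of QEMU or TCG.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Encoded length in bytes of "movl $imm32, disp(%rbp)" on x86-64:
 * opcode (1) + ModRM (1) + displacement (1 or 4) + imm32 (4).
 * With %rbp as the base register a displacement is always encoded,
 * so an offset of 0 still costs one displacement byte.
 */
static int movl_imm32_to_rbp_len(int32_t disp)
{
    int disp_len = (disp >= INT8_MIN && disp <= INT8_MAX) ? 1 : 4;
    return 1 + 1 + disp_len + 4;
}

/* Check the lengths against the instruction bytes shown in the listings. */
static void check_disassembly_lengths(void)
{
    assert(movl_imm32_to_rbp_len(0x1d4) == 10); /* c7 85 d4 01 00 00 6f 82 fe 7f */
    assert(movl_imm32_to_rbp_len(0x1d8) == 10);
    assert(movl_imm32_to_rbp_len(0x0) == 7);    /* c7 45 00 6f 82 fe 7f */
    assert(movl_imm32_to_rbp_len(0x4) == 7);
}
```

Calling check_disassembly_lengths() from a test program confirms the 10-byte and 7-byte encodings. Any field placed at an offset of 127 or less from %rbp gets the shorter form.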
Overall this reduces the size of the generated code, not the number of
instructions.
A test run that measures the generated code size while running "/bin/ls"
with qemu-user shows that the code size shrinks from 1616767 to 1569273
bytes, which is ~97% of the former size.
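
As a quick check of the arithmetic, the following sketch recomputes the saving from the two sizes quoted above. The constant and function names are illustrative only.

```c
#include <stdio.h>

/* Generated code sizes in bytes, taken from the test run quoted above. */
enum { SIZE_BEFORE = 1616767, SIZE_AFTER = 1569273 };

/* Absolute number of bytes saved by the change. */
static long bytes_saved(void)
{
    return (long)SIZE_BEFORE - (long)SIZE_AFTER;
}

/* New size as a percentage of the old size. */
static double size_ratio_percent(void)
{
    return 100.0 * SIZE_AFTER / SIZE_BEFORE;
}

/* Print both figures for a quick manual check. */
static void print_summary(void)
{
    printf("saved %ld bytes, new size is %.2f%% of the old size\n",
           bytes_saved(), size_ratio_percent());
}
```

The difference is 47494 bytes, so the new size is a little over 97% of the old one.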
Signed-off-by: Helge Deller <deller@gmx.de>
diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index 9fe79b1242..75c5c0ccf7 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -168,6 +168,9 @@ typedef struct {
 } hppa_tlb_entry;
 typedef struct CPUArchState {
+    target_ureg iaoq_f;      /* front */
+    target_ureg iaoq_b;      /* back, aka next instruction */
+
     target_ureg gr[32];
     uint64_t fr[32];
     uint64_t sr[8];          /* stored shifted into place for gva */
@@ -186,8 +189,6 @@ typedef struct CPUArchState {
     target_ureg psw_cb;      /* in least significant bit of next nibble */
     target_ureg psw_cb_msb;  /* boolean */
-    target_ureg iaoq_f;      /* front */
-    target_ureg iaoq_b;      /* back, aka next instruction */
     uint64_t iasq_f;
     uint64_t iasq_b;
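
The effect of the reordering on field offsets can be sketched with a cut-down stand-in for the struct. MockCPUState and fits_disp8 are hypothetical names, the struct omits most real fields, and target_ureg is assumed to be 32 bits wide (consistent with the 4-byte spacing between iaoq_f and iaoq_b in the disassembly).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-bit target registers, as the 4-byte spacing suggests. */
typedef uint32_t target_ureg;

/* Hypothetical, cut-down stand-in for CPUArchState; not the real layout. */
typedef struct MockCPUState {
    target_ureg iaoq_f;   /* front */
    target_ureg iaoq_b;   /* back, aka next instruction */
    target_ureg gr[32];
    uint64_t fr[32];
    uint64_t sr[8];
} MockCPUState;

/* With the fields first, both offsets are tiny. */
static_assert(offsetof(MockCPUState, iaoq_f) == 0, "iaoq_f must come first");
static_assert(offsetof(MockCPUState, iaoq_b) == 4, "iaoq_b follows iaoq_f");

/* Nonzero if a non-negative offset fits a signed 8-bit displacement. */
static int fits_disp8(size_t offset)
{
    return offset <= 127;
}
```

In this mock layout both IAOQ fields satisfy fits_disp8(), whereas the previous offset of 0x1d4 does not.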
* Re: [PATCH] target/hppa: Move iaoq registers and thus reduce generated code size
From: Richard Henderson @ 2023-07-30 17:50 UTC (permalink / raw)
To: Helge Deller, Laurent Vivier, qemu-devel
On 7/30/23 09:30, Helge Deller wrote:
> On hppa the Instruction Address Offset Queue (IAOQ) registers specifies
> the next to-be-executed instructions addresses. Each generated TB writes those
> registers at least once, so those registers are used heavily in generated
> code.
>
> Looking at the generated assembly, for a x86-64 host this code
> to write the address $0x7ffe826f into iaoq_f is generated:
> 0x7f73e8000184: c7 85 d4 01 00 00 6f 82 movl $0x7ffe826f, 0x1d4(%rbp)
> 0x7f73e800018c: fe 7f
> 0x7f73e800018e: c7 85 d8 01 00 00 73 82 movl $0x7ffe8273, 0x1d8(%rbp)
> 0x7f73e8000196: fe 7f
>
> With the trivial change, by moving the variables iaoq_f and iaoq_b to
> the top of struct CPUArchState, the offset to %rbp is reduced (from
> 0x1d4 to 0), which allows the x86-64 tcg to generate 3 bytes less of
> generated code per move instruction:
> 0x7fc1e800018c: c7 45 00 6f 82 fe 7f movl $0x7ffe826f, (%rbp)
> 0x7fc1e8000193: c7 45 04 73 82 fe 7f movl $0x7ffe8273, 4(%rbp)
>
> Overall this is a reduction of generated code (not a reduction of
> number of instructions).
> A test run with checks the generated code size by running "/bin/ls"
> with qemu-user shows that the code size shrinks from 1616767 to 1569273
> bytes, which is ~97% of the former size.
>
> Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
* Re: [PATCH] target/hppa: Move iaoq registers and thus reduce generated code size
From: Philippe Mathieu-Daudé @ 2023-07-31 9:55 UTC (permalink / raw)
To: Helge Deller, Laurent Vivier, Richard Henderson, qemu-devel
On 30/7/23 18:30, Helge Deller wrote:
> On hppa the Instruction Address Offset Queue (IAOQ) registers specifies
> the next to-be-executed instructions addresses. Each generated TB writes those
> registers at least once, so those registers are used heavily in generated
> code.
>
> Looking at the generated assembly, for a x86-64 host this code
> to write the address $0x7ffe826f into iaoq_f is generated:
> 0x7f73e8000184: c7 85 d4 01 00 00 6f 82 movl $0x7ffe826f, 0x1d4(%rbp)
> 0x7f73e800018c: fe 7f
> 0x7f73e800018e: c7 85 d8 01 00 00 73 82 movl $0x7ffe8273, 0x1d8(%rbp)
> 0x7f73e8000196: fe 7f
>
> With the trivial change, by moving the variables iaoq_f and iaoq_b to
> the top of struct CPUArchState, the offset to %rbp is reduced (from
> 0x1d4 to 0), which allows the x86-64 tcg to generate 3 bytes less of
> generated code per move instruction:
> 0x7fc1e800018c: c7 45 00 6f 82 fe 7f movl $0x7ffe826f, (%rbp)
> 0x7fc1e8000193: c7 45 04 73 82 fe 7f movl $0x7ffe8273, 4(%rbp)
>
> Overall this is a reduction of generated code (not a reduction of
> number of instructions).
> A test run with checks the generated code size by running "/bin/ls"
> with qemu-user shows that the code size shrinks from 1616767 to 1569273
> bytes, which is ~97% of the former size.
>
> Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>