* [PULL 0/1] Hppa linux user speedup patches
From: Helge Deller @ 2023-08-03 22:35 UTC
To: qemu-devel, Peter Maydell, Richard Henderson; +Cc: Helge Deller

The following changes since commit 9ba37026fcf6b7f3f096c0cca3e1e7307802486b:

  Update version for v8.1.0-rc2 release (2023-08-02 08:22:45 -0700)

are available in the Git repository at:

  https://github.com/hdeller/qemu-hppa.git tags/hppa-linux-user-speedup-pull-request

for you to fetch changes up to f8c0fd9804f435a20c3baa4c0c77ba9a02af24ef:

  target/hppa: Move iaoq registers and thus reduce generated code size (2023-08-04 00:02:56 +0200)

----------------------------------------------------------------
Generated code size reduction with linux-user for hppa

Would you please consider pulling this trivial fix, which reduces
the generated code on x86 by ~3% when running linux-user with
the hppa target?

Thanks,
Helge

----------------------------------------------------------------
Helge Deller (1):
  target/hppa: Move iaoq registers and thus reduce generated code size

 target/hppa/cpu.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

-- 
2.41.0

* [PULL 1/1] target/hppa: Move iaoq registers and thus reduce generated code size
From: Helge Deller @ 2023-08-03 22:35 UTC
To: qemu-devel, Peter Maydell, Richard Henderson
Cc: Helge Deller, Philippe Mathieu-Daudé, qemu-stable

On hppa the Instruction Address Offset Queue (IAOQ) registers specify
the addresses of the next instructions to be executed. Each generated TB
writes those registers at least once, so they are used heavily in
generated code.

Looking at the generated assembly for an x86-64 host, this code is
generated to write the address $0x7ffe826f into iaoq_f:

0x7f73e8000184:  c7 85 d4 01 00 00 6f 82  movl     $0x7ffe826f, 0x1d4(%rbp)
0x7f73e800018c:  fe 7f
0x7f73e800018e:  c7 85 d8 01 00 00 73 82  movl     $0x7ffe8273, 0x1d8(%rbp)
0x7f73e8000196:  fe 7f

With the trivial change of moving the variables iaoq_f and iaoq_b to
the top of struct CPUArchState, the offset to %rbp is reduced (from
0x1d4 to 0), which allows the x86-64 tcg to generate 3 fewer bytes per
move instruction:

0x7fc1e800018c:  c7 45 00 6f 82 fe 7f     movl     $0x7ffe826f, (%rbp)
0x7fc1e8000193:  c7 45 04 73 82 fe 7f     movl     $0x7ffe8273, 4(%rbp)

Overall this is a reduction of generated code size (not a reduction of
the number of instructions).

A test run that checks the generated code size by running "/bin/ls"
with qemu-user shows that the code size shrinks from 1616767 to 1569273
bytes, which is ~97% of the former size.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: qemu-stable@nongnu.org
---
 target/hppa/cpu.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/hppa/cpu.h b/target/hppa/cpu.h
index 9fe79b1242..75c5c0ccf7 100644
--- a/target/hppa/cpu.h
+++ b/target/hppa/cpu.h
@@ -168,6 +168,9 @@ typedef struct {
 } hppa_tlb_entry;
 
 typedef struct CPUArchState {
+    target_ureg iaoq_f;      /* front */
+    target_ureg iaoq_b;      /* back, aka next instruction */
+
     target_ureg gr[32];
     uint64_t fr[32];
     uint64_t sr[8];          /* stored shifted into place for gva */
@@ -186,8 +189,6 @@ typedef struct CPUArchState {
     target_ureg psw_cb;      /* in least significant bit of next nibble */
     target_ureg psw_cb_msb;  /* boolean */
 
-    target_ureg iaoq_f;      /* front */
-    target_ureg iaoq_b;      /* back, aka next instruction */
     uint64_t iasq_f;
     uint64_t iasq_b;
 
-- 
2.41.0

* Re: [PULL 0/1] Hppa linux user speedup patches
From: Richard Henderson @ 2023-08-04 4:12 UTC
To: Helge Deller, qemu-devel, Peter Maydell

On 8/3/23 15:35, Helge Deller wrote:
> The following changes since commit 9ba37026fcf6b7f3f096c0cca3e1e7307802486b:
>
>   Update version for v8.1.0-rc2 release (2023-08-02 08:22:45 -0700)
>
> are available in the Git repository at:
>
>   https://github.com/hdeller/qemu-hppa.git tags/hppa-linux-user-speedup-pull-request
>
> for you to fetch changes up to f8c0fd9804f435a20c3baa4c0c77ba9a02af24ef:
>
>   target/hppa: Move iaoq registers and thus reduce generated code size (2023-08-04 00:02:56 +0200)
>
> ----------------------------------------------------------------
> Generated code size reduction with linux-user for hppa
>
> Would you please consider pulling this trivial fix, which reduces
> the generated code on x86 by ~3% when running linux-user with
> the hppa target?

Applied, thanks. Please update https://wiki.qemu.org/ChangeLog/8.1 as appropriate.

r~