From: Paolo Bonzini <pbonzini@redhat.com>
To: Richard Henderson <rth@twiddle.net>
Cc: qemu-ppc@nongnu.org, qemu-devel@nongnu.org, aurelien@aurel32.net
Subject: Re: [Qemu-devel] [PATCH 2/2] tcg-ppc: use new return-argument ld/st helpers
Date: Thu, 05 Sep 2013 17:41:32 +0200
Message-ID: <5228A62C.90507@redhat.com>
In-Reply-To: <5228A08C.5040106@twiddle.net>

On 05/09/2013 17:17, Richard Henderson wrote:
> On 09/05/2013 01:22 AM, Paolo Bonzini wrote:
>> These use a 32-bit load-of-immediate to save a mflr+addi+mtlr sequence.
>> Tested with a Windows 98 guest (pretty much the most recent thing I
>> could run on my PPC machine) and kvm-unit-tests's sieve.flat. The
>> speed up for sieve.flat is as high as 10% for qemu-system-i386, 25%
>> (no kidding) for qemu-system-x86_64 on my PowerBook G4.
>
> See also the series beginning at
>
> http://lists.nongnu.org/archive/html/qemu-devel/2013-09/msg00025.html
>
> The major difference is that I use a conditional call out of the fast
> path, which lets me later just use one mflr to pass the parameter. I
> also, perhaps foolishly, got rid of the trampolines. E.g.
>
> 0xf57a1838: rlwinm r3,r15,24,20,27
> 0xf57a183c: rlwinm r0,r15,0,30,19
> 0xf57a1840: add r3,r3,r27
> 0xf57a1844: lwz r4,6436(r3)
> 0xf57a1848: cmpw cr7,r0,r4
> 0xf57a184c: lwz r3,6444(r3)
> 0xf57a1850: bnel- cr7,0xf57a1910
> 0xf57a1854: stwx r16,r3,r15
> ...
> 0xf57a1910: mr r3,r27
> 0xf57a1914: mr r4,r15
> 0xf57a1918: mr r5,r16
> 0xf57a191c: li r6,1
> 0xf57a1920: mflr r7
> 0xf57a1924: lis r0,4120
> 0xf57a1928: ori r0,r0,45040
> 0xf57a192c: mtctr r0
> 0xf57a1930: bctrl
> 0xf57a1934: b 0xf57a1858
>
> I don't see anything technically wrong with your patch. But I'd be
> interested to compare vs mine.

Sure, I'll give it a try tomorrow or over the weekend.

The G4 in my computer must simply hate the mflr/add/mtlr sequence in the
trampoline; there's no other explanation for such a huge performance
improvement. So even though I suspect that there won't be much
difference between our patches, it's good to check which is better in case
your sequences are triggering something as bad. The bnel/mflr is a nice
trick to save one instruction, though!
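
For anyone reading the quoted listing without a PowerPC reference at hand,
here is a minimal sketch of what the two rlwinm instructions and the lis/ori
pair compute. The helper names (ppc_mask, rlwinm, load_imm32) are made up for
this illustration and are not QEMU functions. Reading the first rlwinm as a
TLB index and the second as a page-plus-alignment comparator is my
interpretation of the listing, not something stated in the thread.

```python
MASK32 = 0xFFFFFFFF

def ppc_mask(mb, me):
    """32-bit mask with IBM-numbered bits mb..me set (bit 0 is the MSB).

    When mb > me the mask wraps around, as in `rlwinm r0,r15,0,30,19`.
    """
    start = MASK32 >> mb                    # bits mb..31
    end = (MASK32 << (31 - me)) & MASK32    # bits 0..me
    return (start & end) if mb <= me else (start | end)

def rlwinm(rs, sh, mb, me):
    """Rotate the 32-bit value rs left by sh, then AND with ppc_mask(mb, me)."""
    rs &= MASK32
    rotated = ((rs << sh) | (rs >> (32 - sh))) & MASK32
    return rotated & ppc_mask(mb, me)

def load_imm32(high, low):
    """Value left in a 32-bit register by `lis rD,high` then `ori rD,rD,low`."""
    return ((high & 0xFFFF) << 16) | (low & 0xFFFF)

# Hypothetical guest address standing in for r15.
addr = 0x12345678

# rlwinm r3,r15,24,20,27: same as (addr >> 8) & 0xFF0, which looks like
# an 8-bit page index scaled by a 16-byte entry size (assumption).
index_offset = rlwinm(addr, 24, 20, 27)   # 0x450

# rlwinm r0,r15,0,30,19: keeps the 4 KiB page number plus the low two
# bits, so a misaligned word access fails the cmpw (assumption).
comparator = rlwinm(addr, 0, 30, 19)      # 0x12345000

# lis r0,4120 ; ori r0,r0,45040: the address moved to CTR before bctrl.
helper_addr = load_imm32(4120, 45040)     # 0x1018AFF0
```

On the bnel/mflr point: bnel writes the address of the following instruction
(0xf57a1854 in the listing) into LR when the branch is taken, so the slow
path can pick it up with a single mflr r7 before bctrl overwrites LR.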

Regarding removal of the trampolines, the extra icache cost should be a
wash now that they are half the size, but I'd still prefer it to be a
separate patch.

Paolo