From: Richard Henderson <rth@twiddle.net>
To: Alexander Graf <agraf@suse.de>
Cc: "av1474@comtv.ru" <av1474@comtv.ru>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"aurelien@aurel32.net" <aurelien@aurel32.net>
Subject: Re: [Qemu-devel] [PATCH v3 17/27] tcg-ppc64: Implement bswap64
Date: Tue, 02 Apr 2013 08:12:01 -0700
Message-ID: <515AF541.1040900@twiddle.net>
In-Reply-To: <34B5C19C-785B-43E0-B1A5-F75D16EA6F09@suse.de>
[-- Attachment #1: Type: text/plain, Size: 848 bytes --]
On 2013-04-02 07:41, Alexander Graf wrote:
>> On 2013-04-01 23:34, Alexander Graf wrote:
>>> Is this faster than a load/store with std/ldbrx?
>>
>> Hmm. Almost certainly not. And since we've got stack space
>> allocated for function calls, we've got scratch space to do it in.
>>
>> Probably similar for bswap32 too, eh?
>
> Depends - memory load/store doesn't come for free and bswap32 is quite short.
>
>>
>> I'll do a tiny bit of benchmarking for power7.
>
> Cool, thanks a bunch :)

Heh. "Almost certainly not" indeed. Unless I've made some silly mistake,
going through memory stalls badly. No store buffer forwarding on power7?

With the following test case, time reports:

  f1  2.967s  (rotate-and-insert sequence, registers only)
  f2  8.930s  (std + ldbrx)
  f3  7.071s  (stdbrx + ld)
  f4  7.166s  (std + ld, no byte swap)

Note that f4 is a plain store/load pair, included to estimate what the
store buffer forwarding delay might be.

r~
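
The f4 measurement above can be approximated on other hosts with a portable sketch. This is not part of the attached test case: the names `roundtrip`, `chained_roundtrips` and `seconds_for` are hypothetical, and unlike z.c (which passes the loop counter and discards the result) this variant feeds each loaded value into the next store, so every store depends on the previous load. Treat it as one possible way to expose a store-to-load delay, not as a reproduction of the numbers reported here.

```c
#include <time.h>

/* Store x, then immediately reload it from the same address.
   The volatile qualifier keeps the compiler from eliding either access. */
static long roundtrip(long x, volatile long *mem)
{
    *mem = x;
    return *mem;
}

/* Chain iters store/load pairs: each store uses the value produced by
   the previous load, so the pairs cannot overlap freely. */
static long chained_roundtrips(long iters)
{
    volatile long slot = 0;
    long x = 0;
    long i;

    for (i = 0; i < iters; ++i)
        x = roundtrip(x + 1, &slot);
    return x;   /* equals iters */
}

/* CPU time in seconds spent on iters chained round trips (hypothetical helper). */
static double seconds_for(long iters)
{
    clock_t start = clock();
    long r = chained_roundtrips(iters);
    clock_t end = clock();

    (void)r;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Dividing `seconds_for(n)` by `n` gives a rough per-pair cost that includes loop overhead; comparing it against a register-only loop of the same length is left to the reader.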
[-- Attachment #2: z.c --]
[-- Type: text/x-csrc, Size: 987 bytes --]
/* f1: byte swap in registers only, using rotate-and-insert.
   Operands: %0 = r (result), %1 = t (scratch), %2 = x (input).  */
static long __attribute__((noinline)) f1(long x, long *mem)
{
    long r, t;
    asm volatile (
        "rlwinm %0,%2,8,0,31\n\t"
        "rlwimi %0,%2,24,0,7\n\t"
        "rlwimi %0,%2,24,16,23\n\t"
        "rldicl %0,%0,32,0\n\t"
        "rldicl %1,%2,32,0\n\t"
        "rlwimi %0,%1,8,0,31\n\t"
        "rlwimi %0,%1,24,0,7\n\t"
        "rlwimi %0,%1,24,16,23"
        : "=&r"(r), "=&r"(t)
        : "r"(x));
    return r;
}
/* f2: normal store, byte-reversed load.  */
static long __attribute__((noinline)) f2(long x, long *mem)
{
    long r;
    asm volatile ("std %1,0(%2); ldbrx %0,0,%2"
                  : "=r"(r) : "r"(x), "b"(mem) : "memory");
    return r;
}

/* f3: byte-reversed store, normal load.  */
static long __attribute__((noinline)) f3(long x, long *mem)
{
    long r;
    asm volatile ("stdbrx %1,0,%2; ld %0,0(%2)"
                  : "=r"(r) : "r"(x), "b"(mem) : "memory");
    return r;
}

/* f4: normal store/load pair, no byte swap.  */
static long __attribute__((noinline)) f4(long x, long *mem)
{
    long r;
    asm volatile ("std %1,0(%2); ld %0,0(%2)"
                  : "=r"(r) : "r"(x), "b"(mem) : "memory");
    return r;
}
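
/* N is not defined in this file; it selects f1..f4 and presumably
   comes from the compiler command line (e.g. -DN=1).  */
#define D1(x,y) x##y
#define DO(x) D1(f,x)

int main(void)
{
    long tmp, i;
    for (i = 0; i < 1000000000; ++i)
        DO(N)(i, &tmp);
    return 0;
}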
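
To check on any host that the rotate-and-insert sequence in f1 really computes a 64-bit byte swap, the three instructions can be modeled in portable C. The sketch below is an illustration, not part of z.c: the helper names (`rotl32`, `rotl64`, `mask32`, `rlwinm`, `rlwimi`, `model_bswap64`, `ref_bswap64`) are hypothetical, the masks cover only the non-wrapping case MB <= ME that f1 uses, and `rldicl` with MB = 0 is modeled as a plain 64-bit rotate.

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

static uint64_t rotl64(uint64_t v, unsigned n)
{
    n &= 63;
    return n ? (v << n) | (v >> (64 - n)) : v;
}

/* 32-bit mask with bits mb..me set, IBM numbering (bit 0 is the MSB).
   Only mb <= me is modeled here. */
static uint32_t mask32(unsigned mb, unsigned me)
{
    return (0xffffffffu >> mb) & (0xffffffffu << (31 - me));
}

/* rlwinm: rotate the low word left, AND with the mask; upper half is 0. */
static uint64_t rlwinm(uint64_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32((uint32_t)rs, sh) & mask32(mb, me);
}

/* rlwimi: rotate the low word left, insert under the mask into ra;
   all bits of ra outside the mask (including the upper half) are kept. */
static uint64_t rlwimi(uint64_t ra, uint64_t rs,
                       unsigned sh, unsigned mb, unsigned me)
{
    uint64_t m = mask32(mb, me);
    return (rotl32((uint32_t)rs, sh) & m) | (ra & ~m);
}

/* The same eight steps as f1, one C statement per instruction. */
static uint64_t model_bswap64(uint64_t x)
{
    uint64_t r, t;

    r = rlwinm(x, 8, 0, 31);
    r = rlwimi(r, x, 24, 0, 7);
    r = rlwimi(r, x, 24, 16, 23);   /* low word of x is now swapped */
    r = rotl64(r, 32);              /* rldicl r,r,32,0 */
    t = rotl64(x, 32);              /* rldicl t,x,32,0 */
    r = rlwimi(r, t, 8, 0, 31);
    r = rlwimi(r, t, 24, 0, 7);
    r = rlwimi(r, t, 24, 16, 23);   /* high word of x swapped into low half */
    return r;
}

/* Straightforward reference: move bytes one at a time. */
static uint64_t ref_bswap64(uint64_t x)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 8; ++i) {
        r = (r << 8) | (x & 0xff);
        x >>= 8;
    }
    return r;
}
```

For example, `model_bswap64(0x0102030405060708)` yields `0x0807060504030201`, matching `ref_bswap64` on the same input.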