qemu-devel.nongnu.org archive mirror
From: Richard Henderson <rth@twiddle.net>
To: Alexander Graf <agraf@suse.de>
Cc: "av1474@comtv.ru" <av1474@comtv.ru>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"aurelien@aurel32.net" <aurelien@aurel32.net>
Subject: Re: [Qemu-devel] [PATCH v3 17/27] tcg-ppc64: Implement bswap64
Date: Tue, 02 Apr 2013 08:12:01 -0700	[thread overview]
Message-ID: <515AF541.1040900@twiddle.net> (raw)
In-Reply-To: <34B5C19C-785B-43E0-B1A5-F75D16EA6F09@suse.de>

[-- Attachment #1: Type: text/plain, Size: 848 bytes --]

On 2013-04-02 07:41, Alexander Graf wrote:
>> On 2013-04-01 23:34, Alexander Graf wrote:
>>> Is this faster than a load/store with std/ldbrx?
>>
>> Hmm.  Almost certainly not.  And since we've got stack space
>> allocated for function calls, we've got scratch space to do it in.
>>
>> Probably similar for bswap32 too, eh?
>
> Depends - memory load/store doesn't come for free and bswap32 is quite short.
>
>>
>> I'll do a tiny bit of benchmarking for power7.
>
> Cool, thanks a bunch :)

Heh.  "Almost certainly not" indeed.  Unless I've made some silly mistake,
going through memory stalls badly.  No store buffer forwarding on power7?

With the following test case, time reports:

f1		2.967s
f2		8.930s
f3		7.071s
f4		7.166s

Note that f4 is a plain store/load pair, included to gauge what the
store buffer forwarding delay might be.
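
Since the loop in z.c runs 10^9 iterations, each reported time in seconds is also the average cost of one iteration in nanoseconds, loop and call overhead included. A minimal sketch of that arithmetic follows; the helper names `ns_per_iter` and `slowdown` are hypothetical and not part of the test case.

```c
#include <assert.h>

/* Iteration count used by the loop in z.c. */
#define ITERATIONS 1000000000L

/* Average wall-clock cost of one loop iteration, in nanoseconds. */
static double ns_per_iter(double seconds, long iters)
{
    assert(iters > 0);
    return seconds * 1e9 / (double)iters;
}

/* Slowdown of one variant relative to a baseline run. */
static double slowdown(double variant_seconds, double baseline_seconds)
{
    assert(baseline_seconds > 0.0);
    return variant_seconds / baseline_seconds;
}
```

For the figures above, `slowdown(8.930, 2.967)` comes out at roughly 3, so the std/ldbrx path is about three times slower than the in-register sequence on this machine.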


r~

[-- Attachment #2: z.c --]
[-- Type: text/x-csrc, Size: 987 bytes --]

/* f1: in-register bswap64 via rotate-and-insert; no memory traffic. */
static long __attribute__((noinline)) f1(long x, long *mem)
{
  long r, t;
  /* Operands: %0 = r (result), %1 = t (scratch), %2 = x (input). */
  asm volatile (
       "rlwinm %0,%2,8,0,31\n\
	rlwimi %0,%2,24,0,7\n\
	rlwimi %0,%2,24,16,23\n\
	rldicl %0,%0,32,0\n\
	rldicl %1,%2,32,0\n\
	rlwimi %0,%1,8,0,31\n\
	rlwimi %0,%1,24,0,7\n\
	rlwimi %0,%1,24,16,23"
	: "=&r"(r), "=r"(t)
	: "r"(x));
  return r;
}

/* f2: plain store, then byte-reversed load. */
static long __attribute__((noinline)) f2(long x, long *mem)
{
  long r;
  asm volatile ("std %1,0(%2); ldbrx %0,0,%2"
		: "=r"(r) : "r"(x), "b"(mem) : "memory");
  return r;
}

/* f3: byte-reversed store, then plain load. */
static long __attribute__((noinline)) f3(long x, long *mem)
{
  long r;
  asm volatile ("stdbrx %1,0,%2; ld %0,0(%2)"
		: "=r"(r) : "r"(x), "b"(mem) : "memory");
  return r;
}

/* f4: plain store/load pair with no byte swap, as a baseline. */
static long __attribute__((noinline)) f4(long x, long *mem)
{
  long r;
  asm volatile ("std %1,0(%2); ld %0,0(%2)"
		: "=r"(r) : "r"(x), "b"(mem) : "memory");
  return r;
}

#define D1(x,y) x##y
#define DO(x)   D1(f,x)

/* N selects the variant at compile time, e.g. -DN=1 for f1. */
int main(void)
{
    long tmp, i;
    for (i = 0; i < 1000000000; ++i)
      DO(N)(i, &tmp);
    return 0;
}
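
The rotate-and-insert sequence in f1 byte-swaps each 32-bit half and then exchanges the halves. A portable sketch of the same idea, which could serve as a reference when checking what the asm variants return, might look like this. The `ref_` names are hypothetical and are not taken from z.c or from QEMU.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-swap one 32-bit word with shifts and masks. */
static uint32_t ref_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Swap the bytes within each 32-bit half, then exchange the halves. */
static uint64_t ref_bswap64(uint64_t x)
{
    uint64_t lo = ref_bswap32((uint32_t)x);
    uint64_t hi = ref_bswap32((uint32_t)(x >> 32));
    return (lo << 32) | hi;
}

/* Sanity checks: a known pattern, and swapping twice is the identity. */
static void ref_bswap64_selfcheck(void)
{
    assert(ref_bswap64(0x0102030405060708ull) == 0x0807060504030201ull);
    assert(ref_bswap64(ref_bswap64(0xdeadbeefcafef00dull))
           == 0xdeadbeefcafef00dull);
}
```

Comparing `ref_bswap64(x)` against the value returned by f1, f2, or f3 for a handful of inputs would confirm that all three compute the same swap before their timings are compared.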
