Re: [Qemu-devel] [RFC] Use of host vector operations in host helper functions


From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Henderson <rth@twiddle.net>
Cc: qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [RFC] Use of host vector operations in host helper functions
Date: Sat, 13 Sep 2014 17:02:38 +0100
Message-ID: <87k3571pb5.fsf@linaro.org>
In-Reply-To: <53FF4D6B.2050202@twiddle.net>


Richard Henderson writes:

> Most of the time, guest vector operations are rare enough that it doesn't
> really matter that we implement them with a loop around integer operations.
>
> But for target-alpha, there's one vector comparison operation that appears in
> every guest string operation, and is used heavily enough that it's in the top
> 10 functions in the profile: cmpbge (compare bytes greater or equal).

For a helper function to top the profile is pretty impressive. I wonder
how it compares when you break it down by basic blocks?
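
For readers unfamiliar with the instruction, here is a minimal sketch of
cmpbge written as the kind of "loop around integer operations" described
above. The function name cmpbge_scalar is hypothetical and this is not
the actual target-alpha helper; it only illustrates the semantics: bit i
of the result is set when byte i of the first operand is greater than or
equal to (unsigned) byte i of the second.

```c
#include <stdint.h>

/*
 * Illustrative scalar cmpbge (hypothetical name, not QEMU's helper).
 * Compares the eight bytes of a and b as unsigned values and returns an
 * 8-bit mask with bit i set when byte i of a >= byte i of b.
 */
uint64_t cmpbge_scalar(uint64_t a, uint64_t b)
{
    uint64_t mask = 0;

    for (int i = 0; i < 8; i++) {
        uint8_t a_byte = (uint8_t)(a >> (i * 8));
        uint8_t b_byte = (uint8_t)(b >> (i * 8));

        if (a_byte >= b_byte) {
            mask |= (uint64_t)1 << i;
        }
    }
    return mask;
}
```

With the first operand set to zero the mask marks the zero bytes of the
second operand, which is why the instruction shows up in guest string
code: cmpbge_scalar(0, x) has bit i set exactly when byte i of x is 0.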

> I did some experiments, where I rewrote the function using gcc's "generic"
> vector types and builtin operations.
>
> <snip>
>
> GCC doesn't do a half-bad job on other hosts either:
>
> aarch64:
>   b4:   4f000400        movi    v0.4s, #0x0
>   b8:   4ea01c01        mov     v1.16b, v0.16b
>   bc:   4e081c00        mov     v0.d[0], x0
>   c0:   4e081c21        mov     v1.d[0], x1
>   c4:   6e213c00        cmhs    v0.16b, v0.16b, v1.16b
>   c8:   4e083c00        mov     x0, v0.d[0]
>   cc:   9200c000        and     x0, x0, #0x101010101010101
>   d0:   aa401c00        orr     x0, x0, x0, lsr #7
>   d4:   aa403800        orr     x0, x0, x0, lsr #14
>   d8:   aa407000        orr     x0, x0, x0, lsr #28
>   dc:   53001c00        uxtb    w0, w0
>   e0:   d65f03c0        ret
>
> Of course aarch64 *does* have an 8-byte vector size that gcc knows how to use.
>  If I adjust the patch above to use it, only the first two insns are eliminated
> -- surely not a measurable difference.
>
> power7:
>   ...
>   vcmpgtub 13,0,1
>   vcmpequb 0,0,1
>   xxlor 32,45,32
>   ...
>
>
> But I guess the larger question here is: how much of this should we accept?
>
> (0) Ignore this and do nothing?
>
> (1) No general infrastructure.  Special case this one insn with #ifdef __SSE2__
> and ignore anything else.

Not a big fan of special cases that are arch dependent.

> (2) Put in just enough infrastructure to know if compiler support for general
> vectors is available, and then use it ad hoc when such functions are shown to
> be high on the profile?
>
> (3) Put in more infrastructure and allow it to be used to implement most guest
> vector operations, possibly tidying their implementations?
<snip>

(4) Consider supporting generic vector operations in the TCG?

While making helper functions faster is good, I've wondered if there is
enough commonality across the various SIMD/vector operations that we could
add TCG ops to translate them? The ops could fall back to generic helper
functions using the GCC intrinsics if we know there is no decent
back-end support for them?
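
To make the fallback idea concrete, here is a rough sketch of what a
generic helper built on the GCC vector extensions might look like, with
a plain integer loop for compilers that lack them. It is not the snipped
patch and not QEMU code: the names v8u8 and cmpbge_sketch are
hypothetical, and the __GNUC__ check is only an assumption about how
availability might be detected. The mask-gathering shifts follow the
same and/orr sequence visible in the aarch64 output quoted above.

```c
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__)
/* Eight unsigned bytes in one 64-bit vector (GCC/Clang extension). */
typedef uint8_t v8u8 __attribute__((vector_size(8)));
#endif

/*
 * Hypothetical cmpbge helper: bit i of the result is set when byte i
 * of a is >= byte i of b (unsigned).
 */
uint64_t cmpbge_sketch(uint64_t a, uint64_t b)
{
#if defined(__GNUC__)
    v8u8 va, vb, m;
    uint64_t r;

    memcpy(&va, &a, sizeof(va));
    memcpy(&vb, &b, sizeof(vb));

    /* Lane-wise compare: each lane becomes 0xff if true, 0x00 if false. */
    m = (v8u8)(va >= vb);
    memcpy(&r, &m, sizeof(r));

    /* Keep one bit per byte, then gather the eight bits into bits 0..7. */
    r &= 0x0101010101010101ull;
    r |= r >> 7;
    r |= r >> 14;
    r |= r >> 28;
    return r & 0xff;
#else
    /* Fallback: loop around integer operations. */
    uint64_t mask = 0;

    for (int i = 0; i < 8; i++) {
        if ((uint8_t)(a >> (i * 8)) >= (uint8_t)(b >> (i * 8))) {
            mask |= (uint64_t)1 << i;
        }
    }
    return mask;
#endif
}
```

Copying through memcpy keeps lane i tied to the same byte of a, b and
the result on either host endianness, so the gathered mask should not
depend on byte order.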


-- 
Alex Bennée


Thread overview: 13+ messages
2014-08-28 15:40 [Qemu-devel] [RFC] Use of host vector operations in host helper functions Richard Henderson
2014-09-13 16:02 ` Alex Bennée [this message]
2014-10-16  8:56   ` [Qemu-devel] [PATCH RFC 0/7] Translate guest vector operations to host vector operations Kirill Batuzov
2014-10-16  8:56     ` [Qemu-devel] [PATCH RFC 1/7] tcg: add support for 128bit vector type Kirill Batuzov
2014-10-16  8:56     ` [Qemu-devel] [PATCH RFC 2/7] tcg: store ENV global in TCGContext Kirill Batuzov
2014-10-16  8:56     ` [Qemu-devel] [PATCH RFC 3/7] tcg: add sync_temp opcode Kirill Batuzov
2014-10-16  8:56     ` [Qemu-devel] [PATCH RFC 4/7] tcg: add add_i32x4 opcode Kirill Batuzov
2014-10-16  8:56     ` [Qemu-devel] [PATCH RFC 5/7] target-arm: support access to 128-bit guest registers as globals Kirill Batuzov
2014-10-16  8:56     ` [Qemu-devel] [PATCH RFC 6/7] target-arm: use add_i32x4 opcode to handle vadd.i32 instruction Kirill Batuzov
2014-10-16  8:56     ` [Qemu-devel] [PATCH RFC 7/7] tcg/i386: add support for vector opcodes Kirill Batuzov
2014-10-16 10:03     ` [Qemu-devel] [PATCH RFC 0/7] Translate guest vector operations to host vector operations Alex Bennée
2014-10-16 11:07       ` Kirill Batuzov
2014-11-11 11:58     ` Kirill Batuzov
