From: Bharata B Rao <bharata@linux.vnet.ibm.com>
To: David Gibson <david@gibson.dropbear.id.au>
Cc: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>,
qemu-ppc@nongnu.org, rth@twiddle.net, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 13/13] target-ppc: Add xxperm and xxpermr instructions
Date: Tue, 6 Dec 2016 14:25:41 +0530 [thread overview]
Message-ID: <20161206085541.GD4211@in.ibm.com> (raw)
In-Reply-To: <20161206041122.GP32366@umbus.fritz.box>
On Tue, Dec 06, 2016 at 03:11:22PM +1100, David Gibson wrote:
> On Mon, Dec 05, 2016 at 04:55:30PM +0530, Nikunj A Dadhania wrote:
> > From: Bharata B Rao <bharata@linux.vnet.ibm.com>
> >
> > xxperm: VSX Vector Permute
> > xxpermr: VSX Vector Permute Right-indexed
> >
> > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
> > Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> > ---
> > target-ppc/fpu_helper.c | 50 +++++++++++++++++++++++++++++++++++++
> > target-ppc/helper.h | 2 ++
> > target-ppc/translate/vsx-impl.inc.c | 2 ++
> > target-ppc/translate/vsx-ops.inc.c | 2 ++
> > 4 files changed, 56 insertions(+)
> >
> > diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> > index 3b867cf..be552c7 100644
> > --- a/target-ppc/fpu_helper.c
> > +++ b/target-ppc/fpu_helper.c
> > @@ -2869,3 +2869,53 @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
> > float_check_status(env);
> > return xt;
> > }
> > +
> > +static void vsr_copy_256(ppc_vsr_t *xa, ppc_vsr_t *xt, int8_t *src)
> > +{
> > +#if defined(HOST_WORDS_BIGENDIAN)
> > + memcpy(src, xa, sizeof(*xa));
> > + memcpy(src + 16, xt, sizeof(*xt));
> > +#else
> > + memcpy(src, xt, sizeof(*xt));
> > + memcpy(src + 16, xa, sizeof(*xa));
>
> Is this right? I thought the order of the bytes within each word
> varied with the host endianness as well.
Since we are already working with 2 16 byte vectors xa and xb here, I thought
we don't have to worry about order of bytes within each vector, but instead can
construct the 32 byte vector as above based on host endianness.
>
> > +#endif
> > +}
> > +
> > +static int8_t vsr_get_byte(int8_t *src, int bound, int idx)
> > +{
> > + if (idx >= bound) {
> > + return 0xFF;
> > + }
>
> AFAICT you don't need this check. For both xxperm and xxpermr you're
> already masking the index to 5 bits, so it can't exceed 31.
Was thinking of making it a generic API and hence had that boundary
check but yes, no point for the check in the context of this instruction.
>
> > +#if defined(HOST_WORDS_BIGENDIAN)
> > + return src[idx];
> > +#else
> > + return src[bound - 1 - idx];
> > +#endif
> > +}
> > +
> > +#define VSX_XXPERM(op, indexed) \
> > +void helper_##op(CPUPPCState *env, uint32_t opcode) \
> > +{ \
> > + ppc_vsr_t xt, xa, pcv; \
> > + int i, idx; \
> > + int8_t src[32]; \
> > + \
> > + getVSR(xA(opcode), &xa, env); \
> > + getVSR(xT(opcode), &xt, env); \
> > + getVSR(xB(opcode), &pcv, env); \
> > + \
> > + vsr_copy_256(&xa, &xt, src); \
>
> You have a double copy here AFAICT - first from the actual env
> structure to xt and xa, then to the src array. That seems like it
> would be good to avoid.
>
> It seems like it would nice in any case to avoid even the one copy.
> You'd need a temporary for the output of course and to copy that, but
> you should be able to combine indexed with host endianness to
> translate each index to retrieve directly from the VSR values in env.
I am not sure it would be good to retrieve byte values directly from
env as getVSR nicely abstracts out from which fields
(env->[fpr, vsr, avr] the data is fetched based on the register specified
in the opcode.
I can reduce one copy though by not constructing a 32 byte vector (src)
but instead retrieving the bytes directly from xa and xt based on
the index.
Regards,
Bharata.
next prev parent reply other threads:[~2016-12-06 8:56 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-05 11:25 [Qemu-devel] [PATCH ppc-for-2.9 00/13] POWER9 TCG enablements - part9 Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 01/13] target-ppc: move ppc_vsr_t to common header Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 02/13] target-ppc: add mask_u128 routine Nikunj A Dadhania
2016-12-05 17:36 ` Richard Henderson
2016-12-06 5:19 ` Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 03/13] target-ppc: implement lxvl instruction Nikunj A Dadhania
2016-12-05 17:46 ` Richard Henderson
2016-12-06 5:25 ` Nikunj A Dadhania
2016-12-06 10:11 ` Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 04/13] target-ppc: implement lxvll instruction Nikunj A Dadhania
2016-12-05 17:52 ` Richard Henderson
2016-12-05 17:59 ` Richard Henderson
2016-12-06 5:45 ` Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 05/13] target-ppc: implement stxvl instruction Nikunj A Dadhania
2016-12-05 17:55 ` Richard Henderson
2016-12-06 5:46 ` Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 06/13] target-ppc: implement stxvll instructions Nikunj A Dadhania
2016-12-05 17:57 ` Richard Henderson
2016-12-06 6:03 ` Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 07/13] target-ppc: implement xxextractuw instruction Nikunj A Dadhania
2016-12-05 18:10 ` Richard Henderson
2016-12-05 11:25 ` [Qemu-devel] [PATCH 08/13] target-ppc: implement xxinsertw instruction Nikunj A Dadhania
2016-12-05 18:14 ` Richard Henderson
2016-12-05 11:25 ` [Qemu-devel] [PATCH 09/13] target-ppc: implement stop instruction Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 10/13] target-ppc: implement xsabsqp/xsnabsqp instruction Nikunj A Dadhania
2016-12-05 18:14 ` Richard Henderson
2016-12-05 11:25 ` [Qemu-devel] [PATCH 11/13] target-ppc: implement xsnegqp instruction Nikunj A Dadhania
2016-12-05 18:15 ` Richard Henderson
2016-12-06 8:45 ` Nikunj A Dadhania
2016-12-05 11:25 ` [Qemu-devel] [PATCH 12/13] target-ppc: implement xscpsgnqp instruction Nikunj A Dadhania
2016-12-05 18:18 ` Richard Henderson
2016-12-05 11:25 ` [Qemu-devel] [PATCH 13/13] target-ppc: Add xxperm and xxpermr instructions Nikunj A Dadhania
2016-12-06 4:11 ` David Gibson
2016-12-06 8:55 ` Bharata B Rao [this message]
2016-12-06 4:12 ` [Qemu-devel] [PATCH ppc-for-2.9 00/13] POWER9 TCG enablements - part9 David Gibson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161206085541.GD4211@in.ibm.com \
--to=bharata@linux.vnet.ibm.com \
--cc=david@gibson.dropbear.id.au \
--cc=nikunj@linux.vnet.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=rth@twiddle.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.