* [Qemu-devel] [PATCH v5 1/9] target-ppc: Implement mfvsrld instruction
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 2/9] target-ppc: Implement mtvsrdd instruction Nikunj A Dadhania
` (8 subsequent siblings)
9 siblings, 0 replies; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh, Ravi Bangoria
From: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
mfvsrld: Move From VSR Lower Doubleword
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
target-ppc/translate/vsx-impl.inc.c | 17 +++++++++++++++++
target-ppc/translate/vsx-ops.inc.c | 1 +
2 files changed, 18 insertions(+)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index eee6052..b669e8c 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -217,6 +217,23 @@ static void gen_##name(DisasContext *ctx) \
MV_VSRD(mfvsrd, cpu_gpr[rA(ctx->opcode)], cpu_vsrh(xS(ctx->opcode)))
MV_VSRD(mtvsrd, cpu_vsrh(xT(ctx->opcode)), cpu_gpr[rA(ctx->opcode)])
+static void gen_mfvsrld(DisasContext *ctx)
+{
+ if (xS(ctx->opcode) < 32) {
+ if (unlikely(!ctx->vsx_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VSXU);
+ return;
+ }
+ } else {
+ if (unlikely(!ctx->altivec_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VPU);
+ return;
+ }
+ }
+
+ tcg_gen_mov_i64(cpu_gpr[rA(ctx->opcode)], cpu_vsrl(xS(ctx->opcode)));
+}
+
#endif
static void gen_xxpermdi(DisasContext *ctx)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 414b73b..3b296f8 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -22,6 +22,7 @@ GEN_HANDLER_E(mtvsrwz, 0x1F, 0x13, 0x07, 0x0000F800, PPC_NONE, PPC2_VSX207),
#if defined(TARGET_PPC64)
GEN_HANDLER_E(mfvsrd, 0x1F, 0x13, 0x01, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mtvsrd, 0x1F, 0x13, 0x05, 0x0000F800, PPC_NONE, PPC2_VSX207),
+GEN_HANDLER_E(mfvsrld, 0X1F, 0x13, 0x09, 0x0000F800, PPC_NONE, PPC2_ISA300),
#endif
#define GEN_XX1FORM(name, opc2, opc3, fl2) \
--
2.7.4
* [Qemu-devel] [PATCH v5 2/9] target-ppc: Implement mtvsrdd instruction
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 1/9] target-ppc: Implement mfvsrld instruction Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction Nikunj A Dadhania
` (7 subsequent siblings)
9 siblings, 0 replies; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh, Ravi Bangoria
From: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
mtvsrdd: Move To VSR Double Doubleword
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
target-ppc/translate/vsx-impl.inc.c | 23 +++++++++++++++++++++++
target-ppc/translate/vsx-ops.inc.c | 1 +
2 files changed, 24 insertions(+)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index b669e8c..c4c50dd 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -234,6 +234,29 @@ static void gen_mfvsrld(DisasContext *ctx)
tcg_gen_mov_i64(cpu_gpr[rA(ctx->opcode)], cpu_vsrl(xS(ctx->opcode)));
}
+static void gen_mtvsrdd(DisasContext *ctx)
+{
+ if (xT(ctx->opcode) < 32) {
+ if (unlikely(!ctx->vsx_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VSXU);
+ return;
+ }
+ } else {
+ if (unlikely(!ctx->altivec_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VPU);
+ return;
+ }
+ }
+
+ if (!rA(ctx->opcode)) {
+ tcg_gen_movi_i64(cpu_vsrh(xT(ctx->opcode)), 0);
+ } else {
+ tcg_gen_mov_i64(cpu_vsrh(xT(ctx->opcode)), cpu_gpr[rA(ctx->opcode)]);
+ }
+
+ tcg_gen_mov_i64(cpu_vsrl(xT(ctx->opcode)), cpu_gpr[rB(ctx->opcode)]);
+}
+
#endif
static void gen_xxpermdi(DisasContext *ctx)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 3b296f8..1287973 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -23,6 +23,7 @@ GEN_HANDLER_E(mtvsrwz, 0x1F, 0x13, 0x07, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mfvsrd, 0x1F, 0x13, 0x01, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mtvsrd, 0x1F, 0x13, 0x05, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mfvsrld, 0X1F, 0x13, 0x09, 0x0000F800, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(mtvsrdd, 0X1F, 0x13, 0x0D, 0x0, PPC_NONE, PPC2_ISA300),
#endif
#define GEN_XX1FORM(name, opc2, opc3, fl2) \
--
2.7.4
* [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 1/9] target-ppc: Implement mfvsrld instruction Nikunj A Dadhania
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 2/9] target-ppc: Implement mtvsrdd instruction Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 20:21 ` Richard Henderson
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 4/9] target-ppc: improve lxvw4x implementation Nikunj A Dadhania
` (6 subsequent siblings)
9 siblings, 1 reply; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh, Ravi Bangoria
From: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
mtvsrws: Move To VSR Word & Splat
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
target-ppc/translate/vsx-impl.inc.c | 23 +++++++++++++++++++++++
target-ppc/translate/vsx-ops.inc.c | 1 +
2 files changed, 24 insertions(+)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index c4c50dd..fa8240f 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -257,6 +257,29 @@ static void gen_mtvsrdd(DisasContext *ctx)
tcg_gen_mov_i64(cpu_vsrl(xT(ctx->opcode)), cpu_gpr[rB(ctx->opcode)]);
}
+static void gen_mtvsrws(DisasContext *ctx)
+{
+ TCGv_i64 t0 = tcg_temp_new_i64();
+
+ if (xT(ctx->opcode) < 32) {
+ if (unlikely(!ctx->vsx_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VSXU);
+ return;
+ }
+ } else {
+ if (unlikely(!ctx->altivec_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VPU);
+ return;
+ }
+ }
+
+ tcg_gen_mov_i64(t0, cpu_gpr[rA(ctx->opcode)]);
+ tcg_gen_deposit_i64(cpu_vsrl(xT(ctx->opcode)), t0, t0, 32, 32);
+ tcg_gen_mov_i64(cpu_vsrh(xT(ctx->opcode)), cpu_vsrl(xT(ctx->opcode)));
+
+ tcg_temp_free_i64(t0);
+}
+
#endif
static void gen_xxpermdi(DisasContext *ctx)
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 1287973..d5f5b87 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -24,6 +24,7 @@ GEN_HANDLER_E(mfvsrd, 0x1F, 0x13, 0x01, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mtvsrd, 0x1F, 0x13, 0x05, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mfvsrld, 0X1F, 0x13, 0x09, 0x0000F800, PPC_NONE, PPC2_ISA300),
GEN_HANDLER_E(mtvsrdd, 0X1F, 0x13, 0x0D, 0x0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(mtvsrws, 0x1F, 0x13, 0x0C, 0x0000F800, PPC_NONE, PPC2_ISA300),
#endif
#define GEN_XX1FORM(name, opc2, opc3, fl2) \
--
2.7.4
* Re: [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction Nikunj A Dadhania
@ 2016-09-28 20:21 ` Richard Henderson
2016-09-29 1:53 ` David Gibson
2016-09-29 2:19 ` Nikunj A Dadhania
0 siblings, 2 replies; 18+ messages in thread
From: Richard Henderson @ 2016-09-28 20:21 UTC (permalink / raw)
To: Nikunj A Dadhania, qemu-ppc, david; +Cc: qemu-devel, benh, Ravi Bangoria
On 09/28/2016 11:41 AM, Nikunj A Dadhania wrote:
> + tcg_gen_mov_i64(t0, cpu_gpr[rA(ctx->opcode)]);
> + tcg_gen_deposit_i64(cpu_vsrl(xT(ctx->opcode)), t0, t0, 32, 32);
Why are you using t0?
r~
* Re: [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction
2016-09-28 20:21 ` Richard Henderson
@ 2016-09-29 1:53 ` David Gibson
2016-09-29 4:07 ` Richard Henderson
2016-09-29 2:19 ` Nikunj A Dadhania
1 sibling, 1 reply; 18+ messages in thread
From: David Gibson @ 2016-09-29 1:53 UTC (permalink / raw)
To: Richard Henderson
Cc: Nikunj A Dadhania, qemu-ppc, qemu-devel, benh, Ravi Bangoria
On Wed, Sep 28, 2016 at 01:21:00PM -0700, Richard Henderson wrote:
> On 09/28/2016 11:41 AM, Nikunj A Dadhania wrote:
> > + tcg_gen_mov_i64(t0, cpu_gpr[rA(ctx->opcode)]);
> > + tcg_gen_deposit_i64(cpu_vsrl(xT(ctx->opcode)), t0, t0, 32, 32);
>
> Why are you using t0?
Richard, I don't quite understand your question. This looks correct
to me. It's duplicating the low 32 bits of rA into both the low and
high 32 bits of a doubleword, which will then be stored to both the low
and high 64-bit elements of the VSR. That matches the instruction
definition, which puts the low 32 bits of RA into every 32-bit element
of the vector.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
* Re: [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction
2016-09-29 1:53 ` David Gibson
@ 2016-09-29 4:07 ` Richard Henderson
0 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 2016-09-29 4:07 UTC (permalink / raw)
To: David Gibson; +Cc: Nikunj A Dadhania, qemu-ppc, qemu-devel, benh, Ravi Bangoria
On 09/28/2016 06:53 PM, David Gibson wrote:
> On Wed, Sep 28, 2016 at 01:21:00PM -0700, Richard Henderson wrote:
>> On 09/28/2016 11:41 AM, Nikunj A Dadhania wrote:
>>> + tcg_gen_mov_i64(t0, cpu_gpr[rA(ctx->opcode)]);
>>> + tcg_gen_deposit_i64(cpu_vsrl(xT(ctx->opcode)), t0, t0, 32, 32);
>>
>> Why are you using t0?
>
> Richard, I don't quite understand your question.
There's no need for the copy into t0 -- just put rA into those two arguments.
r~
* Re: [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction
2016-09-28 20:21 ` Richard Henderson
2016-09-29 1:53 ` David Gibson
@ 2016-09-29 2:19 ` Nikunj A Dadhania
2016-09-29 4:08 ` Richard Henderson
1 sibling, 1 reply; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-29 2:19 UTC (permalink / raw)
To: Richard Henderson, qemu-ppc, david; +Cc: qemu-devel, benh, Ravi Bangoria
Richard Henderson <rth@twiddle.net> writes:
> On 09/28/2016 11:41 AM, Nikunj A Dadhania wrote:
>> + tcg_gen_mov_i64(t0, cpu_gpr[rA(ctx->opcode)]);
>> + tcg_gen_deposit_i64(cpu_vsrl(xT(ctx->opcode)), t0, t0, 32, 32);
>
> Why are you using t0?
Thought about dropping it, but wasn't sure if deposit_i64 would change it.
Regards,
Nikunj
* Re: [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction
2016-09-29 2:19 ` Nikunj A Dadhania
@ 2016-09-29 4:08 ` Richard Henderson
0 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 2016-09-29 4:08 UTC (permalink / raw)
To: Nikunj A Dadhania, qemu-ppc, david; +Cc: qemu-devel, benh, Ravi Bangoria
On 09/28/2016 07:19 PM, Nikunj A Dadhania wrote:
> Richard Henderson <rth@twiddle.net> writes:
>
>> On 09/28/2016 11:41 AM, Nikunj A Dadhania wrote:
>>> + tcg_gen_mov_i64(t0, cpu_gpr[rA(ctx->opcode)]);
>>> + tcg_gen_deposit_i64(cpu_vsrl(xT(ctx->opcode)), t0, t0, 32, 32);
>>
>> Why are you using t0?
>
> Thought about dropping it, but wasn't sure if deposit_i64 would change it.
Nope, all of the tcg-op.c functions are safe that way, only modifying the outputs.
r~
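To make the point about t0 concrete, here is a minimal plain-C sketch of the deposit step as it would look with rA passed directly as both inputs. It is only a model for illustration, not QEMU code: model_deposit64, struct model_vsr and model_mtvsrws are hypothetical names, and the deposit semantics are an assumption based on how the patch uses tcg_gen_deposit_i64 (the result is the first input with a bit field replaced by the low bits of the second input).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of tcg_gen_deposit_i64(ret, arg1, arg2, ofs, len):
 * returns arg1 with the len-bit field at bit offset ofs replaced by the
 * low len bits of arg2. Only the return value is written. */
static uint64_t model_deposit64(uint64_t arg1, uint64_t arg2,
                                unsigned ofs, unsigned len)
{
    assert(len > 0 && len <= 64 && ofs <= 64 - len);
    uint64_t mask = (len == 64) ? UINT64_MAX : ((UINT64_C(1) << len) - 1);
    return (arg1 & ~(mask << ofs)) | ((arg2 & mask) << ofs);
}

/* Hypothetical model of a VSR as two 64-bit halves. */
struct model_vsr {
    uint64_t hi;
    uint64_t lo;
};

/* mtvsrws modelled without a temporary: rA feeds both deposit inputs,
 * and the low doubleword is then copied to the high doubleword. */
static struct model_vsr model_mtvsrws(uint64_t ra)
{
    struct model_vsr vt;
    vt.lo = model_deposit64(ra, ra, 32, 32);
    vt.hi = vt.lo;
    return vt;
}
```

With ra = 0xDEADBEEF12345678, both halves of the modelled VSR come out as 0x1234567812345678, i.e. the low word of rA in every 32-bit element.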
* [Qemu-devel] [PATCH v5 4/9] target-ppc: improve lxvw4x implementation
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
` (2 preceding siblings ...)
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 3/9] target-ppc: Implement mtvsrws instruction Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 5/9] target-ppc: improve stxvw4x implementation Nikunj A Dadhania
` (5 subsequent siblings)
9 siblings, 0 replies; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh
Load 8 bytes at a time and manipulate them.
Big-Endian Storage
+-------------+-------------+-------------+-------------+
| 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
+-------------+-------------+-------------+-------------+
Little-Endian Storage
+-------------+-------------+-------------+-------------+
| 33 22 11 00 | 77 66 55 44 | BB AA 99 88 | FF EE DD CC |
+-------------+-------------+-------------+-------------+
Vector load results in:
+-------------+-------------+-------------+-------------+
| 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
+-------------+-------------+-------------+-------------+
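The diagrams above can be checked with a small plain-C model. This is only a sketch under stated assumptions, not the QEMU implementation: the model_* names are hypothetical, memory is a 16-byte array, and the MO_LEQ / MO_BEQ loads are approximated by assembling the bytes by hand. In little-endian mode each 64-bit load has its two 32-bit words swapped, which is what the shri + deposit pair in the patch amounts to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble 8 bytes as a little-endian value (stand-in for an MO_LEQ load). */
static uint64_t model_ld64_le(const uint8_t *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Assemble 8 bytes as a big-endian value (stand-in for an MO_BEQ load). */
static uint64_t model_ld64_be(const uint8_t *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Swap the two 32-bit words of a doubleword: t1 = t0 >> 32, then the low
 * word of t0 is deposited into bits 32..63 of t1. */
static uint64_t model_swap_words(uint64_t t0)
{
    uint64_t t1 = t0 >> 32;
    return t1 | (t0 << 32);
}

/* Hypothetical model of lxvw4x for both endian modes. */
static void model_lxvw4x(const uint8_t *mem, int le_mode,
                         uint64_t *xth, uint64_t *xtl)
{
    assert(mem && xth && xtl);
    if (le_mode) {
        *xth = model_swap_words(model_ld64_le(mem));
        *xtl = model_swap_words(model_ld64_le(mem + 8));
    } else {
        *xth = model_ld64_be(mem);
        *xtl = model_ld64_be(mem + 8);
    }
}
```

Feeding the big-endian byte sequence with le_mode = 0, or the little-endian byte sequence with le_mode = 1, gives the same register contents in this model: 0x0011223344556677 in the high half and 0x8899AABBCCDDEEFF in the low half.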
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
target-ppc/translate/vsx-impl.inc.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index fa8240f..3bc3f6f 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -75,7 +75,6 @@ static void gen_lxvdsx(DisasContext *ctx)
static void gen_lxvw4x(DisasContext *ctx)
{
TCGv EA;
- TCGv_i64 tmp;
TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
if (unlikely(!ctx->vsx_enabled)) {
@@ -84,22 +83,27 @@ static void gen_lxvw4x(DisasContext *ctx)
}
gen_set_access_type(ctx, ACCESS_INT);
EA = tcg_temp_new();
- tmp = tcg_temp_new_i64();
gen_addr_reg_index(ctx, EA);
- gen_qemu_ld32u_i64(ctx, tmp, EA);
- tcg_gen_addi_tl(EA, EA, 4);
- gen_qemu_ld32u_i64(ctx, xth, EA);
- tcg_gen_deposit_i64(xth, xth, tmp, 32, 32);
-
- tcg_gen_addi_tl(EA, EA, 4);
- gen_qemu_ld32u_i64(ctx, tmp, EA);
- tcg_gen_addi_tl(EA, EA, 4);
- gen_qemu_ld32u_i64(ctx, xtl, EA);
- tcg_gen_deposit_i64(xtl, xtl, tmp, 32, 32);
-
+ if (ctx->le_mode) {
+ TCGv_i64 t0 = tcg_temp_new_i64();
+ TCGv_i64 t1 = tcg_temp_new_i64();
+
+ tcg_gen_qemu_ld_i64(t0, EA, ctx->mem_idx, MO_LEQ);
+ tcg_gen_shri_i64(t1, t0, 32);
+ tcg_gen_deposit_i64(xth, t1, t0, 32, 32);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_qemu_ld_i64(t0, EA, ctx->mem_idx, MO_LEQ);
+ tcg_gen_shri_i64(t1, t0, 32);
+ tcg_gen_deposit_i64(xtl, t1, t0, 32, 32);
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+ } else {
+ tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ);
+ }
tcg_temp_free(EA);
- tcg_temp_free_i64(tmp);
}
#define VSX_STORE_SCALAR(name, operation) \
--
2.7.4
* [Qemu-devel] [PATCH v5 5/9] target-ppc: improve stxvw4x implementation
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
` (3 preceding siblings ...)
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 4/9] target-ppc: improve lxvw4x implementation Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 6/9] target-ppc: add lxvh8x instruction Nikunj A Dadhania
` (4 subsequent siblings)
9 siblings, 0 replies; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh
Manipulate the data and store 8 bytes at a time instead of 4 bytes.
Vector:
+-------------+-------------+-------------+-------------+
| 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
+-------------+-------------+-------------+-------------+
The store results in the following:
Big-Endian Storage
+-------------+-------------+-------------+-------------+
| 00 11 22 33 | 44 55 66 77 | 88 99 AA BB | CC DD EE FF |
+-------------+-------------+-------------+-------------+
Little-Endian Storage
+-------------+-------------+-------------+-------------+
| 33 22 11 00 | 77 66 55 44 | BB AA 99 88 | FF EE DD CC |
+-------------+-------------+-------------+-------------+
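As a cross-check of the store diagrams, the following plain-C sketch models the store side. The model_* names are hypothetical, memory is a 16-byte array, and the MO_LEQ / MO_BEQ stores are approximated by writing bytes out by hand, so this is an illustration of the layout rather than the actual TCG sequence. In little-endian mode the two 32-bit words of each doubleword are swapped before an 8-byte little-endian store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v as 8 little-endian bytes (stand-in for an MO_LEQ store). */
static void model_st64_le(uint8_t *p, uint64_t v)
{
    for (size_t i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Write v as 8 big-endian bytes (stand-in for an MO_BEQ store). */
static void model_st64_be(uint8_t *p, uint64_t v)
{
    for (size_t i = 0; i < 8; i++) {
        p[i] = (uint8_t)(v >> (8 * (7 - i)));
    }
}

/* Swap the two 32-bit words: t0 = x >> 32, then the low word of x is
 * deposited into bits 32..63 of t0. */
static uint64_t model_swap_words(uint64_t x)
{
    uint64_t t0 = x >> 32;
    return t0 | (x << 32);
}

/* Hypothetical model of stxvw4x for both endian modes. */
static void model_stxvw4x(uint8_t *mem, int le_mode,
                          uint64_t xsh, uint64_t xsl)
{
    assert(mem);
    if (le_mode) {
        model_st64_le(mem, model_swap_words(xsh));
        model_st64_le(mem + 8, model_swap_words(xsl));
    } else {
        model_st64_be(mem, xsh);
        model_st64_be(mem + 8, xsl);
    }
}
```

Storing the vector 0x0011223344556677 : 0x8899AABBCCDDEEFF through this model reproduces the two byte layouts drawn above, one per endian mode.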
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
target-ppc/translate/vsx-impl.inc.c | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 3bc3f6f..dbe483f 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -146,7 +146,8 @@ static void gen_stxvd2x(DisasContext *ctx)
static void gen_stxvw4x(DisasContext *ctx)
{
- TCGv_i64 tmp;
+ TCGv_i64 xsh = cpu_vsrh(xS(ctx->opcode));
+ TCGv_i64 xsl = cpu_vsrl(xS(ctx->opcode));
TCGv EA;
if (unlikely(!ctx->vsx_enabled)) {
gen_exception(ctx, POWERPC_EXCP_VSXU);
@@ -155,21 +156,25 @@ static void gen_stxvw4x(DisasContext *ctx)
gen_set_access_type(ctx, ACCESS_INT);
EA = tcg_temp_new();
gen_addr_reg_index(ctx, EA);
- tmp = tcg_temp_new_i64();
-
- tcg_gen_shri_i64(tmp, cpu_vsrh(xS(ctx->opcode)), 32);
- gen_qemu_st32_i64(ctx, tmp, EA);
- tcg_gen_addi_tl(EA, EA, 4);
- gen_qemu_st32_i64(ctx, cpu_vsrh(xS(ctx->opcode)), EA);
-
- tcg_gen_shri_i64(tmp, cpu_vsrl(xS(ctx->opcode)), 32);
- tcg_gen_addi_tl(EA, EA, 4);
- gen_qemu_st32_i64(ctx, tmp, EA);
- tcg_gen_addi_tl(EA, EA, 4);
- gen_qemu_st32_i64(ctx, cpu_vsrl(xS(ctx->opcode)), EA);
+ if (ctx->le_mode) {
+ TCGv_i64 t0 = tcg_temp_new_i64();
+ TCGv_i64 t1 = tcg_temp_new_i64();
+ tcg_gen_shri_i64(t0, xsh, 32);
+ tcg_gen_deposit_i64(t1, t0, xsh, 32, 32);
+ tcg_gen_qemu_st_i64(t1, EA, ctx->mem_idx, MO_LEQ);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_shri_i64(t0, xsl, 32);
+ tcg_gen_deposit_i64(t1, t0, xsl, 32, 32);
+ tcg_gen_qemu_st_i64(t1, EA, ctx->mem_idx, MO_LEQ);
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+ } else {
+ tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEQ);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEQ);
+ }
tcg_temp_free(EA);
- tcg_temp_free_i64(tmp);
}
#define MV_VSRW(name, tcgop1, tcgop2, target, source) \
--
2.7.4
* [Qemu-devel] [PATCH v5 6/9] target-ppc: add lxvh8x instruction
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
` (4 preceding siblings ...)
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 5/9] target-ppc: improve stxvw4x implementation Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 20:22 ` Richard Henderson
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 7/9] target-ppc: add stxvh8x instruction Nikunj A Dadhania
` (3 subsequent siblings)
9 siblings, 1 reply; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh
lxvh8x: Load VSX Vector Halfword*8
Big-Endian Storage
+-------+-------+-------+-------+-------+-------+-------+-------+
| 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
+-------+-------+-------+-------+-------+-------+-------+-------+
Little-Endian Storage
+-------+-------+-------+-------+-------+-------+-------+-------+
| 01 00 | 11 10 | 21 20 | 31 30 | 41 40 | 51 50 | 61 60 | 71 70 |
+-------+-------+-------+-------+-------+-------+-------+-------+
Vector load results in:
+-------+-------+-------+-------+-------+-------+-------+-------+
| 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
+-------+-------+-------+-------+-------+-------+-------+-------+
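A plain-C sketch of the same idea, for readers who want to check the diagrams by hand. It assumes the approach used in this patch (two big-endian 8-byte loads, then a byte swap inside every halfword when in little-endian mode). The model_* names are hypothetical and the MO_BEQ load is approximated by assembling bytes manually, so treat it as an illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble 8 bytes as a big-endian value (stand-in for an MO_BEQ load). */
static uint64_t model_ld64_be(const uint8_t *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Swap the two bytes inside each of the four halfwords of a doubleword:
 * ((in & mask) << 8) | ((in >> 8) & mask). */
static uint64_t model_bswap16x4(uint64_t in)
{
    const uint64_t mask = UINT64_C(0x00FF00FF00FF00FF);
    return ((in & mask) << 8) | ((in >> 8) & mask);
}

/* Hypothetical model of lxvh8x for both endian modes. */
static void model_lxvh8x(const uint8_t *mem, int le_mode,
                         uint64_t *xth, uint64_t *xtl)
{
    assert(mem && xth && xtl);
    *xth = model_ld64_be(mem);
    *xtl = model_ld64_be(mem + 8);
    if (le_mode) {
        *xth = model_bswap16x4(*xth);
        *xtl = model_bswap16x4(*xtl);
    }
}
```

Either storage layout from the diagrams, loaded with the matching le_mode, yields 0x0001101120213031 in the high half and 0x4041505160617071 in the low half of the modelled register.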
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
target-ppc/translate/vsx-impl.inc.c | 49 +++++++++++++++++++++++++++++++++++++
target-ppc/translate/vsx-ops.inc.c | 1 +
2 files changed, 50 insertions(+)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index dbe483f..25b5ce4 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -106,6 +106,55 @@ static void gen_lxvw4x(DisasContext *ctx)
tcg_temp_free(EA);
}
+static void gen_bswap16x8(TCGv_i64 outh, TCGv_i64 outl,
+ TCGv_i64 inh, TCGv_i64 inl)
+{
+ TCGv_i64 mask = tcg_const_i64(0x00FF00FF00FF00FF);
+ TCGv_i64 t0 = tcg_temp_new_i64();
+ TCGv_i64 t1 = tcg_temp_new_i64();
+
+ /* outh = ((inh & mask) << 8) | ((inh >> 8) & mask) */
+ tcg_gen_and_i64(t0, inh, mask);
+ tcg_gen_shli_i64(t0, t0, 8);
+ tcg_gen_shri_i64(t1, inh, 8);
+ tcg_gen_and_i64(t1, t1, mask);
+ tcg_gen_or_i64(outh, t0, t1);
+
+ /* outl = ((inl & mask) << 8) | ((inl >> 8) & mask) */
+ tcg_gen_and_i64(t0, inl, mask);
+ tcg_gen_shli_i64(t0, t0, 8);
+ tcg_gen_shri_i64(t1, inl, 8);
+ tcg_gen_and_i64(t1, t1, mask);
+ tcg_gen_or_i64(outl, t0, t1);
+
+ tcg_temp_free_i64(t0);
+ tcg_temp_free_i64(t1);
+ tcg_temp_free_i64(mask);
+}
+
+static void gen_lxvh8x(DisasContext *ctx)
+{
+ TCGv EA;
+ TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+ TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+
+ if (unlikely(!ctx->vsx_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VSXU);
+ return;
+ }
+ gen_set_access_type(ctx, ACCESS_INT);
+
+ EA = tcg_temp_new();
+ gen_addr_reg_index(ctx, EA);
+ tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ);
+ if (ctx->le_mode) {
+ gen_bswap16x8(xth, xtl, xth, xtl);
+ }
+ tcg_temp_free(EA);
+}
+
#define VSX_STORE_SCALAR(name, operation) \
static void gen_##name(DisasContext *ctx) \
{ \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index d5f5b87..c52e6ff 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -7,6 +7,7 @@ GEN_HANDLER_E(lxsspx, 0x1F, 0x0C, 0x10, 0, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(lxvh8x, 0x1F, 0x0C, 0x19, 0, PPC_NONE, PPC2_ISA300),
GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300),
--
2.7.4
* Re: [Qemu-devel] [PATCH v5 6/9] target-ppc: add lxvh8x instruction
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 6/9] target-ppc: add lxvh8x instruction Nikunj A Dadhania
@ 2016-09-28 20:22 ` Richard Henderson
0 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 2016-09-28 20:22 UTC (permalink / raw)
To: Nikunj A Dadhania, qemu-ppc, david; +Cc: qemu-devel, benh
On 09/28/2016 11:41 AM, Nikunj A Dadhania wrote:
> lxvh8x: Load VSX Vector Halfword*8
>
> Big-Endian Storage
> +-------+-------+-------+-------+-------+-------+-------+-------+
> | 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
> +-------+-------+-------+-------+-------+-------+-------+-------+
>
> Little-Endian Storage
> +-------+-------+-------+-------+-------+-------+-------+-------+
> | 01 00 | 11 10 | 21 20 | 31 30 | 41 40 | 51 50 | 61 60 | 71 70 |
> +-------+-------+-------+-------+-------+-------+-------+-------+
>
> Vector load results in:
> +-------+-------+-------+-------+-------+-------+-------+-------+
> | 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
> +-------+-------+-------+-------+-------+-------+-------+-------+
>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
> target-ppc/translate/vsx-impl.inc.c | 49 +++++++++++++++++++++++++++++++++++++
> target-ppc/translate/vsx-ops.inc.c | 1 +
> 2 files changed, 50 insertions(+)
Reviewed-by: Richard Henderson <rth@twiddle.net>
r~
* [Qemu-devel] [PATCH v5 7/9] target-ppc: add stxvh8x instruction
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
` (5 preceding siblings ...)
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 6/9] target-ppc: add lxvh8x instruction Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 20:23 ` Richard Henderson
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 8/9] target-ppc: add lxvb16x instruction Nikunj A Dadhania
` (2 subsequent siblings)
9 siblings, 1 reply; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh
stxvh8x: Store VSX Vector Halfword*8
Vector:
+-------+-------+-------+-------+-------+-------+-------+-------+
| 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
+-------+-------+-------+-------+-------+-------+-------+-------+
The store results in the following:
Big-Endian Storage
+-------+-------+-------+-------+-------+-------+-------+-------+
| 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
+-------+-------+-------+-------+-------+-------+-------+-------+
Little-Endian Storage
+-------+-------+-------+-------+-------+-------+-------+-------+
| 01 00 | 11 10 | 21 20 | 31 30 | 41 40 | 51 50 | 61 60 | 71 70 |
+-------+-------+-------+-------+-------+-------+-------+-------+
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
---
target-ppc/translate/vsx-impl.inc.c | 31 +++++++++++++++++++++++++++++++
target-ppc/translate/vsx-ops.inc.c | 1 +
2 files changed, 32 insertions(+)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 25b5ce4..e762c0a 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -226,6 +226,37 @@ static void gen_stxvw4x(DisasContext *ctx)
tcg_temp_free(EA);
}
+static void gen_stxvh8x(DisasContext *ctx)
+{
+ TCGv_i64 xsh = cpu_vsrh(xS(ctx->opcode));
+ TCGv_i64 xsl = cpu_vsrl(xS(ctx->opcode));
+ TCGv EA;
+
+ if (unlikely(!ctx->vsx_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VSXU);
+ return;
+ }
+ gen_set_access_type(ctx, ACCESS_INT);
+ EA = tcg_temp_new();
+ gen_addr_reg_index(ctx, EA);
+ if (ctx->le_mode) {
+ TCGv_i64 outh = tcg_temp_new_i64();
+ TCGv_i64 outl = tcg_temp_new_i64();
+
+ gen_bswap16x8(outh, outl, xsh, xsl);
+ tcg_gen_qemu_st_i64(outh, EA, ctx->mem_idx, MO_BEQ);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_qemu_st_i64(outl, EA, ctx->mem_idx, MO_BEQ);
+ tcg_temp_free_i64(outh);
+ tcg_temp_free_i64(outl);
+ } else {
+ tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEQ);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEQ);
+ }
+ tcg_temp_free(EA);
+}
+
#define MV_VSRW(name, tcgop1, tcgop2, target, source) \
static void gen_##name(DisasContext *ctx) \
{ \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index c52e6ff..17975ec 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -16,6 +16,7 @@ GEN_HANDLER_E(stxsiwx, 0x1F, 0xC, 0x04, 0, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
+GEN_HANDLER_E(stxvh8x, 0x1F, 0x0C, 0x1D, 0, PPC_NONE, PPC2_ISA300),
GEN_HANDLER_E(mfvsrwz, 0x1F, 0x13, 0x03, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mtvsrwa, 0x1F, 0x13, 0x06, 0x0000F800, PPC_NONE, PPC2_VSX207),
--
2.7.4
* Re: [Qemu-devel] [PATCH v5 7/9] target-ppc: add stxvh8x instruction
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 7/9] target-ppc: add stxvh8x instruction Nikunj A Dadhania
@ 2016-09-28 20:23 ` Richard Henderson
0 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 2016-09-28 20:23 UTC (permalink / raw)
To: Nikunj A Dadhania, qemu-ppc, david; +Cc: qemu-devel, benh
On 09/28/2016 11:41 AM, Nikunj A Dadhania wrote:
> stxvh8x: Store VSX Vector Halfword*8
>
> Vector:
> +-------+-------+-------+-------+-------+-------+-------+-------+
> | 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
> +-------+-------+-------+-------+-------+-------+-------+-------+
>
> The store results in the following:
>
> Big-Endian Storage
> +-------+-------+-------+-------+-------+-------+-------+-------+
> | 00 01 | 10 11 | 20 21 | 30 31 | 40 41 | 50 51 | 60 61 | 70 71 |
> +-------+-------+-------+-------+-------+-------+-------+-------+
>
> Little-Endian Storage
> +-------+-------+-------+-------+-------+-------+-------+-------+
> | 01 00 | 11 10 | 21 20 | 31 30 | 41 40 | 51 50 | 61 60 | 71 70 |
> +-------+-------+-------+-------+-------+-------+-------+-------+
>
> Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
> ---
> target-ppc/translate/vsx-impl.inc.c | 31 +++++++++++++++++++++++++++++++
> target-ppc/translate/vsx-ops.inc.c | 1 +
> 2 files changed, 32 insertions(+)
Reviewed-by: Richard Henderson <rth@twiddle.net>
r~
* [Qemu-devel] [PATCH v5 8/9] target-ppc: add lxvb16x instruction
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
` (6 preceding siblings ...)
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 7/9] target-ppc: add stxvh8x instruction Nikunj A Dadhania
@ 2016-09-28 18:41 ` Nikunj A Dadhania
2016-09-28 18:42 ` [Qemu-devel] [PATCH v5 9/9] target-ppc: add stxvb16x instruction Nikunj A Dadhania
2016-09-29 1:51 ` [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 David Gibson
9 siblings, 0 replies; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:41 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh
lxvb16x: Load VSX Vector Byte*16
Little/Big-endian Storage
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
Vector load results in:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
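Since byte elements have no internal byte order, the load is the same in both endian modes. A minimal plain-C sketch of that, assuming the two big-endian 8-byte loads used in this patch: the model_* names are hypothetical, and byte element i is counted from the most significant byte of the high doubleword, matching the left-to-right order of the diagrams.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble 8 bytes as a big-endian value (stand-in for an MO_BEQ load). */
static uint64_t model_ld64_be(const uint8_t *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Hypothetical model of lxvb16x: the same two loads in either endian mode. */
static void model_lxvb16x(const uint8_t *mem, uint64_t *xth, uint64_t *xtl)
{
    assert(mem && xth && xtl);
    *xth = model_ld64_be(mem);
    *xtl = model_ld64_be(mem + 8);
}

/* Byte element i of the vector, counted from the most significant byte
 * of the high doubleword. */
static uint8_t model_vsr_byte(uint64_t xth, uint64_t xtl, unsigned i)
{
    assert(i < 16);
    uint64_t half = (i < 8) ? xth : xtl;
    return (uint8_t)(half >> (8 * (7 - (i % 8))));
}
```

In this model, byte element i of the loaded vector always equals the byte at offset i from the effective address, which is the property the diagrams show.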
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
target-ppc/translate/vsx-impl.inc.c | 19 +++++++++++++++++++
target-ppc/translate/vsx-ops.inc.c | 1 +
2 files changed, 20 insertions(+)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index e762c0a..40fba6e 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -155,6 +155,25 @@ static void gen_lxvh8x(DisasContext *ctx)
tcg_temp_free(EA);
}
+static void gen_lxvb16x(DisasContext *ctx)
+{
+ TCGv EA;
+ TCGv_i64 xth = cpu_vsrh(xT(ctx->opcode));
+ TCGv_i64 xtl = cpu_vsrl(xT(ctx->opcode));
+
+ if (unlikely(!ctx->vsx_enabled)) {
+ gen_exception(ctx, POWERPC_EXCP_VSXU);
+ return;
+ }
+ gen_set_access_type(ctx, ACCESS_INT);
+ EA = tcg_temp_new();
+ gen_addr_reg_index(ctx, EA);
+ tcg_gen_qemu_ld_i64(xth, EA, ctx->mem_idx, MO_BEQ);
+ tcg_gen_addi_tl(EA, EA, 8);
+ tcg_gen_qemu_ld_i64(xtl, EA, ctx->mem_idx, MO_BEQ);
+ tcg_temp_free(EA);
+}
+
#define VSX_STORE_SCALAR(name, operation) \
static void gen_##name(DisasContext *ctx) \
{ \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 17975ec..3274859 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -8,6 +8,7 @@ GEN_HANDLER_E(lxvd2x, 0x1F, 0x0C, 0x1A, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(lxvdsx, 0x1F, 0x0C, 0x0A, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(lxvw4x, 0x1F, 0x0C, 0x18, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(lxvh8x, 0x1F, 0x0C, 0x19, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(lxvb16x, 0x1F, 0x0C, 0x1B, 0, PPC_NONE, PPC2_ISA300),
GEN_HANDLER_E(stxsdx, 0x1F, 0xC, 0x16, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(stxsibx, 0x1F, 0xD, 0x1C, 0, PPC_NONE, PPC2_ISA300),
--
2.7.4
* [Qemu-devel] [PATCH v5 9/9] target-ppc: add stxvb16x instruction
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
` (7 preceding siblings ...)
2016-09-28 18:41 ` [Qemu-devel] [PATCH v5 8/9] target-ppc: add lxvb16x instruction Nikunj A Dadhania
@ 2016-09-28 18:42 ` Nikunj A Dadhania
2016-09-29 1:51 ` [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 David Gibson
9 siblings, 0 replies; 18+ messages in thread
From: Nikunj A Dadhania @ 2016-09-28 18:42 UTC (permalink / raw)
To: qemu-ppc, david, rth; +Cc: qemu-devel, nikunj, benh
stxvb16x: Store VSX Vector Byte*16
Vector:
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
Store results in following:
Little/Big-endian Storage
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|F0|F1|F2|F3|F4|F5|F6|F7|E0|E1|E2|E3|E4|E5|E6|E7|
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
---
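Note, not part of the commit: a minimal host-side sketch of the store
direction, assuming the instruction can be modelled as two big-endian
doubleword stores (matching the two MO_BEQ stores in the hunk below).
The names store_be64 and model_stxvb16x are hypothetical and exist only
for this illustration; they are not QEMU APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 64-bit value as 8 big-endian bytes, independent of host byte order. */
static void store_be64(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xFF);
        v >>= 8;
    }
}

/*
 * Model of stxvb16x: the most significant byte of the high doubleword
 * goes to the lowest address, the least significant byte of the low
 * doubleword to the highest, whatever the guest endianness is.
 */
static void model_stxvb16x(uint8_t *mem, uint64_t xs_hi, uint64_t xs_lo)
{
    assert(mem != NULL);
    store_be64(mem, xs_hi);
    store_be64(mem + 8, xs_lo);
}
```

Storing the vector from the diagram with this model yields the bytes
F0..F7 followed by E0..E7 in ascending address order.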
target-ppc/translate/vsx-impl.inc.c | 19 +++++++++++++++++++
target-ppc/translate/vsx-ops.inc.c | 1 +
2 files changed, 20 insertions(+)
diff --git a/target-ppc/translate/vsx-impl.inc.c b/target-ppc/translate/vsx-impl.inc.c
index 40fba6e..01f2157 100644
--- a/target-ppc/translate/vsx-impl.inc.c
+++ b/target-ppc/translate/vsx-impl.inc.c
@@ -276,6 +276,25 @@ static void gen_stxvh8x(DisasContext *ctx)
     tcg_temp_free(EA);
 }
+static void gen_stxvb16x(DisasContext *ctx)
+{
+    TCGv_i64 xsh = cpu_vsrh(xS(ctx->opcode));
+    TCGv_i64 xsl = cpu_vsrl(xS(ctx->opcode));
+    TCGv EA;
+
+    if (unlikely(!ctx->vsx_enabled)) {
+        gen_exception(ctx, POWERPC_EXCP_VSXU);
+        return;
+    }
+    gen_set_access_type(ctx, ACCESS_INT);
+    EA = tcg_temp_new();
+    gen_addr_reg_index(ctx, EA);
+    tcg_gen_qemu_st_i64(xsh, EA, ctx->mem_idx, MO_BEQ);
+    tcg_gen_addi_tl(EA, EA, 8);
+    tcg_gen_qemu_st_i64(xsl, EA, ctx->mem_idx, MO_BEQ);
+    tcg_temp_free(EA);
+}
+
#define MV_VSRW(name, tcgop1, tcgop2, target, source) \
static void gen_##name(DisasContext *ctx) \
{ \
diff --git a/target-ppc/translate/vsx-ops.inc.c b/target-ppc/translate/vsx-ops.inc.c
index 3274859..10eb4b9 100644
--- a/target-ppc/translate/vsx-ops.inc.c
+++ b/target-ppc/translate/vsx-ops.inc.c
@@ -18,6 +18,7 @@ GEN_HANDLER_E(stxsspx, 0x1F, 0xC, 0x14, 0, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(stxvd2x, 0x1F, 0xC, 0x1E, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(stxvw4x, 0x1F, 0xC, 0x1C, 0, PPC_NONE, PPC2_VSX),
GEN_HANDLER_E(stxvh8x, 0x1F, 0x0C, 0x1D, 0, PPC_NONE, PPC2_ISA300),
+GEN_HANDLER_E(stxvb16x, 0x1F, 0x0C, 0x1F, 0, PPC_NONE, PPC2_ISA300),
GEN_HANDLER_E(mfvsrwz, 0x1F, 0x13, 0x03, 0x0000F800, PPC_NONE, PPC2_VSX207),
GEN_HANDLER_E(mtvsrwa, 0x1F, 0x13, 0x06, 0x0000F800, PPC_NONE, PPC2_VSX207),
--
2.7.4
* Re: [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4
2016-09-28 18:41 [Qemu-devel] [PATCH v5 0/9] POWER9 TCG enablements - part4 Nikunj A Dadhania
` (8 preceding siblings ...)
2016-09-28 18:42 ` [Qemu-devel] [PATCH v5 9/9] target-ppc: add stxvb16x instruction Nikunj A Dadhania
@ 2016-09-29 1:51 ` David Gibson
9 siblings, 0 replies; 18+ messages in thread
From: David Gibson @ 2016-09-29 1:51 UTC (permalink / raw)
To: Nikunj A Dadhania; +Cc: qemu-ppc, rth, qemu-devel, benh
On Thu, Sep 29, 2016 at 12:11:51AM +0530, Nikunj A Dadhania wrote:
> This series adds 7 new instructions for POWER9 ISA 3.0. It also uses
> the newer QEMU load/store TCG helpers and optimizes stxvw4x and lxvw4x.
>
> GCC was adding an epilogue for every VSX instruction, which changed the
> behaviour. To test the vector load instructions, mfvsrld/mfvsrd were
> used to move the VSR contents into a register; to test the vector
> stores, mtvsrdd was used. This got rid of the epilogue added by GCC.
>
> Patches:
> 01: mfvsrld: Move From VSR Lower Doubleword
> 02: mtvsrdd: Move To VSR Double Doubleword
> 03: mtvsrws: Move To VSR Word & Splat
> 04: lxvw4x: improve implementation
> 05: stxvw4x: improve implementation
> 06: lxvh8x: Load VSX Vector Halfword*8
> 07: stxvh8x: Store VSX Vector Halfword*8
> 08: lxvb16x: Load VSX Vector Byte*16
> 09: stxvb16x: Store VSX Vector Byte*16
I've applied everything that rth reviewed to ppc-for-2.8.
I've tweaked the ascii art diagrams describing the endianness
transformations. Specifically I removed the within-element spaces for
each element on the vector (not memory) side. That's to emphasise the
fact that in-register there's no endianness, just numbers.
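As an illustrative aside (not taken from the series): a minimal C
sketch of the "just numbers" point, using one halfword element. The
helper names put_be16, put_le16, get_be16, get_le16 and
roundtrip_le16 are hypothetical. The same value 0xF0F1 has two
different byte layouts in memory, but reading either layout back with
the matching byte order gives the same number.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 16-bit value with the most significant byte first. */
static void put_be16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xFF);
}

/* Store a 16-bit value with the least significant byte first. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

/* Read a 16-bit value stored most significant byte first. */
static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 16-bit value stored least significant byte first. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)((p[1] << 8) | p[0]);
}

/* The value survives a trip through little-endian memory unchanged. */
static uint16_t roundtrip_le16(uint16_t v)
{
    uint8_t b[2];

    put_le16(b, v);
    assert(get_le16(b) == v);
    return get_le16(b);
}
```

Byte order is therefore a property of how the element is laid out in
memory, not of the element value held in the register.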
>
> Changelog:
> v4:
> * Added gen_bswap16x8 inline for lxvh8x and stxvh8x in tcg
> * Dropped helper_bswap16x4
> * Use temporaries in stxvh8x so the register is not clobbered
>
> v3:
> * Added 3 new VSR instructions.
> * Fixed all the vector load/store instructions for BE/LE.
> * Added detailed commit messages to patches.
> * Dropped deposit32x2 and implemented it using tcg ops
>
> v2:
> * Fix lxvw4x/stxvw4x translation, as LE/BE were both similar:
>   one in tcg and the other as a helper
> * Rename bswap32x2 to deposit32x2, as it does not need to
>   swap the (32-bit) content
> * Fix a bug in stxvh8x, as David suggested.
>
> v1:
> * More load/store cleanups in byte reverse routines
> * ld64/st64 converted to newer macro and updated call sites
> * Cleanup load with reservation and store conditional
> * Return invalid random for darn instruction
>
> v0:
> * darn - read /dev/random to get the random number
> * xxspltib - make it PPC64 only
> * Consolidate load/store operations and use macros to generate qemu_st/ld
> * Simplify load/store vsx endian manipulation
>
> Nikunj A Dadhania (6):
> target-ppc: improve lxvw4x implementation
> target-ppc: improve stxvw4x implementation
> target-ppc: add lxvh8x instruction
> target-ppc: add stxvh8x instruction
> target-ppc: add lxvb16x instruction
> target-ppc: add stxvb16x instruction
>
> Ravi Bangoria (3):
> target-ppc: Implement mfvsrld instruction
> target-ppc: Implement mtvsrdd instruction
> target-ppc: Implement mtvsrws instruction
>
> target-ppc/translate/vsx-impl.inc.c | 238 ++++++++++++++++++++++++++++++++----
> target-ppc/translate/vsx-ops.inc.c | 7 ++
> 2 files changed, 221 insertions(+), 24 deletions(-)
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson