qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift where src, dest overlap
@ 2011-02-21 11:05 Peter Maydell
  2011-02-21 11:05 ` [Qemu-devel] [PATCH v3 1/2] target-arm: Refactor to pull narrowing decode into separate function Peter Maydell
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Peter Maydell @ 2011-02-21 11:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

This patchset is a replacement for the unapplied patch 10/10
of my shift-fixes series (http://patchwork.ozlabs.org/patch/83248/)
which corrects the compilation failure.

Since the fix involves refactoring to pull out some code into a
separate function and then pulling the size check outside of
the loop, I've done it as two patches for clarity.

Change from v3: removed two now-unnecessary uses of TCGV_UNUSED
(forgot to stg refresh before sending out patch series, sorry.)

Peter Maydell (2):
  target-arm: Refactor to pull narrowing decode into separate function
  target-arm: Fix shift by immediate and narrow where src, dest overlap

 target-arm/translate.c |  127 +++++++++++++++++++++++------------------------
 1 files changed, 62 insertions(+), 65 deletions(-)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Qemu-devel] [PATCH v3 1/2] target-arm: Refactor to pull narrowing decode into separate function
  2011-02-21 11:05 [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift where src, dest overlap Peter Maydell
@ 2011-02-21 11:05 ` Peter Maydell
  2011-02-21 11:05 ` [Qemu-devel] [PATCH v3 2/2] target-arm: Fix shift by immediate and narrow where src, dest overlap Peter Maydell
  2011-02-21 11:25 ` [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift " Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Maydell @ 2011-02-21 11:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

Pull the code which decodes narrowing operations as being either
signed/unsigned saturate or plain out into its own function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c |   45 +++++++++++++++++++--------------------------
 1 files changed, 19 insertions(+), 26 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 5f377a4..fa20e84 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4164,6 +4164,23 @@ static inline void gen_neon_mull(TCGv_i64 dest, TCGv a, TCGv b, int size, int u)
     }
 }
 
+static void gen_neon_narrow_op(int op, int u, int size, TCGv dest, TCGv_i64 src)
+{
+    if (op) {
+        if (u) {
+            gen_neon_unarrow_sats(size, dest, src);
+        } else {
+            gen_neon_narrow(size, dest, src);
+        }
+    } else {
+        if (u) {
+            gen_neon_narrow_satu(size, dest, src);
+        } else {
+            gen_neon_narrow_sats(size, dest, src);
+        }
+    }
+}
+
 /* Translate a NEON data processing instruction.  Return nonzero if the
    instruction is invalid.
    We process data in a mixture of 32-bit and 64-bit chunks.
@@ -4839,19 +4856,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         dead_tmp(tmp3);
                     }
                     tmp = new_tmp();
-                    if (op == 8) {
-                        if (u) { /* VQSHRUN / VQRSHRUN */
-                            gen_neon_unarrow_sats(size - 1, tmp, cpu_V0);
-                        } else { /* VSHRN / VRSHRN */
-                            gen_neon_narrow(size - 1, tmp, cpu_V0);
-                        }
-                    } else {
-                        if (u) { /* VQSHRN / VQRSHRN */
-                            gen_neon_narrow_satu(size - 1, tmp, cpu_V0);
-                        } else { /* VQSHRN / VQRSHRN */
-                            gen_neon_narrow_sats(size - 1, tmp, cpu_V0);
-                        }
-                    }
+                    gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
                     neon_store_reg(rd, pass, tmp);
                 } /* for pass */
                 if (size == 3) {
@@ -5439,19 +5444,7 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
                     for (pass = 0; pass < 2; pass++) {
                         neon_load_reg64(cpu_V0, rm + pass);
                         tmp = new_tmp();
-                        if (op == 36) {
-                            if (q) { /* VQMOVUN */
-                                gen_neon_unarrow_sats(size, tmp, cpu_V0);
-                            } else { /* VMOVN */
-                                gen_neon_narrow(size, tmp, cpu_V0);
-                            }
-                        } else { /* VQMOVN */
-                            if (q) {
-                                gen_neon_narrow_satu(size, tmp, cpu_V0);
-                            } else {
-                                gen_neon_narrow_sats(size, tmp, cpu_V0);
-                            }
-                        }
+                        gen_neon_narrow_op(op == 36, q, size, tmp, cpu_V0);
                         if (pass == 0) {
                             tmp2 = tmp;
                         } else {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [Qemu-devel] [PATCH v3 2/2] target-arm: Fix shift by immediate and narrow where src, dest overlap
  2011-02-21 11:05 [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift where src, dest overlap Peter Maydell
  2011-02-21 11:05 ` [Qemu-devel] [PATCH v3 1/2] target-arm: Refactor to pull narrowing decode into separate function Peter Maydell
@ 2011-02-21 11:05 ` Peter Maydell
  2011-02-21 11:25 ` [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift " Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Maydell @ 2011-02-21 11:05 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

For Neon shifts by immediate and narrow, correctly handle the case
where the source registers and the destination registers overlap
(the second pass should use the original register contents, not the
results of the first pass).

This includes a refactoring to pull the size check outside the
loop rather than inside, since there is now very little common
code between the size == 3 and size != 3 case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target-arm/translate.c |   84 +++++++++++++++++++++++++-----------------------
 1 files changed, 44 insertions(+), 40 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index fa20e84..dbd958b 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4804,64 +4804,68 @@ static int disas_neon_data_insn(CPUState * env, DisasContext *s, uint32_t insn)
 
                 shift = shift - (1 << (size + 3));
                 size++;
-                switch (size) {
-                case 1:
-                    imm = (uint16_t)shift;
-                    imm |= imm << 16;
-                    tmp2 = tcg_const_i32(imm);
-                    TCGV_UNUSED_I64(tmp64);
-                    break;
-                case 2:
-                    imm = (uint32_t)shift;
-                    tmp2 = tcg_const_i32(imm);
-                    TCGV_UNUSED_I64(tmp64);
-                    break;
-                case 3:
+                if (size == 3) {
                     tmp64 = tcg_const_i64(shift);
-                    TCGV_UNUSED(tmp2);
-                    break;
-                default:
-                    abort();
-                }
-
-                for (pass = 0; pass < 2; pass++) {
-                    if (size == 3) {
-                        neon_load_reg64(cpu_V0, rm + pass);
+                    neon_load_reg64(cpu_V0, rm);
+                    neon_load_reg64(cpu_V1, rm + 1);
+                    for (pass = 0; pass < 2; pass++) {
+                        TCGv_i64 in;
+                        if (pass == 0) {
+                            in = cpu_V0;
+                        } else {
+                            in = cpu_V1;
+                        }
                         if (q) {
                             if (input_unsigned) {
-                                gen_helper_neon_rshl_u64(cpu_V0, cpu_V0,
-                                                         tmp64);
+                                gen_helper_neon_rshl_u64(cpu_V0, in, tmp64);
                             } else {
-                                gen_helper_neon_rshl_s64(cpu_V0, cpu_V0,
-                                                         tmp64);
+                                gen_helper_neon_rshl_s64(cpu_V0, in, tmp64);
                             }
                         } else {
                             if (input_unsigned) {
-                                gen_helper_neon_shl_u64(cpu_V0, cpu_V0,
-                                                        tmp64);
+                                gen_helper_neon_shl_u64(cpu_V0, in, tmp64);
                             } else {
-                                gen_helper_neon_shl_s64(cpu_V0, cpu_V0,
-                                                        tmp64);
+                                gen_helper_neon_shl_s64(cpu_V0, in, tmp64);
                             }
                         }
+                        tmp = new_tmp();
+                        gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
+                        neon_store_reg(rd, pass, tmp);
+                    } /* for pass */
+                    tcg_temp_free_i64(tmp64);
+                } else {
+                    if (size == 1) {
+                        imm = (uint16_t)shift;
+                        imm |= imm << 16;
                     } else {
-                        tmp = neon_load_reg(rm + pass, 0);
+                        /* size == 2 */
+                        imm = (uint32_t)shift;
+                    }
+                    tmp2 = tcg_const_i32(imm);
+                    tmp4 = neon_load_reg(rm + 1, 0);
+                    tmp5 = neon_load_reg(rm + 1, 1);
+                    for (pass = 0; pass < 2; pass++) {
+                        if (pass == 0) {
+                            tmp = neon_load_reg(rm, 0);
+                        } else {
+                            tmp = tmp4;
+                        }
                         gen_neon_shift_narrow(size, tmp, tmp2, q,
                                               input_unsigned);
-                        tmp3 = neon_load_reg(rm + pass, 1);
+                        if (pass == 0) {
+                            tmp3 = neon_load_reg(rm, 1);
+                        } else {
+                            tmp3 = tmp5;
+                        }
                         gen_neon_shift_narrow(size, tmp3, tmp2, q,
                                               input_unsigned);
                         tcg_gen_concat_i32_i64(cpu_V0, tmp, tmp3);
                         dead_tmp(tmp);
                         dead_tmp(tmp3);
-                    }
-                    tmp = new_tmp();
-                    gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
-                    neon_store_reg(rd, pass, tmp);
-                } /* for pass */
-                if (size == 3) {
-                    tcg_temp_free_i64(tmp64);
-                } else {
+                        tmp = new_tmp();
+                        gen_neon_narrow_op(op == 8, u, size - 1, tmp, cpu_V0);
+                        neon_store_reg(rd, pass, tmp);
+                    } /* for pass */
                     tcg_temp_free_i32(tmp2);
                 }
             } else if (op == 10) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift where src, dest overlap
  2011-02-21 11:05 [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift where src, dest overlap Peter Maydell
  2011-02-21 11:05 ` [Qemu-devel] [PATCH v3 1/2] target-arm: Refactor to pull narrowing decode into separate function Peter Maydell
  2011-02-21 11:05 ` [Qemu-devel] [PATCH v3 2/2] target-arm: Fix shift by immediate and narrow where src, dest overlap Peter Maydell
@ 2011-02-21 11:25 ` Peter Maydell
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Maydell @ 2011-02-21 11:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: patches

On 21 February 2011 11:05, Peter Maydell <peter.maydell@linaro.org> wrote:
> Change from v3: removed two now-unnecessary uses of TCGV_UNUSED
> (forgot to stg refresh before sending out patch series, sorry.)

...obviously I meant "Change from v2". Ah, Monday mornings :-)

-- PMM

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-02-21 11:25 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-02-21 11:05 [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift where src, dest overlap Peter Maydell
2011-02-21 11:05 ` [Qemu-devel] [PATCH v3 1/2] target-arm: Refactor to pull narrowing decode into separate function Peter Maydell
2011-02-21 11:05 ` [Qemu-devel] [PATCH v3 2/2] target-arm: Fix shift by immediate and narrow where src, dest overlap Peter Maydell
2011-02-21 11:25 ` [Qemu-devel] [PATCH v3 0/2] target-arm: fix shift " Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).