From: Peter Maydell <peter.maydell@linaro.org>
To: Anthony Liguori <anthony@codemonkey.ws>, qemu-devel@nongnu.org
Cc: Aurelien Jarno <aurelien@aurel32.net>
Subject: [Qemu-devel] [PATCH 08/10] target-arm: Fix VLD of single element to all lanes
Date: Fri, 1 Apr 2011 15:30:41 +0100
Message-ID: <1301668243-29886-9-git-send-email-peter.maydell@linaro.org>
In-Reply-To: <1301668243-29886-1-git-send-email-peter.maydell@linaro.org>

Fix several bugs in VLD of single element to all lanes:

The "single element to all lanes" form of VLD1 differs from those for
VLD2, VLD3 and VLD4 in that bit 5 indicates whether the loaded element
should be written to one or two Dregs (rather than being a register
stride). Handle this by special-casing VLD1 rather than trying to
have one loop which deals with both VLD1 and 2/3/4.

Handle VLD4.32 with 16 byte alignment specified, rather than UNDEFfing.

UNDEF for the invalid size and alignment combinations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target-arm/translate.c | 84 +++++++++++++++++++++++++++++++++--------------
1 files changed, 59 insertions(+), 25 deletions(-)
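
The decode rules from the commit message can be modelled outside QEMU
with a small standalone C sketch. It is only an illustration under
stated assumptions: AllLanesDecode, decode_vld_all_lanes() and
replicate_element() are hypothetical names that do not exist in
target-arm/translate.c, no TCG code is generated, and the checks made
before this point in disas_neon_ls_insn() (the load bit and
insn[11:10] == 3) are assumed to have passed already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a QEMU structure. */
typedef struct {
    int size;   /* log2 of the element size in bytes: 0, 1 or 2 */
    int nregs;  /* 1..4, i.e. VLD1..VLD4 */
    int ndregs; /* total number of D registers written */
    int stride; /* distance between successive D registers */
} AllLanesDecode;

/* Returns false in the cases where the patch returns 1 (UNDEF). */
static bool decode_vld_all_lanes(uint32_t insn, AllLanesDecode *d)
{
    int a = (insn >> 4) & 1;           /* alignment bit */
    int t = (insn >> 5) & 1;           /* bit 5: meaning depends on nregs */
    int size = (insn >> 6) & 3;
    int nregs = ((insn >> 8) & 3) + 1;

    if (size == 3) {
        /* Only VLD4 with a == 1 is valid: 32 bits at 16 byte alignment */
        if (nregs != 4 || a == 0) {
            return false;
        }
        size = 2;
    }
    if (nregs == 1 && a == 1 && size == 0) {
        return false;
    }
    if (nregs == 3 && a == 1) {
        return false;
    }

    d->size = size;
    d->nregs = nregs;
    if (nregs == 1) {
        /* VLD1: bit 5 selects one or two Dregs, never a stride */
        d->ndregs = t ? 2 : 1;
        d->stride = 1;
    } else {
        /* VLD2/3/4: bit 5 selects the register stride */
        d->ndregs = nregs;
        d->stride = t ? 2 : 1;
    }
    return true;
}

/* Replicate one element across a 32 bit value, as the helper does. */
static uint32_t replicate_element(uint32_t value, int size)
{
    assert(size >= 0 && size <= 2);
    switch (size) {
    case 0:
        return (value & 0xff) * 0x01010101u;
    case 1:
        value &= 0xffff;
        return value | (value << 16);
    default:
        return value;
    }
}
```

Keeping the validity checks ahead of any register writes matches the
shape of the patch, where every UNDEF return happens before
load_reg_var() is called.
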
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 6ce8b7a..e79ea03 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2648,6 +2648,28 @@ static void gen_neon_dup_high16(TCGv var)
     tcg_temp_free_i32(tmp);
 }
 
+static TCGv gen_load_and_replicate(DisasContext *s, TCGv addr, int size)
+{
+    /* Load a single Neon element and replicate into a 32 bit TCG reg */
+    TCGv tmp;
+    switch (size) {
+    case 0:
+        tmp = gen_ld8u(addr, IS_USER(s));
+        gen_neon_dup_u8(tmp, 0);
+        break;
+    case 1:
+        tmp = gen_ld16u(addr, IS_USER(s));
+        gen_neon_dup_low16(tmp);
+        break;
+    case 2:
+        tmp = gen_ld32(addr, IS_USER(s));
+        break;
+    default: /* Avoid compiler warnings. */
+        abort();
+    }
+    return tmp;
+}
+
 /* Disassemble a VFP instruction. Returns nonzero if an error occured
    (ie. an undefined instruction). */
 static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
@@ -3890,36 +3912,48 @@ static int disas_neon_ls_insn(CPUState * env, DisasContext *s, uint32_t insn)
         size = (insn >> 10) & 3;
         if (size == 3) {
             /* Load single element to all lanes. */
-            if (!load)
+            int a = (insn >> 4) & 1;
+            if (!load) {
                 return 1;
+            }
             size = (insn >> 6) & 3;
             nregs = ((insn >> 8) & 3) + 1;
-            stride = (insn & (1 << 5)) ? 2 : 1;
-            load_reg_var(s, addr, rn);
-            for (reg = 0; reg < nregs; reg++) {
-                switch (size) {
-                case 0:
-                    tmp = gen_ld8u(addr, IS_USER(s));
-                    gen_neon_dup_u8(tmp, 0);
-                    break;
-                case 1:
-                    tmp = gen_ld16u(addr, IS_USER(s));
-                    gen_neon_dup_low16(tmp);
-                    break;
-                case 2:
-                    tmp = gen_ld32(addr, IS_USER(s));
-                    break;
-                case 3:
+
+            if (size == 3) {
+                if (nregs != 4 || a == 0) {
                     return 1;
-                default: /* Avoid compiler warnings. */
-                    abort();
                 }
-                tcg_gen_addi_i32(addr, addr, 1 << size);
-                tmp2 = tcg_temp_new_i32();
-                tcg_gen_mov_i32(tmp2, tmp);
-                neon_store_reg(rd, 0, tmp2);
-                neon_store_reg(rd, 1, tmp);
-                rd += stride;
+                /* For VLD4 size==3 a == 1 means 32 bits at 16 byte alignment */
+                size = 2;
+            }
+            if (nregs == 1 && a == 1 && size == 0) {
+                return 1;
+            }
+            if (nregs == 3 && a == 1) {
+                return 1;
+            }
+            load_reg_var(s, addr, rn);
+            if (nregs == 1) {
+                /* VLD1 to all lanes: bit 5 indicates how many Dregs to write */
+                tmp = gen_load_and_replicate(s, addr, size);
+                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
+                tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
+                if (insn & (1 << 5)) {
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 0));
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd + 1, 1));
+                }
+                tcg_temp_free_i32(tmp);
+            } else {
+                /* VLD2/3/4 to all lanes: bit 5 indicates register stride */
+                stride = (insn & (1 << 5)) ? 2 : 1;
+                for (reg = 0; reg < nregs; reg++) {
+                    tmp = gen_load_and_replicate(s, addr, size);
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 0));
+                    tcg_gen_st_i32(tmp, cpu_env, neon_reg_offset(rd, 1));
+                    tcg_temp_free_i32(tmp);
+                    tcg_gen_addi_i32(addr, addr, 1 << size);
+                    rd += stride;
+                }
             }
             stride = (1 << size) * nregs;
         } else {
--
1.7.1
2011-04-01 14:30 [Qemu-devel] [PATCH 00/10] [PULL] ARM Neon fixes Peter Maydell
2011-04-01 14:30 ` [Qemu-devel] [PATCH 01/10] target-arm: Make Neon helper routines use correct FP status Peter Maydell
2011-04-01 18:29 ` Blue Swirl
2011-04-01 22:33 ` Peter Maydell
2011-04-03 9:41 ` Blue Swirl
2011-04-03 10:51 ` Peter Maydell
2011-04-03 11:10 ` Blue Swirl
2011-04-03 11:21 ` Peter Maydell
2011-04-03 11:52 ` Blue Swirl
2011-04-03 15:12 ` Aurelien Jarno
2011-04-03 15:32 ` Peter Maydell
2011-04-03 16:01 ` Aurelien Jarno
2011-04-01 14:30 ` [Qemu-devel] [PATCH 02/10] target-arm/neon_helper.c: Use make_float32/float32_val macros Peter Maydell
2011-04-01 14:30 ` [Qemu-devel] [PATCH 03/10] target-arm: Return right result for Neon comparison with NaNs Peter Maydell
2011-04-01 14:30 ` [Qemu-devel] [PATCH 04/10] target-arm: Fix VCLE.F32 #0, VCLT.F32 #0 NaN handling Peter Maydell
2011-04-01 14:30 ` [Qemu-devel] [PATCH 05/10] target-arm: Correct ABD's handling of negative zeroes Peter Maydell
2011-04-01 14:30 ` [Qemu-devel] [PATCH 06/10] softfloat: Add float*_min() and float*_max() functions Peter Maydell
2011-04-01 14:30 ` [Qemu-devel] [PATCH 07/10] target-arm: Use new softfloat min/max functions for VMAX, VMIN Peter Maydell
2011-04-01 14:30 ` Peter Maydell [this message]
2011-04-01 14:30 ` [Qemu-devel] [PATCH 09/10] target-arm: Don't leak TCG temp for UNDEFs in Neon load/store space Peter Maydell
2011-04-01 14:30 ` [Qemu-devel] [PATCH 10/10] target-arm/helper.c: For float-int conversion helpers pass ints as ints Peter Maydell
2011-04-01 18:25 ` Blue Swirl
2011-04-03 16:03 ` [Qemu-devel] [PATCH 00/10] [PULL] ARM Neon fixes Aurelien Jarno