From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: Laurent Desnogues <laurent.desnogues@gmail.com>,
	peter.maydell@linaro.org, alex.bennee@linaro.org
Subject: [PATCH v3 01/81] target/arm: Fix sve_uzp_p vs odd vector lengths
Date: Fri, 18 Sep 2020 11:36:31 -0700
Message-ID: <20200918183751.2787647-2-richard.henderson@linaro.org>
In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org>

The previous code missed compressing the second half of a predicate
with length vl % 512 > 256.

Adjust all of the x + (y << s) to x | (y << s) as a general style
fix. Drop the extract64 because the input uint64_t values are known
to be already zero-extended from the current size of the predicate.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/sve_helper.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 4758d46f34..fcb46f150f 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1938,7 +1938,7 @@ void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
if (oprsz <= 8) {
l = compress_bits(n[0] >> odd, esz);
h = compress_bits(m[0] >> odd, esz);
- d[0] = extract64(l + (h << (4 * oprsz)), 0, 8 * oprsz);
+ d[0] = l | (h << (4 * oprsz));
} else {
ARMPredicateReg tmp_m;
intptr_t oprsz_16 = oprsz / 16;
@@ -1952,23 +1952,35 @@ void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
h = n[2 * i + 1];
l = compress_bits(l >> odd, esz);
h = compress_bits(h >> odd, esz);
- d[i] = l + (h << 32);
+ d[i] = l | (h << 32);
}
- /* For VL which is not a power of 2, the results from M do not
- align nicely with the uint64_t for D. Put the aligned results
- from M into TMP_M and then copy it into place afterward. */
+ /*
+ * For VL which is not a multiple of 512, the results from M do not
+ * align nicely with the uint64_t for D. Put the aligned results
+ * from M into TMP_M and then copy it into place afterward.
+ */
if (oprsz & 15) {
- d[i] = compress_bits(n[2 * i] >> odd, esz);
+ int final_shift = (oprsz & 15) * 2;
+
+ l = n[2 * i + 0];
+ h = n[2 * i + 1];
+ l = compress_bits(l >> odd, esz);
+ h = compress_bits(h >> odd, esz);
+ d[i] = l | (h << final_shift);
for (i = 0; i < oprsz_16; i++) {
l = m[2 * i + 0];
h = m[2 * i + 1];
l = compress_bits(l >> odd, esz);
h = compress_bits(h >> odd, esz);
- tmp_m.p[i] = l + (h << 32);
+ tmp_m.p[i] = l | (h << 32);
}
- tmp_m.p[i] = compress_bits(m[2 * i] >> odd, esz);
+ l = m[2 * i + 0];
+ h = m[2 * i + 1];
+ l = compress_bits(l >> odd, esz);
+ h = compress_bits(h >> odd, esz);
+ tmp_m.p[i] = l | (h << final_shift);
swap_memmove(vd + oprsz / 2, &tmp_m, oprsz / 2);
} else {
@@ -1977,7 +1989,7 @@ void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc)
h = m[2 * i + 1];
l = compress_bits(l >> odd, esz);
h = compress_bits(h >> odd, esz);
- d[oprsz_16 + i] = l + (h << 32);
+ d[oprsz_16 + i] = l | (h << 32);
}
}
}
--
2.25.1
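
For readers following the oprsz <= 8 hunk, here is a minimal sketch of why
l | (h << (4 * oprsz)) needs no extract64 and matches the old addition. It is
not QEMU code: compress_even_units() is a hypothetical, loop-based stand-in
for compress_bits(), uzp_p_small() is a hypothetical model of that one branch,
and both assume the inputs are zero above bit 8 * oprsz - 1 and that odd is
either 0 or 1 << esz.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for compress_bits(): keep the even-numbered
 * units of (1 << esz) bits from x and pack them into the low 32 bits.
 * Written as a plain loop for clarity; not QEMU's implementation.
 */
static uint64_t compress_even_units(uint64_t x, int esz)
{
    assert(esz >= 0 && esz <= 3);

    int unit = 1 << esz;                  /* bits per unit: 1, 2, 4 or 8 */
    uint64_t mask = (1ull << unit) - 1;   /* unit <= 8, so no overflow */
    uint64_t r = 0;

    for (int j = 0; j < 32 / unit; j++) {
        uint64_t u = (x >> (2 * j * unit)) & mask;  /* input unit 2j */
        r |= u << (j * unit);                       /* output unit j */
    }
    return r;
}

/*
 * Hypothetical model of the oprsz <= 8 branch.  n0 and m0 are assumed
 * to be zero above bit 8 * oprsz - 1, so each compressed half fits in
 * 4 * oprsz bits and the two halves cannot overlap.
 */
static uint64_t uzp_p_small(uint64_t n0, uint64_t m0,
                            int oprsz, int esz, int odd)
{
    assert(oprsz > 0 && oprsz <= 8);

    uint64_t l = compress_even_units(n0 >> odd, esz);
    uint64_t h = compress_even_units(m0 >> odd, esz);

    return l | (h << (4 * oprsz));
}
```

Under those assumptions the two halves occupy disjoint bit ranges, so OR and
addition give the same value; OR simply states the intent of merging the
fields. For example, with a 4-byte predicate and esz = 0, selecting the even
bits of n0 = 0x55555555 and m0 = 0xAAAAAAAA fills only the low half of the
result, while selecting the odd bits fills only the high half.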