From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org, qemu-stable@nongnu.org
Subject: [PATCH 02/13] target/arm: Fix SQDMULH (by element) with Q=0
Date: Mon, 24 Jun 2024 22:07:59 -0700
Message-ID: <20240625050810.1475643-3-richard.henderson@linaro.org>
In-Reply-To: <20240625050810.1475643-1-richard.henderson@linaro.org>
With Q=0 the operation is only 8 bytes wide, so it covers fewer
elements than one 16-byte segment.  The inner loop, bounded by
eltspersegment, must not iterate past the bound of the outer loop,
elements.
Cc: qemu-stable@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/vec_helper.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
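
The loop-bound problem described in the commit message can be looked at
in isolation. The following is a minimal sketch, not code from QEMU: the
function names touched_old, touched_fixed and self_check are hypothetical,
and it assumes that simd_oprsz(desc) yields 8 for Q=0 and 16 for Q=1.
Each function replays only the loop bounds for the 16-bit helpers and
reports one past the highest element index the inner loop would reach.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Pre-patch bounds: the inner loop always walks a full 16-byte segment. */
static inline intptr_t touched_old(intptr_t opr_sz)
{
    intptr_t i, j, last = 0;

    for (i = 0; i < opr_sz / 2; i += 16 / 2) {
        for (j = 0; j < 16 / 2; ++j) {
            last = i + j + 1;   /* one past the highest index reached */
        }
    }
    return last;
}

/* Post-patch bounds: the segment length is clamped to the element count. */
static inline intptr_t touched_fixed(intptr_t opr_sz)
{
    intptr_t elements = opr_sz / 2;
    intptr_t eltspersegment = MIN(16 / 2, elements);
    intptr_t i, j, last = 0;

    for (i = 0; i < elements; i += 16 / 2) {
        for (j = 0; j < eltspersegment; ++j) {
            last = i + j + 1;
        }
    }
    return last;
}

/* Hypothetical self-check comparing both variants. */
static inline void self_check(void)
{
    /* Assumed Q=0 size: 4 elements are valid, the old bounds reach 8. */
    assert(touched_old(8) == 8);
    assert(touched_fixed(8) == 4);
    /* Assumed Q=1 size: both variants cover exactly one full segment. */
    assert(touched_old(16) == 8);
    assert(touched_fixed(16) == 8);
}
```

Calling self_check() from a small main() is enough to exercise it. For
operation sizes of 16 bytes or more the two variants agree, so under
these assumptions the clamp only changes behaviour in the Q=0 case.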
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 7b34cc98af..d477479bb1 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -317,10 +317,12 @@ void HELPER(neon_sqdmulh_idx_h)(void *vd, void *vn, void *vm,
intptr_t i, j, opr_sz = simd_oprsz(desc);
int idx = simd_data(desc);
int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx);
+ intptr_t elements = opr_sz / 2;
+ intptr_t eltspersegment = MIN(16 / 2, elements);
- for (i = 0; i < opr_sz / 2; i += 16 / 2) {
+ for (i = 0; i < elements; i += 16 / 2) {
int16_t mm = m[i];
- for (j = 0; j < 16 / 2; ++j) {
+ for (j = 0; j < eltspersegment; ++j) {
d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, false, vq);
}
}
@@ -333,10 +335,12 @@ void HELPER(neon_sqrdmulh_idx_h)(void *vd, void *vn, void *vm,
intptr_t i, j, opr_sz = simd_oprsz(desc);
int idx = simd_data(desc);
int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx);
+ intptr_t elements = opr_sz / 2;
+ intptr_t eltspersegment = MIN(16 / 2, elements);
- for (i = 0; i < opr_sz / 2; i += 16 / 2) {
+ for (i = 0; i < elements; i += 16 / 2) {
int16_t mm = m[i];
- for (j = 0; j < 16 / 2; ++j) {
+ for (j = 0; j < eltspersegment; ++j) {
d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, true, vq);
}
}
@@ -512,10 +516,12 @@ void HELPER(neon_sqdmulh_idx_s)(void *vd, void *vn, void *vm,
intptr_t i, j, opr_sz = simd_oprsz(desc);
int idx = simd_data(desc);
int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx);
+ intptr_t elements = opr_sz / 4;
+ intptr_t eltspersegment = MIN(16 / 4, elements);
- for (i = 0; i < opr_sz / 4; i += 16 / 4) {
+ for (i = 0; i < elements; i += 16 / 4) {
int32_t mm = m[i];
- for (j = 0; j < 16 / 4; ++j) {
+ for (j = 0; j < eltspersegment; ++j) {
d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, false, vq);
}
}
@@ -528,10 +534,12 @@ void HELPER(neon_sqrdmulh_idx_s)(void *vd, void *vn, void *vm,
intptr_t i, j, opr_sz = simd_oprsz(desc);
int idx = simd_data(desc);
int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx);
+ intptr_t elements = opr_sz / 4;
+ intptr_t eltspersegment = MIN(16 / 4, elements);
- for (i = 0; i < opr_sz / 4; i += 16 / 4) {
+ for (i = 0; i < elements; i += 16 / 4) {
int32_t mm = m[i];
- for (j = 0; j < 16 / 4; ++j) {
+ for (j = 0; j < eltspersegment; ++j) {
d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, true, vq);
}
}
--
2.34.1