From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: linux-arm-kernel@lists.infradead.org
Cc: catalin.marinas@arm.com, will.deacon@arm.com,
jeremy.linton@arm.com, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH] lib/raid6: arm: optimize away a mask operation in NEON recovery routine
Date: Tue, 26 Feb 2019 12:36:18 +0100
Message-ID: <20190226113618.17891-1-ard.biesheuvel@linaro.org>
The NEON recovery code was modeled after the x86 SIMD code, and for
some reason, that code uses a 16-bit wide signed shift and a mask to
perform what amounts to an 8-bit unsigned shift. So fold the two ops
together.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
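A scalar sketch of why the two sequences agree, for reference only. It
models a single 16-bit lane (two adjacent bytes) in plain C rather than
using the NEON intrinsics, so it can be built on any host. The helper
names old_shift_mask(), new_shift(), count_mismatches() and
check_equivalence() are made up for this illustration and do not exist
in lib/raid6. In the old sequence the low nibble of the upper byte and
the replicated sign bit end up in the high nibbles of the result, which
is what the 0x0f mask removed; the per-byte unsigned shift never
produces those bits in the first place.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Old sequence on one 16-bit lane: arithmetic shift right by 4, then
 * mask every byte with 0x0f. The arithmetic shift is emulated by hand
 * so the sketch does not rely on implementation-defined behaviour.
 */
static uint16_t old_shift_mask(uint16_t lane)
{
	uint16_t shifted = lane >> 4;

	if (lane & 0x8000)
		shifted |= 0xf000;	/* replicate the sign bit */

	return shifted & 0x0f0f;	/* the mask the patch removes */
}

/* New sequence: each byte is shifted on its own, unsigned, no mask. */
static uint16_t new_shift(uint16_t lane)
{
	uint8_t lo = (uint8_t)(lane & 0xff) >> 4;
	uint8_t hi = (uint8_t)(lane >> 8) >> 4;

	return (uint16_t)((hi << 8) | lo);
}

/* Compare both sequences over every possible 16-bit lane value. */
int count_mismatches(void)
{
	int mismatches = 0;
	uint32_t v;

	for (v = 0; v <= 0xffff; v++)
		if (old_shift_mask((uint16_t)v) != new_shift((uint16_t)v))
			mismatches++;

	return mismatches;
}

/* Aborts if the two sequences ever disagree. */
void check_equivalence(void)
{
	assert(count_mismatches() == 0);
}
```

Calling check_equivalence() from a small test harness exercises all
65536 lane values. Since a 16-byte vector is just eight such lanes,
the same argument carries over to the vector code touched below.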
lib/raid6/recov_neon_inner.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/lib/raid6/recov_neon_inner.c b/lib/raid6/recov_neon_inner.c
index 8cd20c9f834a..f80e8cead9cf 100644
--- a/lib/raid6/recov_neon_inner.c
+++ b/lib/raid6/recov_neon_inner.c
@@ -60,14 +60,14 @@ void __raid6_2data_recov_neon(int bytes, uint8_t *p, uint8_t *q, uint8_t *dp,
px = veorq_u8(vld1q_u8(p), vld1q_u8(dp));
vx = veorq_u8(vld1q_u8(q), vld1q_u8(dq));
- vy = (uint8x16_t)vshrq_n_s16((int16x8_t)vx, 4);
+ vy = vshrq_n_u8(vx, 4);
vx = vqtbl1q_u8(qm0, vandq_u8(vx, x0f));
- vy = vqtbl1q_u8(qm1, vandq_u8(vy, x0f));
+ vy = vqtbl1q_u8(qm1, vy);
qx = veorq_u8(vx, vy);
- vy = (uint8x16_t)vshrq_n_s16((int16x8_t)px, 4);
+ vy = vshrq_n_u8(px, 4);
vx = vqtbl1q_u8(pm0, vandq_u8(px, x0f));
- vy = vqtbl1q_u8(pm1, vandq_u8(vy, x0f));
+ vy = vqtbl1q_u8(pm1, vy);
vx = veorq_u8(vx, vy);
db = veorq_u8(vx, qx);
@@ -100,9 +100,9 @@ void __raid6_datap_recov_neon(int bytes, uint8_t *p, uint8_t *q, uint8_t *dq,
vx = veorq_u8(vld1q_u8(q), vld1q_u8(dq));
- vy = (uint8x16_t)vshrq_n_s16((int16x8_t)vx, 4);
+ vy = vshrq_n_u8(vx, 4);
vx = vqtbl1q_u8(qm0, vandq_u8(vx, x0f));
- vy = vqtbl1q_u8(qm1, vandq_u8(vy, x0f));
+ vy = vqtbl1q_u8(qm1, vy);
vx = veorq_u8(vx, vy);
vy = veorq_u8(vx, vld1q_u8(p));
--
2.20.1
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel