From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
To: linux-bluetooth@vger.kernel.org
Cc: Siarhei Siamashka <siarhei.siamashka@nokia.com>
Subject: [PATCH 3/5] sbc: slightly faster 'sbc_calc_scalefactors_neon'
Date: Fri, 2 Jul 2010 15:25:40 +0300
Message-ID: <1278073542-14859-4-git-send-email-siarhei.siamashka@gmail.com>
In-Reply-To: <1278073542-14859-1-git-send-email-siarhei.siamashka@gmail.com>
From: Siarhei Siamashka <siarhei.siamashka@nokia.com>
The previous variant was basically derived from the C and MMX
implementations. The new variant uses the 'vmax' instruction, which is
available in NEON and does this job faster. The same method for
calculating scale factors is also used in 'sbc_calc_scalefactors_j_neon'.
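To illustrate why taking the maximum can replace the OR-based
accumulation, here is a minimal scalar C sketch. It is not the code from
this patch. It assumes SCALE_OUT_BITS is 15 and that no sample equals
INT32_MIN, and the names abs32, clz32, scalefactor_or and
scalefactor_max are made up for this example. Only the position of the
highest set bit matters for the result, and the highest set bit of an OR
of non-negative values is the same as that of their maximum. Also,
max(x - 1, c) equals max(x, c + 1) - 1, so the decrement can be done
once after the loop instead of once per sample.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value for illustration; the real constant lives in the SBC sources. */
#define SCALE_OUT_BITS 15

/* Absolute value; the caller must not pass INT32_MIN. */
static int32_t abs32(int32_t v)
{
	return v < 0 ? -v : v;
}

/* Portable count-leading-zeros for a nonzero 32-bit value. */
static int clz32(uint32_t x)
{
	int n = 0;

	assert(x != 0);
	while (!(x & 0x80000000u)) {
		x <<= 1;
		n++;
	}
	return n;
}

/* OR-based variant: accumulate (|sample| - 1) for every nonzero sample. */
static int scalefactor_or(const int32_t *samples, int blocks)
{
	uint32_t x = 1u << SCALE_OUT_BITS;
	int blk;

	for (blk = 0; blk < blocks; blk++) {
		int32_t tmp = abs32(samples[blk]);

		if (tmp != 0)
			x |= (uint32_t)(tmp - 1);
	}
	return (31 - SCALE_OUT_BITS) - clz32(x);
}

/* Max-based variant: track the largest |sample|, subtract 1 once at the end. */
static int scalefactor_max(const int32_t *samples, int blocks)
{
	int32_t max = (1 << SCALE_OUT_BITS) + 1;
	int blk;

	for (blk = 0; blk < blocks; blk++) {
		int32_t tmp = abs32(samples[blk]);

		if (tmp > max)
			max = tmp;
	}
	return (31 - SCALE_OUT_BITS) - clz32((uint32_t)(max - 1));
}
```

The max-based loop body needs one operation per sample after the
absolute value, where the OR-based one needs a compare, an add and an
OR. That matches the instruction count reduction in the diff below.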
Benchmarked without joint stereo on ARM Cortex-A8:
== Before: ==
$ time ./sbcenc -b53 -s8 test.au > /dev/null
real 0m3.851s
user 0m3.375s
sys 0m0.469s
samples  %        image name  symbol name
26260    34.2672  sbcenc      sbc_pack_frame
20013    26.1154  sbcenc      sbc_analyze_4b_8s_neon
13796    18.0027  sbcenc      sbc_calculate_bits
8388     10.9457  no-vmlinux  /no-vmlinux
3229      4.2136  sbcenc      sbc_enc_process_input_8s_be_neon
2408      3.1422  sbcenc      sbc_calc_scalefactors_neon
2093      2.7312  sbcenc      sbc_encode
== After: ==
$ time ./sbcenc -b53 -s8 test.au > /dev/null
real 0m3.796s
user 0m3.344s
sys 0m0.438s
samples  %        image name  symbol name
26582    34.8726  sbcenc      sbc_pack_frame
20032    26.2797  sbcenc      sbc_analyze_4b_8s_neon
13808    18.1146  sbcenc      sbc_calculate_bits
8374     10.9858  no-vmlinux  /no-vmlinux
3187      4.1810  sbcenc      sbc_enc_process_input_8s_be_neon
2027      2.6592  sbcenc      sbc_encode
1766      2.3168  sbcenc      sbc_calc_scalefactors_neon
---
sbc/sbc_primitives_neon.c | 25 ++++++++++---------------
1 files changed, 10 insertions(+), 15 deletions(-)
diff --git a/sbc/sbc_primitives_neon.c b/sbc/sbc_primitives_neon.c
index 7713759..0572158 100644
--- a/sbc/sbc_primitives_neon.c
+++ b/sbc/sbc_primitives_neon.c
@@ -248,8 +248,11 @@ static void sbc_calc_scalefactors_neon(
int blk = blocks;
int32_t *in = &sb_sample_f[0][ch][sb];
asm volatile (
- "vmov.s32 q0, %[c1]\n"
+ "vmov.s32 q0, #0\n"
"vmov.s32 q1, %[c1]\n"
+ "vmov.s32 q14, #1\n"
+ "vmov.s32 q15, %[c2]\n"
+ "vadd.s32 q1, q1, q14\n"
"1:\n"
"vld1.32 {d16, d17}, [%[in], :128], %[inc]\n"
"vabs.s32 q8, q8\n"
@@ -259,22 +262,14 @@ static void sbc_calc_scalefactors_neon(
"vabs.s32 q10, q10\n"
"vld1.32 {d22, d23}, [%[in], :128], %[inc]\n"
"vabs.s32 q11, q11\n"
- "vcgt.s32 q12, q8, #0\n"
- "vcgt.s32 q13, q9, #0\n"
- "vcgt.s32 q14, q10, #0\n"
- "vcgt.s32 q15, q11, #0\n"
- "vadd.s32 q8, q8, q12\n"
- "vadd.s32 q9, q9, q13\n"
- "vadd.s32 q10, q10, q14\n"
- "vadd.s32 q11, q11, q15\n"
- "vorr.s32 q0, q0, q8\n"
- "vorr.s32 q1, q1, q9\n"
- "vorr.s32 q0, q0, q10\n"
- "vorr.s32 q1, q1, q11\n"
+ "vmax.s32 q0, q0, q8\n"
+ "vmax.s32 q1, q1, q9\n"
+ "vmax.s32 q0, q0, q10\n"
+ "vmax.s32 q1, q1, q11\n"
"subs %[blk], %[blk], #4\n"
"bgt 1b\n"
- "vorr.s32 q0, q0, q1\n"
- "vmov.s32 q15, %[c2]\n"
+ "vmax.s32 q0, q0, q1\n"
+ "vsub.s32 q0, q0, q14\n"
"vclz.s32 q0, q0\n"
"vsub.s32 q0, q15, q0\n"
"vst1.32 {d0, d1}, [%[out], :128]\n"
--
1.6.4.4