From: Eric Biggers <ebiggers@kernel.org>
To: l00374334 <liqiang64@huawei.com>
Cc: herbert@gondor.apana.org.au, davem@davemloft.net,
catalin.marinas@arm.com, will@kernel.org,
mcoquelin.stm32@gmail.com, alexandre.torgue@st.com,
linux-arm-kernel@lists.infradead.org,
linux-crypto@vger.kernel.org
Subject: Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.
Date: Wed, 4 Nov 2020 09:57:42 -0800
Message-ID: <20201104175742.GA846@sol.localdomain>
In-Reply-To: <20201103121506.1533-2-liqiang64@huawei.com>

On Tue, Nov 03, 2020 at 08:15:06PM +0800, l00374334 wrote:
> From: liqiang <liqiang64@huawei.com>
>
> In the libz library, the checksum algorithm adler32 usually occupies
> a relatively high hot spot, and the SVE instruction set can easily
> accelerate it, so that the performance of libz library will be
> significantly improved.
>
> We can divides buf into blocks according to the bit width of SVE,
> and then uses vector registers to perform operations in units of blocks
> to achieve the purpose of acceleration.
>
> On machines that support ARM64 sve instructions, this algorithm is
> about 3~4 times faster than the algorithm implemented in C language
> in libz. The wider the SVE instruction, the better the acceleration effect.
>
> Measured on a Taishan 1951 machine that supports 256bit width SVE,
> below are the results of my measured random data of 1M and 10M:
>
> [root@xxx adler32]# ./benchmark 1000000
> Libz alg: Time used: 608 us, 1644.7 Mb/s.
> SVE alg: Time used: 166 us, 6024.1 Mb/s.
>
> [root@xxx adler32]# ./benchmark 10000000
> Libz alg: Time used: 6484 us, 1542.3 Mb/s.
> SVE alg: Time used: 2034 us, 4916.4 Mb/s.
>
> The blocks can be of any size, so the algorithm can automatically adapt
> to SVE hardware with different bit widths without modifying the code.
>
>
> Signed-off-by: liqiang <liqiang64@huawei.com>

Note that this patch does nothing to actually wire up the kernel's copy of libz
(lib/zlib_{deflate,inflate}/) to use this implementation of Adler32. To do so,
libz would either need to be changed to use the shash API, or you'd need to
implement an adler32() function in lib/crypto/ that automatically uses an
accelerated implementation if available, and make libz call it.

Also, in either case a C implementation would be required too. There can't be
just an architecture-specific implementation.
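
To make the second option concrete, here is a minimal userspace sketch of a
library-style entry point that always has a portable C fallback and uses an
accelerated routine only when one has been registered. All names here
(adler32_example, adler32_generic, adler32_arch_impl) are hypothetical, and the
function-pointer hook is only one possible dispatch mechanism; it is not
existing kernel code and not a proposal for the exact interface.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER32_MOD  65521u	/* largest prime below 2^16 */
#define ADLER32_NMAX 5552	/* bytes that can be summed before reducing */

/* Portable fallback, always available. */
static uint32_t adler32_generic(uint32_t adler, const uint8_t *buf, size_t len)
{
	uint32_t s1 = adler & 0xffff;
	uint32_t s2 = adler >> 16;

	while (len) {
		size_t n = len < ADLER32_NMAX ? len : ADLER32_NMAX;

		len -= n;
		while (n--) {
			s1 += *buf++;
			s2 += s1;
		}
		/* Reduce once per chunk so the 32-bit sums cannot overflow. */
		s1 %= ADLER32_MOD;
		s2 %= ADLER32_MOD;
	}
	return (s2 << 16) | s1;
}

typedef uint32_t (*adler32_fn)(uint32_t adler, const uint8_t *buf, size_t len);

/*
 * Hypothetical hook: architecture code would set this during init if an
 * accelerated implementation is usable. NULL means "use the C fallback".
 */
static adler32_fn adler32_arch_impl;

/* Hypothetical library entry point that callers such as libz would use. */
uint32_t adler32_example(uint32_t adler, const uint8_t *buf, size_t len)
{
	adler32_fn fn = adler32_arch_impl ? adler32_arch_impl : adler32_generic;

	return fn(adler, buf, len);
}
```

Callers would start from an initial value of 1 and would not need to know which
implementation ends up running.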

Also, as others have pointed out, there's probably not much point in having an
SVE implementation of Adler32 when there isn't even a NEON implementation yet.
It's not too hard to implement Adler32 using NEON, and there are already several
permissively licensed NEON implementations out there that could be used as a
reference, e.g. my implementation using NEON intrinsics here:

https://github.com/ebiggers/libdeflate/blob/v1.6/lib/arm/adler32_impl.h
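
For readers who have not seen why Adler32 vectorizes easily, the sketch below
shows the block-wise form of the two sums in plain C. It is not taken from the
libdeflate file linked above and uses no NEON intrinsics; BLOCK is assumed to
be 16 only because that is the byte width of a 128-bit NEON register, and the
function name adler32_blocked is hypothetical. Within a block, the plain byte
sum and the position-weighted byte sum are independent per byte, which is the
part a vector unit can compute in parallel.

```c
#include <stddef.h>
#include <stdint.h>

#define MOD   65521u	/* Adler32 modulus */
#define BLOCK 16u	/* assumed block size: bytes in a 128-bit register */

uint32_t adler32_blocked(uint32_t adler, const uint8_t *buf, size_t len)
{
	uint32_t s1 = adler & 0xffff;
	uint32_t s2 = adler >> 16;

	while (len >= BLOCK) {
		uint32_t byte_sum = 0;		/* sum of b[i] */
		uint32_t weighted_sum = 0;	/* sum of (BLOCK - i) * b[i] */
		uint32_t i;

		for (i = 0; i < BLOCK; i++) {
			byte_sum += buf[i];
			weighted_sum += (BLOCK - i) * buf[i];
		}
		/*
		 * Processing BLOCK bytes one at a time adds the old s1 to s2
		 * BLOCK times, plus each byte weighted by how many of the
		 * remaining steps include it.
		 */
		s2 = (s2 + BLOCK * s1 + weighted_sum) % MOD;
		s1 = (s1 + byte_sum) % MOD;
		buf += BLOCK;
		len -= BLOCK;
	}

	/* Scalar tail for the last partial block. */
	while (len--) {
		s1 = (s1 + *buf++) % MOD;
		s2 = (s2 + s1) % MOD;
	}
	return (s2 << 16) | s1;
}
```

Reducing after every block keeps the sketch simple; a tuned implementation
would accumulate over many blocks before taking the modulus.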

- Eric