Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.

From: Eric Biggers <ebiggers@kernel.org>
To: l00374334 <liqiang64@huawei.com>
Cc: herbert@gondor.apana.org.au, davem@davemloft.net,
	catalin.marinas@arm.com, will@kernel.org,
	mcoquelin.stm32@gmail.com, alexandre.torgue@st.com,
	linux-arm-kernel@lists.infradead.org,
	linux-crypto@vger.kernel.org
Subject: Re: [PATCH 1/1] arm64: Accelerate Adler32 using arm64 SVE instructions.
Date: Wed, 4 Nov 2020 09:57:42 -0800
Message-ID: <20201104175742.GA846@sol.localdomain>
In-Reply-To: <20201103121506.1533-2-liqiang64@huawei.com>

On Tue, Nov 03, 2020 at 08:15:06PM +0800, l00374334 wrote:
> From: liqiang <liqiang64@huawei.com>
> 
> 	In the libz library, the checksum algorithm adler32 usually occupies
> 	a relatively high hot spot, and the SVE instruction set can easily
> 	accelerate it, so that the performance of libz library will be
> 	significantly improved.
> 
> 	We can divide buf into blocks according to the bit width of SVE,
> 	and then use vector registers to perform operations in units of blocks
> 	to achieve the purpose of acceleration.
> 
> 	On machines that support ARM64 sve instructions, this algorithm is
> 	about 3~4 times faster than the algorithm implemented in C language
> 	in libz. The wider the SVE instruction, the better the acceleration effect.
> 
> 	Measured on a Taishan 1951 machine that supports 256bit width SVE,
> 	below are the results of my measured random data of 1M and 10M:
> 
> 		[root@xxx adler32]# ./benchmark 1000000
> 		Libz alg: Time used:    608 us, 1644.7 Mb/s.
> 		SVE  alg: Time used:    166 us, 6024.1 Mb/s.
> 
> 		[root@xxx adler32]# ./benchmark 10000000
> 		Libz alg: Time used:   6484 us, 1542.3 Mb/s.
> 		SVE  alg: Time used:   2034 us, 4916.4 Mb/s.
> 
> 	The blocks can be of any size, so the algorithm can automatically adapt
> 	to SVE hardware with different bit widths without modifying the code.
> 
> 
> Signed-off-by: liqiang <liqiang64@huawei.com>

Note that this patch does nothing to actually wire up the kernel's copy of libz
(lib/zlib_{deflate,inflate}/) to use this implementation of Adler32.  To do so,
libz would either need to be changed to use the shash API, or you'd need to
implement an adler32() function in lib/crypto/ that automatically uses an
accelerated implementation if available, and make libz call it.
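As a rough userspace sketch of the second option (not kernel code): a single
entry point that uses an accelerated routine when one is registered and
otherwise falls back to portable C.  All names here (adler32_update,
adler32_arch, have_accel, adler32_portable) are hypothetical and are not
existing kernel interfaces; in the kernel the plain flag would more likely be
a CPU feature check or a static key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u	/* largest prime below 2^16 */

/* Portable fallback: reduces after every byte (simple, not fast). */
static uint32_t adler32_portable(uint32_t adler, const uint8_t *p, size_t len)
{
	uint32_t s1 = adler & 0xffff, s2 = adler >> 16;

	while (len--) {
		s1 = (s1 + *p++) % ADLER_MOD;
		s2 = (s2 + s1) % ADLER_MOD;
	}
	return (s2 << 16) | s1;
}

/* Hypothetical hook: an architecture would point this at its SIMD code. */
static uint32_t (*adler32_arch)(uint32_t, const uint8_t *, size_t);

/* Hypothetical flag: stands in for a runtime CPU feature check. */
static bool have_accel;

/* Single entry point that a caller such as libz would use. */
uint32_t adler32_update(uint32_t adler, const uint8_t *p, size_t len)
{
	assert(p != NULL || len == 0);

	if (have_accel && adler32_arch)
		return adler32_arch(adler, p, len);
	return adler32_portable(adler, p, len);
}
```

The caller sees only adler32_update(), so which implementation runs stays an
internal detail.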

Also, in either case a C implementation would be required too.  There can't be
just an architecture-specific implementation.
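
For illustration, a generic C version could look something like the sketch
below.  It follows the usual technique of deferring the modulo: 5552 is the
largest number of bytes that can be accumulated into 32-bit sums before a
reduction modulo 65521 is needed.  The function and macro names are made up
for this example, and it assumes the incoming value is a valid Adler-32 state
(both halves below 65521).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_BASE 65521u	/* largest prime below 2^16 */
#define ADLER_NMAX 5552u	/* max bytes before the 32-bit sums could overflow */

uint32_t adler32_generic(uint32_t adler, const uint8_t *buf, size_t len)
{
	uint32_t s1 = adler & 0xffff;	/* running sum of bytes */
	uint32_t s2 = adler >> 16;	/* running sum of s1 values */

	assert(buf != NULL || len == 0);

	while (len > 0) {
		/* Process at most ADLER_NMAX bytes between reductions. */
		size_t n = len < ADLER_NMAX ? len : ADLER_NMAX;

		len -= n;
		while (n--) {
			s1 += *buf++;
			s2 += s1;
		}
		s1 %= ADLER_BASE;
		s2 %= ADLER_BASE;
	}
	return (s2 << 16) | s1;
}
```

Starting from the initial value 1, the 9 bytes of "Wikipedia" give 0x11E60398,
and feeding the data in pieces gives the same result as a single call.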

Also, as others have pointed out, there's probably not much point in having an
SVE implementation of Adler32 when there isn't even a NEON implementation yet.
It's not too hard to implement Adler32 using NEON, and there are already several
permissively-licensed NEON implementations out there that could be used as a
reference, e.g. my implementation using NEON intrinsics here:
https://github.com/ebiggers/libdeflate/blob/v1.6/lib/arm/adler32_impl.h
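
To show why Adler32 vectorizes at all, here is a scalar sketch of the per-block
arithmetic that a SIMD version would compute in parallel.  It is not taken from
the code linked above.  For a block of n bytes b[0..n-1], s1 grows by the plain
byte sum, and s2 grows by n times the old s1 plus the weighted sum of
(n - i) * b[i].  The block size of 16 and all names are assumptions made for
the example, and the modulo is applied per block only to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_PRIME 65521u
#define BLOCK 16u	/* stands in for the vector width in bytes */

uint32_t adler32_blocked(uint32_t adler, const uint8_t *p, size_t len)
{
	uint32_t s1 = adler & 0xffff, s2 = adler >> 16;

	assert(p != NULL || len == 0);

	while (len >= BLOCK) {
		uint32_t sum = 0, wsum = 0;

		/* These two sums are what a SIMD version computes per vector. */
		for (uint32_t i = 0; i < BLOCK; i++) {
			sum += p[i];
			wsum += (BLOCK - i) * p[i];
		}
		/* s2 must be updated first: it uses the old value of s1. */
		s2 = (s2 + BLOCK * s1 + wsum) % ADLER_PRIME;
		s1 = (s1 + sum) % ADLER_PRIME;
		p += BLOCK;
		len -= BLOCK;
	}
	/* Tail shorter than one block: plain byte-at-a-time update. */
	while (len--) {
		s1 = (s1 + *p++) % ADLER_PRIME;
		s2 = (s2 + s1) % ADLER_PRIME;
	}
	return (s2 << 16) | s1;
}
```

A real NEON or SVE implementation would keep the sums in vector registers and
reduce far less often; only the arithmetic identity is the same.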

- Eric
