linux-crypto.vger.kernel.org archive mirror
* [PATCH 0/6] x86: new optimized CRC functions, with VPCLMULQDQ support
From: Eric Biggers @ 2024-11-25  4:11 UTC
  To: linux-kernel; +Cc: linux-crypto, x86, Ard Biesheuvel

This patchset is also available in git via:

    git fetch https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git crc-x86-v1

This patchset applies on top of my other recent CRC patchsets
https://lore.kernel.org/r/20241103223154.136127-1-ebiggers@kernel.org/ and
https://lore.kernel.org/r/20241117002244.105200-1-ebiggers@kernel.org/ .
Consider it a preview for what may be coming next, as my priority is
getting those two other patchsets merged first.

This patchset adds a new assembly macro that expands into the body of a
CRC function for x86 for the specified number of bits, bit order, vector
length, and AVX level.  There's also a new script that generates the
constants needed by this function, given a CRC generator polynomial.
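As a rough illustration of what "constants from a generator polynomial"
means: carryless-multiplication CRC folding generally relies on values of
the form x^n mod G(x) over GF(2). The sketch below is not taken from
gen-crc-consts.py; the function name, the choice of exponents, and the
most-significant-bit-first representation are all assumptions made for
illustration, and the real script's bit ordering and constant layout may
differ.

```python
# Hypothetical helper, not the code from gen-crc-consts.py.
# Polynomials are Python ints with bit i holding the coefficient of x^i.

def xpow_mod(n, poly, deg):
    """Return x^n mod G(x) over GF(2).

    `poly` is the full generator polynomial including its x^deg term,
    e.g. 0x104C11DB7 for the CRC-32 generator (deg = 32).
    """
    r = 1  # the polynomial "1", i.e. x^0
    for _ in range(n):
        r <<= 1                # multiply by x
        if (r >> deg) & 1:     # degree reached deg: reduce once
            r ^= poly
    return r


CRC32_POLY = 0x104C11DB7   # x^32 + x^26 + ... + 1
T10DIF_POLY = 0x18BB7      # x^16 + x^15 + x^11 + ... + 1

# Assumed example distances (in bits) for folding 128-bit lanes; the
# exponents the real template needs are not stated in this cover letter.
for bits in (64, 128, 512):
    print(f"x^{bits} mod G = {xpow_mod(bits, CRC32_POLY, 32):#010x}")
```

The loop is linear in n, which is adequate for the small exponents
involved; a square-and-multiply version would be the obvious refinement
if much larger exponents were needed.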

This approach allows easily wiring up an x86-optimized implementation of
any variant of CRC-8, CRC-16, CRC-32, or CRC-64, including full support
for VPCLMULQDQ.  On long messages the resulting functions are up to 4x
faster than the existing PCLMULQDQ-optimized functions, where those exist,
or up to 29x faster than the existing table-based functions.
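For readers unfamiliar with the baseline being compared against, a
byte-at-a-time table-driven CRC-32 can be sketched as follows. This is a
generic textbook version, not the kernel's table-based code. It assumes
the zlib convention of inverting the CRC on entry and exit, which the
kernel's crc32_le() leaves to the caller, and the function names are
made up for this example.

```python
import zlib

CRC32_POLY_REFLECTED = 0xEDB88320  # CRC-32 generator, bit-reflected


def make_table(poly):
    """Build the 256-entry table for a reflected (LSB-first) CRC."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            # Shift out one bit; XOR in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
        table.append(crc)
    return table


TABLE = make_table(CRC32_POLY_REFLECTED)


def crc32_le_table(data, crc=0):
    """One table lookup per input byte (zlib-style pre/post inversion)."""
    crc ^= 0xFFFFFFFF
    for b in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ b) & 0xFF]
    return crc ^ 0xFFFFFFFF


# Cross-check against the standard library's implementation.
assert crc32_le_table(b"123456789") == zlib.crc32(b"123456789")
```

Each input byte costs a dependent table lookup, which is the serial
bottleneck that carryless-multiply folding avoids by processing many
bytes per instruction.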

This patchset starts by wiring up the new macro for crc32_le,
crc_t10dif, and crc32_be.  Later I'd also like to wire up crc64_be and
crc64_rocksoft, once the design of the library functions for those has
been fixed to be like what I'm doing for crc32* and crc_t10dif.

A similar approach of sharing code between CRC variants, and vector
lengths when applicable, should work for other architectures.  The CRC
constant generation script should be mostly reusable.

Eric Biggers (6):
  x86: move zmm exclusion list into CPU feature flag
  scripts/crc: add gen-crc-consts.py
  x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
  x86/crc32: implement crc32_le using new template
  x86/crc-t10dif: implement crc_t10dif using new template
  x86/crc32: implement crc32_be using new template

 arch/x86/Kconfig                        |   2 +-
 arch/x86/crypto/aesni-intel_glue.c      |  22 +-
 arch/x86/include/asm/cpufeatures.h      |   1 +
 arch/x86/kernel/cpu/intel.c             |  22 +
 arch/x86/lib/Makefile                   |   2 +-
 arch/x86/lib/crc-pclmul-consts.h        | 148 ++++++
 arch/x86/lib/crc-pclmul-template-glue.h |  84 ++++
 arch/x86/lib/crc-pclmul-template.S      | 588 ++++++++++++++++++++++++
 arch/x86/lib/crc-t10dif-glue.c          |  22 +-
 arch/x86/lib/crc16-msb-pclmul.S         |   6 +
 arch/x86/lib/crc32-glue.c               |  38 +-
 arch/x86/lib/crc32-pclmul.S             | 220 +--------
 arch/x86/lib/crct10dif-pcl-asm_64.S     | 332 -------------
 scripts/crc/gen-crc-consts.py           | 207 +++++++++
 14 files changed, 1087 insertions(+), 607 deletions(-)
 create mode 100644 arch/x86/lib/crc-pclmul-consts.h
 create mode 100644 arch/x86/lib/crc-pclmul-template-glue.h
 create mode 100644 arch/x86/lib/crc-pclmul-template.S
 create mode 100644 arch/x86/lib/crc16-msb-pclmul.S
 delete mode 100644 arch/x86/lib/crct10dif-pcl-asm_64.S
 create mode 100755 scripts/crc/gen-crc-consts.py

-- 
2.47.0



Thread overview (15+ messages):
2024-11-25  4:11 [PATCH 0/6] x86: new optimized CRC functions, with VPCLMULQDQ support Eric Biggers
2024-11-25  4:11 ` [PATCH 1/6] x86: move zmm exclusion list into CPU feature flag Eric Biggers
2024-11-25  8:33   ` Ingo Molnar
2024-11-25 18:08     ` Eric Biggers
2024-11-25 20:25       ` Ingo Molnar
2024-11-25  4:11 ` [PATCH 2/6] scripts/crc: add gen-crc-consts.py Eric Biggers
2024-11-29 16:09   ` Ard Biesheuvel
2024-11-29 17:47     ` Eric Biggers
2024-11-29 18:33       ` Ard Biesheuvel
2024-11-25  4:11 ` [PATCH 3/6] x86/crc: add "template" for [V]PCLMULQDQ based CRC functions Eric Biggers
2024-11-25  4:11 ` [PATCH 4/6] x86/crc32: implement crc32_le using new template Eric Biggers
2024-11-25  4:11 ` [PATCH 5/6] x86/crc-t10dif: implement crc_t10dif " Eric Biggers
2024-11-25  4:11 ` [PATCH 6/6] x86/crc32: implement crc32_be " Eric Biggers
2024-11-29 16:16 ` [PATCH 0/6] x86: new optimized CRC functions, with VPCLMULQDQ support Ard Biesheuvel
2024-11-29 17:50   ` Eric Biggers
