linux-crypto.vger.kernel.org archive mirror
* [PATCH v3 0/6] x86 CRC optimizations
From: Eric Biggers @ 2025-02-06  7:39 UTC
  To: linux-kernel
  Cc: linux-crypto, x86, linux-block, Ard Biesheuvel, Keith Busch,
	Kent Overstreet, Martin K . Petersen

This patchset applies to the crc tree and is also available at:

    git fetch https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git crc-x86-v3

This series replaces the existing x86 PCLMULQDQ optimized CRC code with
new code that is shared among the different CRC variants and also adds
VPCLMULQDQ support, greatly improving performance on recent CPUs.  The
last patch wires up the same optimization to crc64_be() and crc64_nvme()
(a.k.a. the old "crc64_rocksoft") which previously were unoptimized,
improving the performance of those CRC functions by as much as 100x.
crc64_be is used by bcachefs, and crc64_nvme is used by blk-integrity.
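
For readers unfamiliar with the two bit orderings involved, the following is a minimal bit-at-a-time reference sketch, not the code from this series. The helper names (crc_msb, crc_lsb, crc64_be_ref, crc64_nvme_ref) are hypothetical. The polynomial constants and the choice to invert the CRC on entry and exit only for the NVMe variant are assumptions based on the generic CRC64 library, so check them against lib/crc64.c before relying on them.

```python
import binascii
import zlib

MASK64 = (1 << 64) - 1

# Assumed polynomial values (x^64 term omitted); verify against lib/crc64.c.
CRC64_ECMA182_POLY = 0x42F0E1EBA9EA3693  # natural bit order, msb-first
CRC64_NVME_RPOLY = 0x9A6C9329AC4BC9B5    # bit-reflected, lsb-first


def crc_msb(data: bytes, width: int, poly: int, crc: int = 0) -> int:
    """Bit-at-a-time msb-first CRC. poly omits the x^width term."""
    top = 1 << (width - 1)
    mask = (1 << width) - 1
    for byte in data:
        crc ^= byte << (width - 8)  # feed the byte into the high-order bits
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & mask if crc & top else (crc << 1) & mask
    return crc


def crc_lsb(data: bytes, rpoly: int, crc: int = 0) -> int:
    """Bit-at-a-time lsb-first CRC. rpoly is the bit-reflected polynomial."""
    for byte in data:
        crc ^= byte  # feed the byte into the low-order bits
        for _ in range(8):
            crc = (crc >> 1) ^ rpoly if crc & 1 else crc >> 1
    return crc


def crc64_be_ref(data: bytes, crc: int = 0) -> int:
    # Assumption: msb-first, no inversion of the CRC value.
    return crc_msb(data, 64, CRC64_ECMA182_POLY, crc)


def crc64_nvme_ref(data: bytes, crc: int = 0) -> int:
    # Assumption: lsb-first, CRC inverted on entry and on exit.
    return crc_lsb(data, CRC64_NVME_RPOLY, crc ^ MASK64) ^ MASK64


if __name__ == "__main__":
    msg = b"123456789"
    # Cross-check the generic loops against the Python standard library.
    assert crc_lsb(msg, 0xEDB88320, 0xFFFFFFFF) ^ 0xFFFFFFFF == zlib.crc32(msg)
    assert crc_msb(msg, 16, 0x1021) == binascii.crc_hqx(msg, 0)
    print(hex(crc64_be_ref(msg)), hex(crc64_nvme_ref(msg)))
```

A loop like this handles one bit per iteration. The [V]PCLMULQDQ approach instead folds 16 or more bytes per carryless multiplication, which is why such large speedups over unoptimized code are possible.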

Changed in v3:
- It's back to just the x86 patches now, since I've applied the CRC64
  library rework patches.
- Added review and ack tags.
- Made more improvements to crc-pclmul-template.S and gen-crc-consts.py,
  such as improving the comments that explain some of the steps,
  tweaking the exact choice of constants in certain cases where more
  than one is equivalent, sharing a bit more of the source code between
  lsb and msb-first CRCs, and eliminating an unnecessary instruction.
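
As background on the kind of constants involved: carryless-multiplication CRC code generally relies on constants of the form x^n mod G, where G is the generator polynomial. The sketch below is not taken from gen-crc-consts.py. The helper names (clmul, polymod, x_pow_mod) are hypothetical, the exponent 128 is picked only for illustration, and polynomials are kept in natural bit order, whereas the real constants for lsb-first CRCs would be bit-reflected. It checks the identity that makes folding work: multiplying the high part of a message by x^k mod G instead of x^k leaves the remainder mod G unchanged.

```python
def clmul(a: int, b: int) -> int:
    """Carryless (GF(2)) multiplication of two polynomials stored as ints."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def polymod(a: int, g: int) -> int:
    """Remainder of a divided by g over GF(2). g includes its top term."""
    glen = g.bit_length()
    while a.bit_length() >= glen:
        a ^= g << (a.bit_length() - glen)
    return a


def x_pow_mod(n: int, g: int) -> int:
    """The constant x^n mod g."""
    return polymod(1 << n, g)


if __name__ == "__main__":
    G = 0x104C11DB7  # CRC-32 generator, including the x^32 term
    hi, lo, k = 0x0123456789ABCDEF, 0xFEDCBA9876543210, 128

    direct = polymod((hi << k) ^ lo, G)                   # (hi * x^k + lo) mod G
    folded = polymod(clmul(hi, x_pow_mod(k, G)) ^ lo, G)  # same value via the constant
    assert direct == folded
```

Because the folded value is congruent to the original, a long message can be reduced a vector at a time while only short operands are ever multiplied.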

Changed in v2:
- Rebased onto upstream
- Added CRC64 library rework patches
- Capitalized YMM and ZMM
- Moved gen-crc-consts.py from scripts/crc/ to just scripts/
- Renamed crc-pclmul-template-glue.h to just crc-pclmul-template.h
- The asm functions that use longer vectors no longer tail-call the ones
  that use shorter vectors in order to handle short lengths.  Each
  function now handles all lengths >= 16 bytes directly.
- Made various other improvements to crc-pclmul-template.S and
  gen-crc-consts.py
- It's 2025 now; updated the copyright statements
- Improved commit messages
- Added ack tags

Eric Biggers (6):
  x86: move ZMM exclusion list into CPU feature flag
  scripts/gen-crc-consts: add gen-crc-consts.py
  x86/crc: add "template" for [V]PCLMULQDQ based CRC functions
  x86/crc32: implement crc32_le using new template
  x86/crc-t10dif: implement crc_t10dif using new template
  x86/crc64: implement crc64_be and crc64_nvme using new template

 MAINTAINERS                         |   1 +
 arch/x86/Kconfig                    |   3 +-
 arch/x86/crypto/aesni-intel_glue.c  |  22 +-
 arch/x86/include/asm/cpufeatures.h  |   1 +
 arch/x86/kernel/cpu/intel.c         |  22 ++
 arch/x86/lib/Makefile               |   5 +-
 arch/x86/lib/crc-pclmul-consts.h    | 195 ++++++++++
 arch/x86/lib/crc-pclmul-template.S  | 584 ++++++++++++++++++++++++++++
 arch/x86/lib/crc-pclmul-template.h  |  81 ++++
 arch/x86/lib/crc-t10dif-glue.c      |  23 +-
 arch/x86/lib/crc16-msb-pclmul.S     |   6 +
 arch/x86/lib/crc32-glue.c           |  37 +-
 arch/x86/lib/crc32-pclmul.S         | 219 +----------
 arch/x86/lib/crc64-glue.c           |  50 +++
 arch/x86/lib/crc64-pclmul.S         |   7 +
 arch/x86/lib/crct10dif-pcl-asm_64.S | 332 ----------------
 scripts/gen-crc-consts.py           | 239 ++++++++++++
 17 files changed, 1214 insertions(+), 613 deletions(-)
 create mode 100644 arch/x86/lib/crc-pclmul-consts.h
 create mode 100644 arch/x86/lib/crc-pclmul-template.S
 create mode 100644 arch/x86/lib/crc-pclmul-template.h
 create mode 100644 arch/x86/lib/crc16-msb-pclmul.S
 create mode 100644 arch/x86/lib/crc64-glue.c
 create mode 100644 arch/x86/lib/crc64-pclmul.S
 delete mode 100644 arch/x86/lib/crct10dif-pcl-asm_64.S
 create mode 100755 scripts/gen-crc-consts.py


base-commit: 5b793bbee96c666ca14db8409509abd73a3e0130
-- 
2.48.1



Thread overview: 11+ messages
2025-02-06  7:39 [PATCH v3 0/6] x86 CRC optimizations Eric Biggers
2025-02-06  7:39 ` [PATCH v3 1/6] x86: move ZMM exclusion list into CPU feature flag Eric Biggers
2025-02-06  7:39 ` [PATCH v3 2/6] scripts/gen-crc-consts: add gen-crc-consts.py Eric Biggers
2025-02-06 19:31   ` David Laight
2025-02-06 20:08     ` Eric Biggers
2025-02-06 22:28       ` David Laight
2025-02-06 23:41         ` Eric Biggers
2025-02-06  7:39 ` [PATCH v3 3/6] x86/crc: add "template" for [V]PCLMULQDQ based CRC functions Eric Biggers
2025-02-06  7:39 ` [PATCH v3 4/6] x86/crc32: implement crc32_le using new template Eric Biggers
2025-02-06  7:39 ` [PATCH v3 5/6] x86/crc-t10dif: implement crc_t10dif " Eric Biggers
2025-02-06  7:39 ` [PATCH v3 6/6] x86/crc64: implement crc64_be and crc64_nvme " Eric Biggers
