linux-scsi.vger.kernel.org archive mirror
* [PATCH v2 0/4] Patchset to use PCLMULQDQ to accelerate CRC-T10DIF checksum computation
@ 2013-04-17 16:12 Tim Chen
  2013-04-17 16:12 ` [PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework Tim Chen
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Tim Chen @ 2013-04-17 16:12 UTC
  To: Herbert Xu, H. Peter Anvin, David S. Miller, Martin K. Petersen,
	James Bottomley
  Cc: Tim Chen, Matthew Wilcox, Jim Kukunas, Keith Busch, Erdinc Ozturk,
	Vinodh Gopal, James Guilford, Wajdi Feghali, Jussi Kivilinna,
	linux-kernel, linux-crypto, linux-scsi

Currently the CRC-T10DIF checksum is computed using a generic table lookup
algorithm.  By switching the checksum to PCLMULQDQ based computation,
we can speed up the computation by 8x for checksumming 512 bytes and
even more for larger buffer sizes.  This will improve the performance of
SCSI drivers that turn on the CRC-T10DIF checksum.  In our SSD based
experiments, we have seen disk throughput increase by 3.5x with T10DIF
for a 512 byte block size.
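
For readers unfamiliar with the baseline, the following is a minimal
user-space sketch of a byte-at-a-time table lookup CRC-T10DIF.  It is not
the code from lib/crc-t10dif.c; the names t10dif_table, t10dif_build_table
and crc_t10dif_sketch are hypothetical, and the table is built at run time
here only to keep the example short.  It assumes the usual CRC-T10DIF
parameters: polynomial 0x8BB7, initial value 0, no bit reflection.

```c
#include <stddef.h>
#include <stdint.h>

#define T10DIF_POLY 0x8BB7u	/* CRC-T10DIF generator polynomial */

/* Hypothetical names: lookup table and a flag saying it has been filled. */
static uint16_t t10dif_table[256];
static int t10dif_table_ready;

/* Precompute the CRC of every possible top byte, one bit at a time. */
static void t10dif_build_table(void)
{
	for (unsigned int i = 0; i < 256; i++) {
		uint16_t crc = (uint16_t)(i << 8);

		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000u)
				? (uint16_t)((crc << 1) ^ T10DIF_POLY)
				: (uint16_t)(crc << 1);
		t10dif_table[i] = crc;
	}
	t10dif_table_ready = 1;
}

/* One table lookup per input byte, most significant bit first. */
uint16_t crc_t10dif_sketch(const unsigned char *buf, size_t len)
{
	uint16_t crc = 0;

	if (!t10dif_table_ready)
		t10dif_build_table();

	for (size_t i = 0; i < len; i++)
		crc = (uint16_t)((crc << 8) ^
				 t10dif_table[(crc >> 8) ^ buf[i]]);
	return crc;
}
```

Each byte costs one dependent table load, which is the serial bottleneck
that a carry-less multiply based routine avoids by folding many bytes per
step.  With the parameters assumed above, the commonly published check
value for the nine ASCII bytes "123456789" is 0xD0DB.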

This patch set provides the x86_64 routine using PCLMULQDQ instruction
and switches the crc_t10dif library function to use the faster PCLMULQDQ
based routine when available.
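
The "use the faster routine when available" idea can be illustrated with
a plain function pointer.  This is only a sketch under stated assumptions,
not the mechanism of this patch set, which goes through the crypto
transform framework (see patch 1/4).  All names below (crc_t10dif_fn,
crc_t10dif_bitwise, crc_t10dif_select, crc_t10dif_dispatch) are
hypothetical, and the CPU feature check is reduced to an integer flag
supplied by the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature shared by every CRC-T10DIF implementation. */
typedef uint16_t (*crc_t10dif_fn)(uint16_t crc, const unsigned char *buf,
				  size_t len);

/* Portable fallback: bit-at-a-time, polynomial 0x8BB7, MSB first. */
static uint16_t crc_t10dif_bitwise(uint16_t crc, const unsigned char *buf,
				   size_t len)
{
	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)(buf[i] << 8);
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000u)
				? (uint16_t)((crc << 1) ^ 0x8BB7u)
				: (uint16_t)(crc << 1);
	}
	return crc;
}

/* Currently selected implementation; defaults to the fallback. */
static crc_t10dif_fn crc_t10dif_impl = crc_t10dif_bitwise;

/*
 * Pick the accelerated routine only if one was supplied and the caller
 * reports that the CPU feature is present; otherwise keep the fallback.
 */
void crc_t10dif_select(crc_t10dif_fn accel, int cpu_has_feature)
{
	crc_t10dif_impl = (accel && cpu_has_feature) ? accel
						     : crc_t10dif_bitwise;
}

/* Callers use this entry point and never see which routine runs. */
uint16_t crc_t10dif_dispatch(const unsigned char *buf, size_t len)
{
	return crc_t10dif_impl(0, buf, len);
}
```

Selecting once at initialization keeps the per-call cost to a single
indirect call, and callers get the same result whichever routine is
chosen.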

Tim

v1->v2
1. Remove the unnecessary save and restore of xmm registers and fix the
ENDPROC position in the PCLMULQDQ version of the crc t10dif computation.
2. Fix the URL of the referenced paper on CRC computation with PCLMULQDQ.
3. Add one additional tcrypt test case to exercise more code paths through
the crc t10dif computation.
4. Fix the config dependencies of CRYPTO_CRCT10DIF.

Thanks to Matthew and Jussi who reviewed the patches and Keith
for testing version 1 of the patch set.  

Tim Chen (4):
  Wrap crc_t10dif function all to use crypto transform framework
  Accelerated CRC T10 DIF computation with PCLMULQDQ instruction
  Glue code to cast accelerated CRCT10DIF assembly as a crypto
    transform
  Simple correctness and speed test for CRCT10DIF hash

 arch/x86/crypto/Makefile                |   2 +
 arch/x86/crypto/crct10dif-pcl-asm_64.S  | 643 ++++++++++++++++++++++++++++++++
 arch/x86/crypto/crct10dif-pclmul_glue.c | 153 ++++++++
 crypto/Kconfig                          |  21 ++
 crypto/tcrypt.c                         |   8 +
 crypto/testmgr.c                        |  10 +
 crypto/testmgr.h                        |  33 ++
 include/linux/crc-t10dif.h              |  10 +
 lib/crc-t10dif.c                        |  96 +++++
 9 files changed, 976 insertions(+)
 create mode 100644 arch/x86/crypto/crct10dif-pcl-asm_64.S
 create mode 100644 arch/x86/crypto/crct10dif-pclmul_glue.c

-- 
1.7.11.7


Thread overview: 13+ messages
2013-04-17 16:12 [PATCH v2 0/4] Patchset to use PCLMULQDQ to accelerate CRC-T10DIF checksum computation Tim Chen
2013-04-17 16:12 ` [PATCH v2 1/4] Wrap crc_t10dif function all to use crypto transform framework Tim Chen
2013-04-25 13:22   ` Herbert Xu
2013-04-25 17:28     ` Tim Chen
2013-04-26 12:52       ` Herbert Xu
2013-04-26 16:44         ` Tim Chen
2013-04-28  0:11           ` Herbert Xu
2013-04-29 20:40             ` Tim Chen
2013-04-30  3:27               ` Herbert Xu
2013-05-02  2:59                 ` Tim Chen
2013-04-17 16:12 ` [PATCH v2 2/4] Accelerated CRC T10 DIF computation with PCLMULQDQ instruction Tim Chen
2013-04-17 16:12 ` [PATCH v2 3/4] Glue code to cast accelerated CRCT10DIF assembly as a crypto transform Tim Chen
2013-04-17 16:12 ` [PATCH v2 4/4] Simple correctness and speed test for CRCT10DIF hash Tim Chen
