public inbox for stable@vger.kernel.org
From: Eric Biggers <ebiggers@kernel.org>
To: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	linux-crypto@vger.kernel.org, qat-linux@intel.com,
	stable@vger.kernel.org
Subject: Re: [PATCH] crypto: qat - lower priority for skcipher and aead algorithms
Date: Mon, 16 Jun 2025 09:47:52 -0700
Message-ID: <20250616164752.GB1373@sol>
In-Reply-To: <aFAyBgwCUN2NLXOE@gcabiddu-mobl.ger.corp.intel.com>

On Mon, Jun 16, 2025 at 04:02:30PM +0100, Giovanni Cabiddu wrote:
> On Mon, Jun 16, 2025 at 12:18:02PM +0800, Herbert Xu wrote:
> > On Fri, Jun 13, 2025 at 11:32:27AM +0100, Giovanni Cabiddu wrote:
> > > Most kernel applications utilizing the crypto API operate synchronously
> > > and on small buffer sizes, therefore do not benefit from QAT acceleration.
> > 
> > So what performance numbers should we be getting with QAT if the
> > buffer sizes were large enough?
> 
> Specifically for AES128-XTS, under optimal conditions, the current
> generation of QAT (GEN4) can achieve approximately 12 GB/s throughput at
> 4KB block sizes using a single device. Systems typically include between
> 1 and 4 QAT devices per socket and each device contains two internal
> engines capable of performing that algorithm.
> 
> This level of performance is observed in userspace, where it is possible
> to (1) batch requests to amortize MMIO overhead (e.g., multiple requests
> per write), (2) submit requests asynchronously, (3) use flat buffers
> instead of scatter-gather lists, and (4) rely on polling rather than
> interrupts.
> 
> However, in the kernel, we are currently unable to keep the accelerator
> sufficiently busy. For example, using a synthetic synchronous and single
> threaded benchmark on a Sapphire Rapids system, with interrupts properly
> affinitized, I observed throughput of around 500 Mbps with 4KB buffers.
> Debugfs statistics (telemetry) indicated that the accelerator was
> utilized at only ~4%.
> 
> Given this, VAES is currently the more suitable choice for kernel use
> cases. The patch to lower the priority of QAT's symmetric crypto
> algorithms reflects this practical reality. The original high priority
> (4001) was set when the driver was first upstreamed in 2014 and had not
> been revisited until now.

For some perspective, encrypting or decrypting 4 KiB messages with AES-128-XTS
serially, I get 18.4 GB/s per thread with the VAES-accelerated code on an Intel
Emerald Rapids processor.  (The code is arch/x86/crypto/aes-xts-avx-x86_64.S,
which I wrote and contributed in Linux 6.10.)  The processor appeared to be
running at about 3.28 GHz.  That's about 5.6 bytes per cycle.
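
As a quick check of that conversion, here is a minimal Python sketch.  The
function name is only illustrative, and it assumes GB/s and GHz are both
decimal (10^9-based), so the two factors cancel:

```python
def bytes_per_cycle(throughput_gb_per_s: float, clock_ghz: float) -> float:
    """Throughput in GB/s divided by clock in GHz gives bytes per cycle."""
    return throughput_gb_per_s / clock_ghz


# Figures from the measurement above: 18.4 GB/s at about 3.28 GHz.
intel_bpc = bytes_per_cycle(18.4, 3.28)
print(f"{intel_bpc:.1f} bytes/cycle")  # prints "5.6 bytes/cycle"
```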

Emerald Rapids processors have 6 to 60 cores per socket.  Even assuming that a
second thread in each core provides no benefit due to competing for the same
core's resources, that would be an AES-128-XTS throughput of 110 to 1100 GB/s.

That's far more than QAT could provide, even under optimal conditions.  And
those conditions do not exist in practice, as QAT is much harder to use than
VAES.
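
To put the per-socket numbers side by side, here is a rough Python sketch.  It
uses only figures quoted in this thread.  The linear scaling across cores and
across QAT devices is an assumption (it ignores memory bandwidth, frequency
changes under load, and the submission overheads discussed above), and the
12 GB/s figure is taken as per-device, as it was stated:

```python
# Figures quoted in this thread; the linear scaling below is an assumption.
VAES_GB_PER_S_PER_CORE = 18.4      # measured, one thread, 4 KiB messages
CORES_PER_SOCKET = (6, 60)         # Emerald Rapids range given above
QAT_GB_PER_S_PER_DEVICE = 12.0     # GEN4 figure quoted for optimal conditions
QAT_DEVICES_PER_SOCKET = (1, 4)    # range quoted earlier in the thread


def socket_range(per_unit_gb_per_s, unit_counts):
    """Scale a per-unit throughput linearly across (min, max) unit counts."""
    low, high = unit_counts
    return per_unit_gb_per_s * low, per_unit_gb_per_s * high


vaes_low, vaes_high = socket_range(VAES_GB_PER_S_PER_CORE, CORES_PER_SOCKET)
qat_low, qat_high = socket_range(QAT_GB_PER_S_PER_DEVICE, QAT_DEVICES_PER_SOCKET)

print(f"VAES: {vaes_low:.0f} to {vaes_high:.0f} GB/s per socket")
print(f"QAT:  {qat_low:.0f} to {qat_high:.0f} GB/s per socket")
print(f"Smallest VAES socket vs. largest QAT config: {vaes_low / qat_high:.1f}x")
```

Under these assumptions, even the 6-core case comes out ahead of four QAT
devices running at their quoted best-case rate.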

FWIW, on an AMD EPYC 9B45 (Zen 5 / Turin) server processor, I get 35.2 GB/s.
This processor appeared to run at about 4.15 GHz, so that's about 8.5 bytes per
cycle.  That's 51% more bytes per cycle than Intel.  This shows that there is
still room for improvement in VAES, even when it's already much better than QAT.
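
The per-cycle comparison can be reproduced with a short Python sketch.  The
helper names are only illustrative, and the clock speeds are the approximate
observed values given above, so the result is approximate too:

```python
def bytes_per_cycle(throughput_gb_per_s: float, clock_ghz: float) -> float:
    """Throughput in GB/s divided by clock in GHz gives bytes per cycle."""
    return throughput_gb_per_s / clock_ghz


def percent_more(a: float, b: float) -> float:
    """How much larger a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


intel_bpc = bytes_per_cycle(18.4, 3.28)  # Emerald Rapids figures from above
amd_bpc = bytes_per_cycle(35.2, 4.15)    # EPYC 9B45 figures from above

print(f"Intel: {intel_bpc:.1f} bytes/cycle")
print(f"AMD:   {amd_bpc:.1f} bytes/cycle")
print(f"AMD advantage: {percent_more(amd_bpc, intel_bpc):.0f}%")
```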

It's unclear why Intel's efforts seem to be focused on QAT instead of VAES.

- Eric
