From: Christoph Hellwig <hch@infradead.org>
To: Simon Richter <Simon.Richter@hogyros.de>
Cc: Christoph Hellwig <hch@infradead.org>,
Jani Partanen <jiipee@sotapeli.fi>,
Giovanni Cabiddu <giovanni.cabiddu@intel.com>,
clm@fb.com, dsterba@suse.com, terrelln@fb.com,
herbert@gondor.apana.org.au, linux-btrfs@vger.kernel.org,
linux-crypto@vger.kernel.org, qat-linux@intel.com, cyan@meta.com,
brian.will@intel.com, weigang.li@intel.com,
senozhatsky@chromium.org
Subject: Re: [RFC PATCH 00/16] btrfs: offload compression to hardware accelerators
Date: Thu, 4 Dec 2025 02:06:35 -0800
Message-ID: <aTFdK1QU6Q-GPZe4@infradead.org>
In-Reply-To: <9d7b182e-9da7-458f-b913-14eee415359d@hogyros.de>
On Wed, Dec 03, 2025 at 07:47:11PM +0900, Simon Richter wrote:
> Unpacking is quite a bit faster as well, to the point where unpacking the
> compressed block of 4GiB NUL bytes is faster than reading 4 GiB from
> /dev/zero for me.
Which makes me wonder why Intel isn't showing decompression numbers.
For file system workloads those generally are much more common, and
they are generally synchronous while writes often are not, and
compressible ones should be even less so.
> For acomp, I pretty much always expect offloading to be worth the overhead
> if hardware is available, simply because working with bitstreams is awkward
> on any architecture that isn't specifically designed for it, and when an
> algorithm requires building a dictionary, gathering statistics and two-pass
> processing, that becomes even more visible.
I would be really surprised if it makes sense for just a few kilobytes,
e.g. a single compressible btrfs extent. I'd love to see numbers
proving me wrong, though.
> For ahash/acrypt, there is a trade-off here, and where it is depends on CPU
> features, the overhead of offloading, the overhead of receiving the result,
> and how much of that overhead can be mitigated by submitting a batch of
> operations.
>
> For the latter, we also need a better submission interface that actually
> allows large batches, and submitters to use that.
For acrypt Eric has shown pretty devastating numbers for offloads, which
doesn't surprise me at all given how well modern CPUs handle the
low-level building blocks of cryptographic algorithms.
> As an example of interface pain points: ahash has synchronous import/export
> functions, and no way for the driver to indicate that the result buffer must
> be reachable by DMA as well, so even with a mailbox interface that allows me
> to submit operations with low overhead, I need to synthesize state readbacks
> into an auxiliary buffer and request an interrupt to be delivered after each
> "update" operation, simply so I can have the state available in case it is
> requested, while normally I would only generate an interrupt after an
> "export" or "final" operation is completed (and also rate-limit these).
Which brings me to my previous point: ahash from the looks of it
just looks like a pretty horrible interface. So someone really needs
to come up with an easy to use interface that covers both hardware and
software needs. Note that on the software side offloading to multiple
other CPU cores would be a natural fit and make it look a lot like an
async hardware offload. You'd need to make it use the correct data
structures, e.g. bio_vecs provided for source and destination instead
of scatterlists, and clear definitions of addressability. Bonus points
for supporting PCIe P2P transfers, which seems like a very natural fit
here.
Thread overview: 42+ messages
2025-11-28 19:04 [RFC PATCH 00/16] btrfs: offload compression to hardware accelerators Giovanni Cabiddu
2025-11-28 14:25 ` Giovanni Cabiddu
2025-11-28 16:13 ` Chris Mason
2025-11-28 19:04 ` [RFC PATCH 01/16] crypto: zstd - fix double-free in per-CPU stream cleanup Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 02/16] Revert "crypto: qat - remove unused macros in qat_comp_alg.c" Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 03/16] Revert "crypto: qat - Remove zlib-deflate" Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 04/16] crypto: qat - use memcpy_*_sglist() in zlib deflate Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 05/16] Revert "crypto: testmgr - Remove zlib-deflate" Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 06/16] crypto: deflate - add support for deflate rfc1950 (zlib) Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 07/16] crypto: scomp - Add setparam interface Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 08/16] crypto: acomp " Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 09/16] crypto: acomp - Add comp_params helpers Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 10/16] crypto: acomp - add NUMA-aware stream allocation Giovanni Cabiddu
2025-11-28 19:04 ` [RFC PATCH 11/16] crypto: deflate - add support for compression levels Giovanni Cabiddu
2025-11-28 19:05 ` [RFC PATCH 12/16] crypto: zstd " Giovanni Cabiddu
2025-11-28 19:05 ` [RFC PATCH 13/16] crypto: qat - increase number of preallocated sgl descriptors Giovanni Cabiddu
2025-11-28 19:05 ` [RFC PATCH 14/16] crypto: qat - add support for zstd Giovanni Cabiddu
2025-11-28 19:05 ` [RFC PATCH 15/16] crypto: qat - add support for compression levels Giovanni Cabiddu
2025-11-28 19:05 ` [RFC PATCH 16/16] btrfs: add compression hw-accelerated offload Giovanni Cabiddu
2025-11-28 21:55 ` Qu Wenruo
2025-11-28 22:40 ` Giovanni Cabiddu
2025-11-28 23:59 ` Qu Wenruo
2025-11-29 0:23 ` Qu Wenruo
2025-12-01 14:32 ` Giovanni Cabiddu
2025-12-01 15:10 ` Giovanni Cabiddu
2025-12-01 20:57 ` Qu Wenruo
2025-12-01 22:18 ` Giovanni Cabiddu
2025-12-01 23:13 ` Qu Wenruo
2025-12-02 17:09 ` Giovanni Cabiddu
2025-12-02 20:38 ` Qu Wenruo
2025-12-02 22:37 ` Giovanni Cabiddu
2025-12-02 22:59 ` Qu Wenruo
2025-11-29 1:00 ` David Sterba
2025-11-29 1:08 ` David Sterba
2025-12-02 7:53 ` [RFC PATCH 00/16] btrfs: offload compression to hardware accelerators Christoph Hellwig
2025-12-02 15:46 ` Jani Partanen
2025-12-02 17:19 ` Giovanni Cabiddu
2025-12-03 7:00 ` Christoph Hellwig
2025-12-03 10:15 ` Giovanni Cabiddu
2025-12-04 9:59 ` Christoph Hellwig
2025-12-03 10:47 ` Simon Richter
2025-12-04 10:06 ` Christoph Hellwig [this message]