From: Johannes Weiner <hannes@cmpxchg.org>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
nphamcs@gmail.com, chengming.zhou@linux.dev,
usamaarif642@gmail.com, ryan.roberts@arm.com,
ying.huang@intel.com, 21cnbao@gmail.com,
akpm@linux-foundation.org, linux-crypto@vger.kernel.org,
herbert@gondor.apana.org.au, davem@davemloft.net,
clabbe@baylibre.com, ardb@kernel.org, ebiggers@google.com,
surenb@google.com, kristen.c.accardi@intel.com,
wajdi.k.feghali@intel.com, vinodh.gopal@intel.com
Subject: Re: [PATCH v4 10/10] mm: zswap: Compress batching with Intel IAA in zswap_batch_store() of large folios.
Date: Mon, 25 Nov 2024 16:47:24 -0500 [thread overview]
Message-ID: <20241125214724.GA2405574@cmpxchg.org> (raw)
In-Reply-To: <CAJD7tkb0WyLD3hxQ5fHWHogyW5g+eF+GrR15r0PjK9YbFO3szg@mail.gmail.com>
On Mon, Nov 25, 2024 at 12:20:01PM -0800, Yosry Ahmed wrote:
> On Fri, Nov 22, 2024 at 11:01 PM Kanchana P Sridhar
> <kanchana.p.sridhar@intel.com> wrote:
> >
> > This patch adds two new zswap API:
> >
> > 1) bool zswap_can_batch(void);
> > 2) void zswap_batch_store(struct folio_batch *batch, int *errors);
> >
> > Higher level mm code, for instance, swap_writepage(), can query if the
> > current zswap pool supports batching, by calling zswap_can_batch(). If so
> > it can invoke zswap_batch_store() to swapout a large folio much more
> > efficiently to zswap, instead of calling zswap_store().
> >
> > Hence, on systems with Intel IAA hardware compress/decompress accelerators,
> > swap_writepage() will invoke zswap_batch_store() for large folios.
> >
> > zswap_batch_store() will call crypto_acomp_batch_compress() to compress up
> > to SWAP_CRYPTO_BATCH_SIZE (i.e. 8) pages in large folios in parallel using
> > the multiple compress engines available in IAA.
> >
> > On platforms with multiple IAA devices per package, compress jobs from all
> > cores in a package will be distributed among all IAA devices in the package
> > by the iaa_crypto driver.
> >
> > The newly added zswap_batch_store() follows the general structure of
> > zswap_store(). Some amount of restructuring and optimization is done to
> > minimize failure points for a batch, fail early and maximize the zswap
> > store pipeline occupancy with SWAP_CRYPTO_BATCH_SIZE pages, potentially
> > from multiple folios in future. This is intended to maximize reclaim
> > throughput with the IAA hardware parallel compressions.
> >
> > Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
> > Suggested-by: Yosry Ahmed <yosryahmed@google.com>
>
> This is definitely not what I suggested :)
>
> I won't speak for Johannes here but I suspect it's not quite what he
> wanted either.
It is not.
I suggested having an integrated code path where "legacy" stores of
single pages is just the batch_size=1 case.
https://lore.kernel.org/linux-mm/20241107185340.GG1172372@cmpxchg.org/
> What we really need to do (and I suppose what Johannes meant, but
> please correct me if I am wrong), is to make the existing flow work
> with batches.
>
> For example, most of zswap_store() should remain the same. It is still
> getting a folio to compress, the only difference is that we will
> parallelize the page compressions. zswap_store_page() is where some
> changes need to be made. Instead of a single function that handles the
> storage of each page, we need a vectorized function that handles the
> storage of N pages in a folio (allocate zswap_entry's, do xarray
> insertions, etc). This should be refactoring in a separate patch.
>
> Once we have that, the logic introduced by this patch should really be
> mostly limited to zswap_compress(), where the acomp interfacing would
> be different based on whether batching is supported or not. This could
> be changes in zswap_compress() itself, or maybe at this point we can
> have a completely different path (e.g. zswap_compress_batch()). But
> outside of that, I don't see why we should have a completely different
> store path for the batching.
+1
next prev parent reply other threads:[~2024-11-25 21:47 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-23 7:01 [PATCH v4 00/10] zswap IAA compress batching Kanchana P Sridhar
2024-11-23 7:01 ` [PATCH v4 01/10] crypto: acomp - Define two new interfaces for compress/decompress batching Kanchana P Sridhar
2024-11-25 9:35 ` Herbert Xu
2024-11-25 20:03 ` Sridhar, Kanchana P
2024-11-26 2:13 ` Sridhar, Kanchana P
2024-11-26 2:14 ` Herbert Xu
2024-11-26 2:37 ` Sridhar, Kanchana P
2024-11-27 1:22 ` Sridhar, Kanchana P
2024-11-27 5:04 ` Herbert Xu
2024-11-23 7:01 ` [PATCH v4 02/10] crypto: iaa - Add an acomp_req flag CRYPTO_ACOMP_REQ_POLL to enable async mode Kanchana P Sridhar
2024-11-23 7:01 ` [PATCH v4 03/10] crypto: iaa - Implement batch_compress(), batch_decompress() API in iaa_crypto Kanchana P Sridhar
2024-11-26 7:05 ` kernel test robot
2024-11-23 7:01 ` [PATCH v4 04/10] crypto: iaa - Make async mode the default Kanchana P Sridhar
2024-11-23 7:01 ` [PATCH v4 05/10] crypto: iaa - Disable iaa_verify_compress by default Kanchana P Sridhar
2024-11-23 7:01 ` [PATCH v4 06/10] crypto: iaa - Re-organize the iaa_crypto driver code Kanchana P Sridhar
2024-11-23 7:01 ` [PATCH v4 07/10] crypto: iaa - Map IAA devices/wqs to cores based on packages instead of NUMA Kanchana P Sridhar
2024-11-23 7:01 ` [PATCH v4 08/10] crypto: iaa - Distribute compress jobs from all cores to all IAAs on a package Kanchana P Sridhar
2024-11-23 7:01 ` [PATCH v4 09/10] mm: zswap: Allocate pool batching resources if the crypto_alg supports batching Kanchana P Sridhar
2024-12-02 19:15 ` Nhat Pham
2024-12-03 0:30 ` Sridhar, Kanchana P
2024-12-03 8:00 ` Herbert Xu
2024-12-03 21:37 ` Sridhar, Kanchana P
2024-12-03 21:44 ` Yosry Ahmed
2024-12-03 22:17 ` Sridhar, Kanchana P
2024-12-03 22:24 ` Sridhar, Kanchana P
2024-12-04 1:42 ` Herbert Xu
2024-12-04 22:35 ` Yosry Ahmed
2024-12-04 22:49 ` Sridhar, Kanchana P
2024-12-04 22:55 ` Yosry Ahmed
2024-12-04 23:12 ` Sridhar, Kanchana P
2024-12-21 6:30 ` Sridhar, Kanchana P
2024-11-23 7:01 ` [PATCH v4 10/10] mm: zswap: Compress batching with Intel IAA in zswap_batch_store() of large folios Kanchana P Sridhar
2024-11-25 8:00 ` kernel test robot
2024-11-25 20:20 ` Yosry Ahmed
2024-11-25 21:47 ` Johannes Weiner [this message]
2024-11-25 21:54 ` Sridhar, Kanchana P
2024-11-25 22:08 ` Yosry Ahmed
2024-12-02 19:26 ` Nhat Pham
2024-12-03 0:34 ` Sridhar, Kanchana P
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241125214724.GA2405574@cmpxchg.org \
--to=hannes@cmpxchg.org \
--cc=21cnbao@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=ardb@kernel.org \
--cc=chengming.zhou@linux.dev \
--cc=clabbe@baylibre.com \
--cc=davem@davemloft.net \
--cc=ebiggers@google.com \
--cc=herbert@gondor.apana.org.au \
--cc=kanchana.p.sridhar@intel.com \
--cc=kristen.c.accardi@intel.com \
--cc=linux-crypto@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nphamcs@gmail.com \
--cc=ryan.roberts@arm.com \
--cc=surenb@google.com \
--cc=usamaarif642@gmail.com \
--cc=vinodh.gopal@intel.com \
--cc=wajdi.k.feghali@intel.com \
--cc=ying.huang@intel.com \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.