Re: [RFC PATCH 6/9] mm: zswap: drop support for non-zero same-filled pages handling

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Yosry Ahmed <yosryahmed@google.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Nhat Pham <nphamcs@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Chengming Zhou <chengming.zhou@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 Srividya Desireddy <srividya.desireddy@gmail.com>
Subject: Re: [RFC PATCH 6/9] mm: zswap: drop support for non-zero same-filled pages handling
Date: Fri, 29 Mar 2024 15:29:07 -0700	[thread overview]
Message-ID: <CAJD7tkZeYi65nYZ8c-3ZdNRWuSsgHjerXAPbZcMH5kKF3Kjdow@mail.gmail.com> (raw)
In-Reply-To: <20240329211723.GA882964@cmpxchg.org>

On Fri, Mar 29, 2024 at 2:17 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Fri, Mar 29, 2024 at 11:56:46AM -0700, Yosry Ahmed wrote:
> > > I perf'd zswapping out data that is 10% same-filled and 90% data that
> > > always needs compression. It does nothing but madvise(MADV_PAGEOUT),
> > > and the zswap_store() stack is already only ~60% of the cycles.
> > >
> > > Using zsmalloc + zstd, this is the diff between vanilla and my patch:
> > >
> > > # Baseline  Delta Abs  Shared Object         Symbol
> > > # ........  .........  ....................  .....................................................
> > > #
> > >      4.34%     -3.02%  [kernel.kallsyms]     [k] zswap_store
> > >     11.07%     +1.41%  [kernel.kallsyms]     [k] ZSTD_compressBlock_doubleFast
> > >     15.55%     +0.91%  [kernel.kallsyms]     [k] FSE_buildCTable_wksp
> > >
> > > As expected, we have to compress a bit more; on the other hand we're
> > > removing the content scan for same-filled for 90% of the pages that
> > > don't benefit from it. They almost amortize each other. Let's round it
> > > up and the remaining difference is ~1%.
> >
> > Thanks for the data, this is very useful.
> >
> > There is also the load path. The original patch that introduced
> > same-filled pages reports more gains on the load side, which also
> > happens to be more latency-sensitive. Of course, the data could be
> > outdated -- but it's a different type of workload than what will be
> > running in a data center fleet AFAICT.
> >
> > Is there also no noticeable difference on the load side in your data?
>
>      9.40%     +2.51%  [kernel.kallsyms]  [k] ZSTD_safecopy
>      0.76%     -0.48%  [kernel.kallsyms]  [k] zswap_load
>
> About 2% net.
>
> Checking other compression algorithms, lzo eats 27.58%, and lz4
> 13.82%. The variance between these alone makes our "try to be smarter
> than an actual compression algorithm" code look even sillier.
>
> > Also, how much increase did you observe in the size of compressed data
> > with your patch? Just curious about how zstd ended up handling those
> > pages.
>
> Checking zsmalloc stats, it did pack the same-filled ones down into 32
> byte objects. So 0.7% of their original size. That's negligible, even
> for workloads that have an unusually high share of them.

Glad to see that this was handled appropriately.

>
> > > It's difficult to make the case that this matters to any real
> > > workloads with actual think time in between paging.
> >
> > If the difference in performance is 1%, I surely agree.
> >
> > The patch reported 19-32% improvement in store time for same-filled
> > pages depending on the workload and platform. For 10% same-filled
> > pages, this should roughly correspond to 2-3% overall improvement,
> > which is not too far from the 1% you observed.
>
> Right.
>
> > The patch reported much larger improvement for load times (which
> > matters more), 49-85% for same-filled pages. If this corresponds to
> > 5-8% overall improvement in zswap load time for a workload with 10%
> > same-filled pages, that would be very significant. I don't expect to
> > see such improvements tbh, but we should check.
>
> No, I'm seeing around 2% net.

You mentioned that other compressors eat more cycles in this case, so
perhaps the data in the original patch comes from one of those
compressors.

>
> > > But let's say you do make the case that zero-filled pages are worth
> > > optimizing for.
> >
> > I am not saying they are for sure, but removing the same-filled
> > checking didn't seem to improve performance much in my testing, so the
> > cost seems to be mostly in maintenance. This makes it seem to me that
> > it *could* be a good tradeoff to only handle zero-filled pages. We can
> > make them slightly faster and we can trim the complexity -- as shown
> > by this patch.
>
> In the original numbers, there was a certain percentage of non-zero
> same-filled pages. You still first have to find that number for real
> production loads to determine what the tradeoff actually is.
>
> And I disagree that the code is less complex. There are fewer lines to
> the memchr_inv() than to the open-coded word-scan, but that scan is
> dead simple, self-contained and out of the way.
>
> So call that a wash in terms of maintenance burden, but for a
> reduction in functionality and a regression risk (however small).
>
> The next patch to store them as special xarray entries is also a wash
> at best. It trades the entry having an implicit subtype for no type at
> all. zswap_load_zero_filled() looks like it's the fast path, and
> decompression the fallback; the entry destructor is now called
> "zswap_tree_free_element" and takes a void pointer. It also re-adds
> most of the lines deleted by the zero-only patch.
>
> I think this is actually a bit worse than the status quo.
>
> But my point is, this all seems like a lot of opportunity cost in
> terms of engineering time, review bandwidth, regression risk, and
> readability, hackability, reliability, predictability of the code -
> for gains that are still not shown to actually matter in practice.

Yeah in terms of code cleanliness it did not turn out as I thought it
would. Actually even using different subtypes will have either
similarly abstract (yet less clear) helpers, or completely separate
paths with code duplication -- both of which are not ideal. So perhaps
it's better to leave it alone (and perhaps clean it up slightly) or
remove it completely.

I wanted to see what others thought, and I was aware it's
controversial (hence RFC) :)

Anyway, I will send a separate series with only cleanups and removing
knobs. We can discuss completely removing same-filled handling
separately. I suspect 2% regression in the load path (and potentially
larger with other compressors) may not be okay.

If handling for zero-filled pages is added directly in reclaim as you
suggested though, then the justification for handling other patterns
in zswap becomes much less :) Handling it in reclaim also means we
avoid IO for zero pages completely, which would be even more
beneficial than just avoiding compression/decompression in zswap.

>
> https://lore.kernel.org/all/20240328122352.a001a56aed97b01ac5931998@linux-foundation.org/
>
> This needs numbers to show not just that your patches are fine, but
> that any version of this optimization actually matters for real
> workloads. Without that, my vote would be NAK.

next prev parent reply	other threads:[~2024-03-29 22:29 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-25 23:50 [RFC PATCH 0/9] zswap: store zero-filled pages more efficiently Yosry Ahmed
2024-03-25 23:50 ` [RFC PATCH 1/9] mm: zswap: always shrink in zswap_store() if zswap_pool_reached_full Yosry Ahmed
2024-03-26 21:49   ` Nhat Pham
2024-03-27  2:21   ` Chengming Zhou
2024-03-28 19:09   ` Johannes Weiner
2024-03-25 23:50 ` [RFC PATCH 2/9] mm: zswap: refactor storing to the tree out of zswap_store() Yosry Ahmed
2024-03-27  2:25   ` Chengming Zhou
2024-03-27 22:29     ` Yosry Ahmed
2024-03-25 23:50 ` [RFC PATCH 3/9] mm: zswap: refactor limit checking from zswap_store() Yosry Ahmed
2024-03-27  2:42   ` Chengming Zhou
2024-03-27 22:30     ` Yosry Ahmed
2024-03-25 23:50 ` [RFC PATCH 4/9] mm: zswap: move more same-filled pages checks outside of zswap_store() Yosry Ahmed
2024-03-26 21:57   ` Nhat Pham
2024-03-27  2:39   ` Chengming Zhou
2024-03-27 22:32     ` Yosry Ahmed
2024-03-25 23:50 ` [RFC PATCH 5/9] mm: zswap: remove zswap_same_filled_pages_enabled Yosry Ahmed
2024-03-26 22:01   ` Nhat Pham
2024-03-27  2:44   ` Chengming Zhou
2024-03-27 22:34     ` Yosry Ahmed
2024-03-28 19:11   ` Johannes Weiner
2024-03-28 20:06     ` Yosry Ahmed
2024-03-29  2:14       ` Yosry Ahmed
2024-03-29 14:02         ` Maciej S. Szmigiero
2024-03-29 17:44           ` Johannes Weiner
2024-03-29 18:22             ` Yosry Ahmed
2024-04-01 10:37               ` Maciej S. Szmigiero
2024-04-01 18:29                 ` Yosry Ahmed
2024-03-25 23:50 ` [RFC PATCH 6/9] mm: zswap: drop support for non-zero same-filled pages handling Yosry Ahmed
2024-03-27 11:25   ` Chengming Zhou
2024-03-27 16:40   ` Nhat Pham
2024-03-27 22:38     ` Yosry Ahmed
2024-03-28 19:31   ` Johannes Weiner
2024-03-28 20:23     ` Yosry Ahmed
2024-03-28 21:07       ` Johannes Weiner
2024-03-28 23:19         ` Nhat Pham
2024-03-29  2:05           ` Yosry Ahmed
2024-03-29  4:27             ` Yosry Ahmed
2024-03-29 17:37               ` Johannes Weiner
2024-03-29 18:56                 ` Yosry Ahmed
2024-03-29 21:17                   ` Johannes Weiner
2024-03-29 22:29                     ` Yosry Ahmed [this message]
2024-03-28 23:33       ` Nhat Pham
2024-03-29  2:07         ` Yosry Ahmed
2024-03-25 23:50 ` [RFC PATCH 7/9] mm: zswap: store zero-filled pages without a zswap_entry Yosry Ahmed
2024-03-28  8:12   ` Chengming Zhou
2024-03-28 18:45     ` Yosry Ahmed
2024-03-28 19:38   ` Johannes Weiner
2024-03-28 20:29     ` Yosry Ahmed
2024-03-25 23:50 ` [RFC PATCH 8/9] mm: zswap: do not check the global limit for zero-filled pages Yosry Ahmed
2024-03-28  8:15   ` Chengming Zhou
2024-03-25 23:50 ` [RFC PATCH 9/9] mm: zswap: use zswap_entry_free() for partially initialized entries Yosry Ahmed
2024-03-28  8:31   ` Chengming Zhou
2024-03-28 18:49     ` Yosry Ahmed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJD7tkZeYi65nYZ8c-3ZdNRWuSsgHjerXAPbZcMH5kKF3Kjdow@mail.gmail.com \
    --to=yosryahmed@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=chengming.zhou@linux.dev \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=srividya.desireddy@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).