linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Yosry Ahmed <yosryahmed@google.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Sergey Senozhatsky <senozhatsky@chromium.org>,
	 Nhat Pham <nphamcs@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org, ngupta@vflare.org,
	sjenning@redhat.com,  ddstreet@ieee.org,
	vitaly.wool@konsulko.com
Subject: Re: [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size_class's locks
Date: Thu, 3 Nov 2022 16:31:51 -0700	[thread overview]
Message-ID: <CAJD7tkYNkf+jpKZnKBWQLjmOnSmOYsh3wzhu_Qkj01V6966EpA@mail.gmail.com> (raw)
In-Reply-To: <Y2Q16sjnZSgOYHd9@google.com>

On Thu, Nov 3, 2022 at 2:43 PM Minchan Kim <minchan@kernel.org> wrote:
>
> On Thu, Nov 03, 2022 at 01:46:56PM -0700, Yosry Ahmed wrote:
> > On Thu, Nov 3, 2022 at 1:37 PM Minchan Kim <minchan@kernel.org> wrote:
> > >
> > > On Thu, Nov 03, 2022 at 11:10:47AM -0700, Yosry Ahmed wrote:
> > > < snip >
> > >
> > > > > > > > I am also worry about that LRU stuff should be part of allocator
> > > > > > > > instead of higher level.
> > > > > > >
> > > > > > > I'm sorry, but that's not a reasonable objection.
> > > > > > >
> > > > > > > These patches implement a core feature of being a zswap backend, using
> > > > > > > standard LRU and locking techniques established by the other backends.
> > > > > > >
> > > > > > > I don't disagree that it would nicer if zswap had a strong abstraction
> > > > > > > for backend pages and a generalized LRU. But that is major surgery on
> > > > > > > a codebase of over 6,500 lines. It's not a reasonable ask to change
> > > > > > > all that first before implementing a basic feature that's useful now.
> > > > > >
> > > > > > With same logic, folks added the LRU logic into their allocators
> > > > > > without the effort considering moving the LRU into upper layer.
> > > > > >
> > > > > > And then trend is still going on since I have seen multiple times
> > > > > > people are trying to add more allocators. So if it's not a reasonable
> > > > > > ask to consier, we couldn't stop the trend in the end.
> > > > >
> > > > > So there is actually an ongoing effort to do that. Yosry and I have
> > > > > spent quite some time on coming up with an LRU design that's
> > > > > independent from compression policy over email and at Plumbers.
> > > > >
> > > > > My concern is more about the order of doing things:
> > > > >
> > > > > 1. The missing writeback support is a gaping hole in zsmalloc, which
> > > > >    affects production systems. A generalized LRU list is a good idea,
> > > > >    but it's a huge task that from a user pov really is not
> > > > >    critical. Even from a kernel dev / maintainer POV, there are bigger
> > > > >    fish to fry in the zswap code base and the backends than this.
> > > > >
> > > > > 2. Refactoring existing functionality is much easier than writing
> > > > >    generalized code that simultaneously enables new behavior. zsmalloc
> > > > >    is the most complex of our backends. To make its LRU writeback work
> > > > >    we had to patch zswap's ->map ordering to accomodate it, e.g. Such
> > > > >    tricky changes are easier to make and test incrementally.
> > > > >
> > > > >    The generalized LRU project will hugely benefit from already having
> > > > >    a proven writeback implementation in zsmalloc, because then all the
> > > > >    requirements in zswap and zsmalloc will be in black and white.
> > > > >
> > > > > > > I get that your main interest is zram, and so this feature isn't of
> > > > > > > interest to you. But zram isn't the only user, nor is it the primary
> > > > > >
> > > > > > I am interest to the feature but my interest is more of general swap
> > > > > > layer to manage the LRU so that it could support any hierarchy among
> > > > > > swap devices, not only zswap.
> > > > >
> > > > > I think we're on the same page about the longer term goals.
> > > > >
> > > >
> > > > Yeah. As Johannes said, I was also recently looking into this. This
> > > > can also help solve other problems than consolidating implementations.
> > > > Currently if zswap rejects a page, it goes into swap, which is
> > > > more-or-less a violation of page LRUs since hotter pages that are more
> > > > recently reclaimed end up in swap (slow), while colder pages that were
> > > > reclaimed before are in zswap. Having a separate layer managing the
> > > > LRU of swap pages can also make sure this doesn't happen.
> > >
> > > True.
> > >
> > > >
> > > > More broadly, making zswap a separate layer from swap enables other
> > > > improvements such as using zswap regardless of the presence of a
> > > > backend swapfile and not consuming space in swapfiles if a page is in
> > > > zswap. Of course, this is a much larger surgery.
> > >
> > > If we could decouple the LRU writeback from zswap and supports
> > > compression without backing swapfile, sounds like becoming more of
> > > zram. ;-)
> >
> > That's a little bit grey. Maybe we can consolidate them one day :)
> >
> > We have been using zswap without swapfile at Google for a while, this
> > gives us the ability to reject pages that do not compress well enough
> > for us, which I suspect zram would not support given that it is
> > designed to be the final destination of the page. Also, having the
>
> zRAM could do with little change but at current implmentation, it will
> print swapout failure message(it's not a big deal since we could
> suppress) but I have thought rather than that, we needs to move the
> page unevictable LRU list with marking the page CoW to catch a time
> to move the page into evictable LRU list o provide second chance to
> be compressed. Just off-topic.

Right. We do something similar-ish today. However, this does not work
though for zswap if there is a backing swapfile, as the page needs to
still be evictable to the swapfile. A decoupled LRU can also manage
this appropriately.

>
> > same configuration and code running on machines whether or not they
> > have a swapfile is nice, otherwise one would need to use zram if there
> > is no swapfile and switch to zswap if there is one.


  reply	other threads:[~2022-11-03 23:32 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-26 20:06 [PATCH 0/5] Implement writeback for zsmalloc Nhat Pham
2022-10-26 20:06 ` [PATCH 1/5] zswap: fix writeback lock ordering " Nhat Pham
2022-10-26 20:06 ` [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size_class's locks Nhat Pham
2022-10-28 14:46   ` Johannes Weiner
2022-11-02  3:28   ` Sergey Senozhatsky
2022-11-02 21:36     ` Minchan Kim
2022-11-03 15:18       ` Johannes Weiner
2022-11-03 15:53         ` Minchan Kim
2022-11-03 18:08           ` Johannes Weiner
2022-11-03 18:10             ` Yosry Ahmed
2022-11-03 20:37               ` Minchan Kim
2022-11-03 20:46                 ` Yosry Ahmed
2022-11-03 21:15                   ` Yu Zhao
2022-11-03 23:19                     ` Yosry Ahmed
2022-11-03 21:43                   ` Minchan Kim
2022-11-03 23:31                     ` Yosry Ahmed [this message]
2022-11-03 20:22             ` Minchan Kim
2022-11-04  3:58         ` Sergey Senozhatsky
2022-11-07 21:31           ` Nhat Pham
2022-11-07 22:35             ` Minchan Kim
2022-10-26 20:06 ` [PATCH 3/5] zsmalloc: Add a LRU to zs_pool to keep track of zspages in LRU order Nhat Pham
2022-10-28 14:47   ` Johannes Weiner
2022-10-26 20:06 ` [PATCH 4/5] zsmalloc: Add ops fields to zs_pool to store evict handlers Nhat Pham
2022-10-28 15:18   ` Johannes Weiner
2022-11-02  4:10   ` Sergey Senozhatsky
2022-11-07 21:36     ` Nhat Pham
2022-10-26 20:06 ` [PATCH 5/5] zsmalloc: Implement writeback mechanism for zsmalloc Nhat Pham
2022-10-27 13:53   ` kernel test robot
2022-10-27 18:27     ` [PATCH v2 " Nhat Pham
2022-10-28 15:19       ` Johannes Weiner
2022-11-02  3:42       ` Sergey Senozhatsky
2022-11-03 15:26         ` Johannes Weiner
2022-11-02  3:44       ` Sergey Senozhatsky
2022-11-02  4:13       ` Sergey Senozhatsky
2022-11-03 16:45         ` Johannes Weiner
2022-11-04  4:02           ` Sergey Senozhatsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJD7tkYNkf+jpKZnKBWQLjmOnSmOYsh3wzhu_Qkj01V6966EpA@mail.gmail.com \
    --to=yosryahmed@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=ddstreet@ieee.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=minchan@kernel.org \
    --cc=ngupta@vflare.org \
    --cc=nphamcs@gmail.com \
    --cc=senozhatsky@chromium.org \
    --cc=sjenning@redhat.com \
    --cc=vitaly.wool@konsulko.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).