public inbox for linux-kernel@vger.kernel.org
From: Johannes Weiner <hannes@cmpxchg.org>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Nhat Pham <nphamcs@gmail.com>,
	akpm@linux-foundation.org, cerasuolodomenico@gmail.com,
	sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com,
	hughd@google.com, corbet@lwn.net, konrad.wilk@oracle.com,
	senozhatsky@chromium.org, rppt@kernel.org, linux-mm@kvack.org,
	kernel-team@meta.com, linux-kernel@vger.kernel.org,
	david@ixit.cz, Wei Xu <weixugc@google.com>,
	Chris Li <chrisl@kernel.org>, Greg Thelen <gthelen@google.com>
Subject: Re: [PATCH 0/2] minimize swapping on zswap store failure
Date: Tue, 17 Oct 2023 10:51:24 -0400
Message-ID: <20231017145124.GA1122010@cmpxchg.org>
In-Reply-To: <CAJD7tkbEJDczxPqp2ZcZiz1ZWYdUWZLaiovxiGWcM57md-URhA@mail.gmail.com>

On Mon, Oct 16, 2023 at 10:33:23PM -0700, Yosry Ahmed wrote:
> On Mon, Oct 16, 2023 at 9:47 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Mon, Oct 16, 2023 at 05:57:31PM -0700, Yosry Ahmed wrote:
> > > On Mon, Oct 16, 2023 at 5:35 PM Nhat Pham <nphamcs@gmail.com> wrote:
> > So I obviously agree that we still need to invest in decoupling zswap
> > space from physical disk slots. It's insanely wasteful, especially
> > with larger memory capacities. But while it would be a fantastic
> > optimization, I don't see how it would be an automatic solution to the
> > problem that inspired this proposal.
> 
> Well, in my head, I imagine such a world where we have multiple
> separate swapping backends with cgroup knob(s) that control what
> backends are allowed for each cgroup. A zswap-is-terminal knob is
> hacky-ish way of doing that where the backends are only zswap and disk
> swap.

"I want compression" vs "I want disk offloading" is a more reasonable
question to ask at the cgroup level. We've historically had a variety
of swap configurations across the fleet. E.g. it's a lot easier to add
another swapfile than it is to grow an existing one at runtime. In
other cases, one storage config might have one swapfile, while another
machine model might want to spread it out over multiple disks, etc.

This doesn't matter much with ghost files. But with conventional
swapfiles this requires an unnecessary awareness of the backend
topology in order to express container policy. That's no bueno.

> > > Perhaps there is a way we can do this without allocating a zswap entry?
> > >
> > > I thought before about having a special list_head that allows us to
> > > use the lower bits of the pointers as markers, similar to the xarray.
> > > The markers can be used to place different objects on the same list.
> > > We can have a list that is a mixture of struct page and struct
> > > zswap_entry. I never pursued this idea, and I am sure someone will
> > > scream at me for suggesting it. Maybe there is a less convoluted way
> > > to keep the LRU ordering intact without allocating memory on the
> > > reclaim path.
> >
> > That should work. Once zswap has exclusive control over the page, it
> > is free to muck with its lru linkage. A lower bit tag on the next or
> > prev pointer should suffice to distinguish between struct page and
> > struct zswap_entry when pulling stuff from the list.
> 
> Right.
> 
> We handle incompressible memory internally in a different way, we put
> them back on the unevictable list with an incompressible page flag.
> This achieves a similar effect.

It doesn't. We want those incompressible pages to continue aging
alongside their compressible peers, and eventually get written back to
disk with them.
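
For illustration, the low-bit tagging discussed above (one list holding
both pages and zswap entries) could be sketched roughly as follows. This
is a userspace sketch under stated assumptions, not kernel code:
struct fake_page, struct fake_zswap_entry, struct lru_link and every
lru_*() helper are hypothetical names invented for the example, and the
list is singly linked to keep it short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags stored in the lowest bit of each link pointer. */
enum lru_tag { LRU_TAG_PAGE = 0, LRU_TAG_ENTRY = 1 };
#define LRU_TAG_MASK ((uintptr_t)1)

/* Tagged pointer to the next object's link; 0 marks the end of the list. */
struct lru_link { uintptr_t next; };

/* Stand-ins only; NOT the kernel's struct page / struct zswap_entry. */
struct fake_page { struct lru_link lru; int pfn; };
struct fake_zswap_entry { struct lru_link lru; unsigned int length; };

/* Pack a link pointer and its type tag into one word. */
static uintptr_t lru_encode(struct lru_link *link, enum lru_tag tag)
{
	/* The link must be at least 2-byte aligned so bit 0 is free. */
	assert(((uintptr_t)link & LRU_TAG_MASK) == 0);
	return (uintptr_t)link | (uintptr_t)tag;
}

/* Strip the tag to recover the real pointer. */
static struct lru_link *lru_decode(uintptr_t v)
{
	return (struct lru_link *)(v & ~LRU_TAG_MASK);
}

/* Read the type of the object a tagged word points to. */
static enum lru_tag lru_tag_of(uintptr_t v)
{
	return (enum lru_tag)(v & LRU_TAG_MASK);
}

/* Push an object of either type onto the head of the shared list. */
static void lru_push(uintptr_t *head, struct lru_link *link, enum lru_tag tag)
{
	link->next = *head;
	*head = lru_encode(link, tag);
}

/* Walk the list and count each object type by inspecting the tag. */
static void lru_count(uintptr_t head, int *pages, int *entries)
{
	*pages = 0;
	*entries = 0;
	for (uintptr_t cur = head; cur; cur = lru_decode(cur)->next) {
		if (lru_tag_of(cur) == LRU_TAG_PAGE)
			(*pages)++;
		else
			(*entries)++;
	}
}
```

Because the link is the first member of both structs, the decoded
pointer can be cast back to whichever type the tag indicates. A real
version would have to tag the next/prev pointers of a doubly linked
list and deal with locking, neither of which is shown here.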
