From: Shakeel Butt <shakeel.butt@linux.dev>
To: Matt Fleming <matt@readmodwrite.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 Jens Axboe <axboe@kernel.dk>, Minchan Kim <minchan@kernel.org>,
	 Sergey Senozhatsky <senozhatsky@chromium.org>,
	Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
	 Kemeng Shi <shikemeng@huaweicloud.com>,
	Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
	 Barry Song <baohua@kernel.org>,
	Vlastimil Babka <vbabka@kernel.org>,
	 Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,
	 Brendan Jackman <jackmanb@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
	 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org,  kernel-team@cloudflare.com,
	Matt Fleming <mfleming@cloudflare.com>,
	roman.gushchin@linux.dev
Subject: Re: [RFC PATCH 0/1] mm: Reduce direct reclaim stalls with RAM-backed swap
Date: Tue, 3 Mar 2026 06:59:04 -0800
Message-ID: <aab0AFAgAzBi4jO6@linux.dev>
In-Reply-To: <20260303115358.1323188-1-matt@readmodwrite.com>

Hi Matt,

Thanks for the report. One request: please avoid a cover letter for a single
patch, since it splits the discussion across two threads.

On Tue, Mar 03, 2026 at 11:53:57AM +0000, Matt Fleming wrote:
> From: Matt Fleming <mfleming@cloudflare.com>
> 
> Hi,
> 
> Systems with zram-only swap can spin in direct reclaim for 20-30
> minutes without ever invoking the OOM killer. We've hit this repeatedly
> in production on machines with 377 GiB RAM and a 377 GiB zram device.
> 

Have you tried zswap, and if so, do you see similar issues with it?

> The problem
> -----------
> 
> should_reclaim_retry() calls zone_reclaimable_pages() to estimate how
> much memory is still reclaimable. That estimate includes anonymous
> pages, on the assumption that swapping them out frees physical pages.
> 
> With disk-backed swap, that's true -- writing a page to disk frees a
> page of RAM, and SwapFree accurately reflects how many more pages can
> be written. With zram, the free slot count is inaccurate. A 377 GiB
> zram device with 10% used reports ~340 GiB of free swap slots, but
> filling those slots requires physical RAM that the system doesn't have
> -- that's why it's in direct reclaim in the first place.
> 
> The reclaimable estimate is off by orders of magnitude.
> 
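
For illustration only, the arithmetic in the quoted paragraph can be sketched
as below. This is not kernel code: the 4 KiB page size and the 3:1 compression
ratio are assumptions, and the helper names are hypothetical. It shows that the
free slots a zram device advertises still cost RAM to fill.

```python
GIB = 1 << 30
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def free_swap_slots(device_bytes: int, used_percent: int) -> int:
    """Number of page-sized swap slots the device still reports as free."""
    free_bytes = device_bytes * (100 - used_percent) // 100
    return free_bytes // PAGE_SIZE


def ram_needed_to_fill(slots: int, compression_ratio: float) -> float:
    """Bytes of RAM zram would have to allocate to store `slots` pages."""
    return slots * PAGE_SIZE / compression_ratio


# Figures from the report: a 377 GiB zram device that is 10% used.
slots = free_swap_slots(377 * GIB, 10)
advertised_gib = slots * PAGE_SIZE / GIB
# Assumption: a 3:1 compression ratio, chosen only for illustration.
needed_gib = ram_needed_to_fill(slots, 3.0) / GIB

print(f"advertised free swap: {advertised_gib:.1f} GiB")
print(f"RAM needed to fill it: {needed_gib:.1f} GiB")
```

With disk-backed swap the second number would be zero, which is the gap the
report is pointing at.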

Over time, we (the kernel MM community) have implicitly decided to keep the
kernel oom-killer very conservative, since adding more heuristics to the
reclaim/oom path makes the kernel less reliable, and to punt the aggressiveness
of oom-killing to userspace as a policy decision. All major Linux deployments
have started using userspace oom-killers like systemd-oomd, Android's LMKD,
fb-oomd or some internal alternative. That provides more flexibility to define
the aggressiveness of oom-killing based on your business needs.

That said, userspace oom-killers are prone to reliability issues (the
oom-killer getting stuck in reclaim or not getting enough CPU), so we (Roman)
are working on adding support for a BPF-based oom-killer, where we think we can
implement oom policies more reliably.

Anyway, I am wondering whether you have tried systemd-oomd or some other
userspace alternative. If you are interested in the BPF oom-killer, we can help
with that as well.

thanks,
Shakeel

