public inbox for linux-kernel@vger.kernel.org
From: Li Wang <liwang@redhat.com>
To: Nhat Pham <nphamcs@gmail.com>
Cc: "Yosry Ahmed" <yosryahmed@google.com>,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, "Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Muchun Song" <muchun.song@linux.dev>,
	"Tejun Heo" <tj@kernel.org>,
	"Roman Gushchin" <roman.gushchin@linux.dev>,
	"Shakeel Butt" <shakeel.butt@linux.dev>
Subject: Re: [PATCH 2/5] selftests/cgroup: avoid OOM in test_swapin_nozswap
Date: Fri, 13 Mar 2026 10:59:31 +0800
Message-ID: <abN9k5A8rJaA8mkR@redhat.com>
In-Reply-To: <CAKEwX=O69LepTsB1tJO1otDziyE4Oi3PAW=jShiTxQ=Hg7dB-Q@mail.gmail.com>

On Thu, Mar 12, 2026 at 10:09:10AM -0700, Nhat Pham wrote:
> On Wed, Mar 11, 2026 at 9:01 PM Li Wang <liwang@redhat.com> wrote:
> >
> > On Wed, Mar 11, 2026 at 11:50:05AM -0700, Yosry Ahmed wrote:
> > > On Wed, Mar 11, 2026 at 4:05 AM Li Wang <liwang@redhat.com> wrote:
> > > >
> > > > test_swapin_nozswap can hit OOM before reaching its assertions on some
> > > > setups. The test currently sets memory.max=8M and then allocates/reads
> > > > 32M with memory.zswap.max=0, which may over-constrain reclaim and kill
> > > > the workload process.
> > > >
> > > > Raise memory.max to 24M so the workload can make forward progress, and
> > > > lower the swap_peak expectation from 24M to 8M to keep the check robust
> > > > across environments.
> > > >
> > > > The test intent is unchanged: verify that swapping happens while zswap
> > > > remains unused when memory.zswap.max=0.
> > > >
> > > > === Error Logs ===
> > > >
> > > >   # ./test_zswap
> > > >   TAP version 13
> > > >   1..7
> > > >   ok 1 test_zswap_usage
> > > >   not ok 2 test_swapin_nozswap
> > > >   ...
> > > >
> > > >   # dmesg
> > > >   [271641.879153] test_zswap invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
> > > >   [271641.879168] CPU: 1 UID: 0 PID: 177372 Comm: test_zswap Kdump: loaded Not tainted 6.12.0-211.el10.ppc64le #1 VOLUNTARY
> > > >   [271641.879171] Hardware name: IBM,9009-41A POWER9 (architected) 0x4e0202 0xf000005 of:IBM,FW940.02 (UL940_041) hv:phyp pSeries
> > > >   [271641.879173] Call Trace:
> > > >   [271641.879174] [c00000037540f730] [c00000000127ec44] dump_stack_lvl+0x88/0xc4 (unreliable)
> > > >   [271641.879184] [c00000037540f760] [c0000000005cc594] dump_header+0x5c/0x1e4
> > > >   [271641.879188] [c00000037540f7e0] [c0000000005cb464] oom_kill_process+0x324/0x3b0
> > > >   [271641.879192] [c00000037540f860] [c0000000005cbe48] out_of_memory+0x118/0x420
> > > >   [271641.879196] [c00000037540f8f0] [c00000000070d8ec] mem_cgroup_out_of_memory+0x18c/0x1b0
> > > >   [271641.879200] [c00000037540f990] [c000000000713888] try_charge_memcg+0x598/0x890
> > > >   [271641.879204] [c00000037540fa70] [c000000000713dbc] charge_memcg+0x5c/0x110
> > > >   [271641.879207] [c00000037540faa0] [c0000000007159f8] __mem_cgroup_charge+0x48/0x120
> > > >   [271641.879211] [c00000037540fae0] [c000000000641914] alloc_anon_folio+0x2b4/0x5a0
> > > >   [271641.879215] [c00000037540fb60] [c000000000641d58] do_anonymous_page+0x158/0x6b0
> > > >   [271641.879218] [c00000037540fbd0] [c000000000642f8c] __handle_mm_fault+0x4bc/0x910
> > > >   [271641.879221] [c00000037540fcf0] [c000000000643500] handle_mm_fault+0x120/0x3c0
> > > >   [271641.879224] [c00000037540fd40] [c00000000014bba0] ___do_page_fault+0x1c0/0x980
> > > >   [271641.879228] [c00000037540fdf0] [c00000000014c44c] hash__do_page_fault+0x2c/0xc0
> > > >   [271641.879232] [c00000037540fe20] [c0000000001565d8] do_hash_fault+0x128/0x1d0
> > > >   [271641.879236] [c00000037540fe50] [c000000000008be0] data_access_common_virt+0x210/0x220
> > > >   [271641.879548] Tasks state (memory values in pages):
> > > >   ...
> > > >   [271641.879550] [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
> > > >   [271641.879555] [ 177372]     0 177372      571        0        0        0         0    51200       96             0 test_zswap
> > > >   [271641.879562] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/no_zswap_test,task_memcg=/no_zswap_test,task=test_zswap,pid=177372,uid=0
> > > >   [271641.879578] Memory cgroup out of memory: Killed process 177372 (test_zswap) total-vm:36544kB, anon-rss:0kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:50kB oom_score_adj:0
> > >
> > > Why are we getting an OOM kill when there's a swap device? Is the
> > > device slow / not keeping up with reclaim pace?
> >
> > This is a good question. The OOM is very likely triggered because memcg
> > reclaim can't make forward progress fast enough within the retry budget
> > of try_charge_memcg.
> >
> > Looking at the OOM info, the system has 64K pages, so memory.max=8M gives
> > only 128 pages. At OOM time, RSS is 0 and swapents is only 96. Swap space
> > itself isn't full; the charge path simply gave up trying to reclaim.
> >
> > The core issue, I guess, is that with memory.zswap.max=0, every page
> > reclaimed must go through the real block device. The charge path works
> > like this: a page fault fires, charge_memcg tries to charge 64K to the
> > cgroup, the cgroup is at its limit, so try_charge_memcg attempts direct
> > reclaim to free space. If the swap device can't drain pages fast enough,
> > the reclaim attempts within the retry loop fail to bring usage below
> > memory.max, and the kernel invokes OOM, even though swap space is
> > technically available.
> >
> > Raising memory.max to 24M gives reclaim a much larger pool to work with,
> > so it can absorb I/O latency without exhausting its retry budget.
> 
> Hmmm, perhaps we should change all these constants to multiples of
> base page size of a system?

Yeah, this may be better; let me try it in the next version.
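
A rough, untested sketch of the idea, only to illustrate the arithmetic
discussed above. Both helper names are hypothetical and do not exist in
the selftest today, and any page count passed to them would be a guess
that still needs tuning. The first helper shows why memory.max=8M is so
tight with 64K pages (128 pages, versus 2048 with 4K pages); the second
derives a byte limit from a page count at runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not existing selftest code): how many base pages
 * fit under a byte limit for a given page size.
 */
static size_t limit_in_pages(size_t limit_bytes, size_t page_size)
{
	assert(page_size > 0);
	return limit_bytes / page_size;
}

/*
 * Hypothetical helper (not existing selftest code): convert a limit
 * expressed in base pages into bytes, using the runtime page size
 * instead of a hard-coded 4K assumption.
 */
static size_t pages_to_bytes(size_t nr_pages)
{
	long page_size = sysconf(_SC_PAGESIZE);

	assert(page_size > 0);
	return nr_pages * (size_t)page_size;
}
```

With something like this, a test could express its limits as page counts
and write pages_to_bytes(n) to memory.max, so the cgroup gets the same
number of pages on 4K and 64K systems.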

-- 
Regards,
Li Wang



Thread overview: 19+ messages
2026-03-11 11:05 [PATCH 1/5] selftests/cgroup: detect and handle global zswap state in test_zswap Li Wang
2026-03-11 11:05 ` [PATCH 2/5] selftests/cgroup: avoid OOM in test_swapin_nozswap Li Wang
2026-03-11 18:50   ` Yosry Ahmed
2026-03-12  4:01     ` Li Wang
2026-03-12 17:09       ` Nhat Pham
2026-03-13  2:59         ` Li Wang [this message]
2026-03-11 11:05 ` [PATCH 3/5] selftests/cgroup: use runtime page size for zswpin check Li Wang
2026-03-11 18:56   ` Yosry Ahmed
2026-03-12  2:35     ` Li Wang
2026-03-11 11:05 ` [PATCH 4/5] selftest/cgroup: fix zswap test_no_invasive_cgroup_shrink on 64K pagesize system Li Wang
2026-03-11 19:01   ` Yosry Ahmed
2026-03-12  2:36     ` Li Wang
2026-03-11 11:05 ` [PATCH 5/5] selftest/cgroup: fix zswap attempt_writeback() " Li Wang
2026-03-11 18:58   ` Yosry Ahmed
2026-03-12  2:38     ` Li Wang
2026-03-11 13:20 ` [PATCH 1/5] selftests/cgroup: detect and handle global zswap state in test_zswap Michal Koutný
2026-03-11 18:41   ` Yosry Ahmed
2026-03-11 18:47 ` Yosry Ahmed
2026-03-12  1:41   ` Li Wang
