From: Johannes Weiner <hannes@cmpxchg.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Barry Song <baohua@kernel.org>,
akpm@linux-foundation.org, axelrasmussen@google.com,
baolin.wang@linux.alibaba.com, dev.jain@arm.com,
kasong@tencent.com, lance.yang@linux.dev, liam@infradead.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
npache@redhat.com, qi.zheng@linux.dev, ryan.roberts@arm.com,
shakeel.butt@linux.dev, weixugc@google.com, yuanchu@google.com,
zhaonanzhe@xiaomi.com, ziy@nvidia.com,
Michal Hocko <mhocko@suse.com>,
Roman Gushchin <roman.gushchin@linux.dev>
Subject: Re: [RFC PATCH] mm: Avoiding split large folios if swap has no space
Date: Thu, 25 Jun 2026 09:36:23 -0400
Message-ID: <aj0u12N_GzGtQT6K@cmpxchg.org>
In-Reply-To: <c29f90c6-2075-43e8-8f0d-0d6718a0f124@kernel.org>

On Thu, Jun 25, 2026 at 09:49:56AM +0200, David Hildenbrand (Arm) wrote:
> >>
> >> But now I wonder whether we would also want to check "is there any free swap
> >> space", not just "is there any swap".
> >
> > I don't quite understand you. get_nr_swap_pages() returns
> > nr_swap_pages, which increases or decreases as swap is allocated or
> > freed. I guess it just reflects how many swaps we currently have
> > available?
>
> Indeed, I was confused by the function name; it's "free swap pages". So all good :)
>
> >
> >>
> >>
> >> Essentially, try returning -E2BIG if there is the chance to swap out after
> >> split, and -ENOSPC / -ENOMEM if a split wouldn't help.
> >>
> >>> }
> >>>
> >>> again:
> >>> @@ -1769,11 +1772,13 @@ int folio_alloc_swap(struct folio *folio)
> >>> }
> >>>
> >>> /* Need to call this even if allocation failed, for MEMCG_SWAP_FAIL. */
> >>> - if (unlikely(mem_cgroup_try_charge_swap(folio)))
> >>> + if (unlikely(mem_cgroup_try_charge_swap(folio))) {
> >>> swap_cache_del_folio(folio);
> >>> + return -ENOMEM;
> >>
> >> Here we wouldn't have the information whether we could charge after a split.
> >>
> >> So that would require a rework to signal this more cleanly to the caller.
> >
> > Yep. The tricky part is that mem_cgroup_try_charge_swap() cannot
> > return how much swap quota is available in the memcg. Do you prefer to
> > add an output argument to mem_cgroup_try_charge_swap() to expose
> > that?
>
> That would probably be cleanest, if that is easily possible. We would want to
> get memcg maintainer feedback on that.
>
> @memcg folks: we'd like to know whether splitting a large folio would make
> mem_cgroup_try_charge_swap() succeed on a split (smaller) part, to distinguish
> "there is no way we can swap out anything, don't split" vs. "we could swap out,
> split".

It's technically doable, but is this worth the bother? The remaining
headroom is less than a large folio. You can split this one, but you
cannot even swap out all of its subpages anymore? From the cgroup
side, we don't need the limit to be obeyed this rigidly. We overcharge
temporarily in other places if it's convenient to do so. A fuzz factor
around the limit is acceptable.

But if you still want to do it, here is how:

The page_counter_try_charge() in __mem_cgroup_try_charge_swap() walks
the hierarchy upwards. If it fails, it will store the first level that
failed against its limit. You can do the mem_cgroup_margin() math
against this counter to determine headroom. An ancestor *could* be
more restrictive, so you need to finish the hierarchy walk to the root
and use the min() of all the swap.max - page_counter_read(swap). Then
return that in a return argument from __mem_cgroup_try_charge_swap().
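
To make the headroom calculation concrete, here is a rough userspace
model of that walk. It is only a sketch, not kernel code: struct
swap_counter, swap_margin(), swap_headroom() and try_charge_swap() are
hypothetical stand-ins for the page_counter and memcg machinery, and
the model checks the headroom before charging instead of charging and
unwinding on failure. It also starts the walk at the leaf rather than
at the failing level, which should give the same minimum because every
level below the failing one had room for the whole charge.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for one level of the swap counter hierarchy. */
struct swap_counter {
	unsigned long usage;		/* pages charged; models page_counter_read() */
	unsigned long max;		/* models swap.max, in pages */
	struct swap_counter *parent;	/* NULL at the root */
};

/* Headroom at a single level, clamped at zero if usage exceeds the limit. */
static unsigned long swap_margin(const struct swap_counter *c)
{
	return c->usage < c->max ? c->max - c->usage : 0;
}

/* Walk up to the root and return the smallest margin seen on the way. */
static unsigned long swap_headroom(const struct swap_counter *c)
{
	unsigned long headroom = ULONG_MAX;

	for (; c; c = c->parent) {
		unsigned long margin = swap_margin(c);

		if (margin < headroom)
			headroom = margin;
	}
	return headroom;
}

/*
 * Charge nr_pages at every level, or fail with -ENOMEM without charging
 * anything. The headroom across the hierarchy as seen before the charge
 * is always reported through the output argument.
 */
static int try_charge_swap(struct swap_counter *c, unsigned long nr_pages,
			   unsigned long *headroom)
{
	*headroom = swap_headroom(c);
	if (*headroom < nr_pages)
		return -ENOMEM;

	for (; c; c = c->parent)
		c->usage += nr_pages;
	return 0;
}
```

A caller could then compare the reported headroom against the folio
size: zero means a split cannot help, anything in between means at
least some of the subpages could still be charged after a split.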