From: Zhao Li <enderaoelyther@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: mawupeng1@huawei.com, Zhao Li <enderaoelyther@gmail.com>,
Muchun Song <muchun.song@linux.dev>,
Oscar Salvador <osalvador@suse.de>,
David Hildenbrand <david@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org
Subject: [PATCH v3] mm/hugetlb: fix max-only subpool accounting on alloc_hugetlb_folio failure
Date: Tue, 28 Apr 2026 19:30:38 +0800
Message-ID: <20260428113037.88766-2-enderaoelyther@gmail.com>
In-Reply-To: <20260428030712.66256-2-enderaoelyther@gmail.com>

alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg
is set. For a subpool with max_hpages != -1, that call bumps
used_hpages regardless of whether it returns gbl_chg = 0 (an
rsv_hpages slot was consumed) or gbl_chg > 0 (a used_hpages slot
only). If the allocation later fails before a folio is returned, the
unwind must undo the used_hpages bump, but the old cleanup only ran
for !gbl_chg, leaking used_hpages on the gbl_chg > 0 path.

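For reference, the accounting that takes the slot, abridged from
hugepage_subpool_get_pages() in mm/hugetlb.c (annotations added;
details may differ slightly across trees):

static long hugepage_subpool_get_pages(struct hugepage_subpool *spool,
				       long delta)
{
	long ret = delta;

	if (!spool)
		return ret;

	spin_lock_irq(&spool->lock);

	if (spool->max_hpages != -1) {		/* maximum size accounting */
		if (spool->used_hpages + delta <= spool->max_hpages)
			spool->used_hpages += delta;	/* the bump at issue */
		else {
			ret = -ENOMEM;
			goto unlock_ret;
		}
	}

	/* minimum size accounting */
	if (spool->min_hpages != -1 && spool->rsv_hpages) {
		if (delta > spool->rsv_hpages) {
			ret = delta - spool->rsv_hpages;
			spool->rsv_hpages = 0;
		} else {
			ret = 0;	/* caller sees gbl_chg == 0 */
			spool->rsv_hpages -= delta;
		}
	}

unlock_ret:
	spin_unlock_irq(&spool->lock);
	return ret;
}
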
For gbl_chg > 0 on max-only subpools (max_hpages != -1, min_hpages
== -1), hugepage_subpool_get_pages() took only a speculative
used_hpages slot. Drop that slot directly under spool->lock. In
that configuration hugepage_subpool_put_pages() cannot restore
rsv_hpages, so the direct decrement is the exact inverse and is
race-free against concurrent puts. This matches the used_hpages-only
part of hugetlb_reserve_pages()'s out_put_pages cleanup, but
restricts it to the max-only case where no rsv_hpages restoration is
possible.

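For comparison, the release side, abridged from
hugepage_subpool_put_pages() in mm/hugetlb.c: the rsv_hpages refill
is guarded by min_hpages != -1, which is why a max-only mount has no
rsv_hpages state to restore and the put degenerates to a plain
used_hpages decrement:

static long hugepage_subpool_put_pages(struct hugepage_subpool *spool,
				       long delta)
{
	long ret = delta;
	unsigned long flags;

	if (!spool)
		return delta;

	spin_lock_irqsave(&spool->lock, flags);

	if (spool->max_hpages != -1)		/* maximum size accounting */
		spool->used_hpages -= delta;

	/* minimum size accounting: never runs when min_hpages == -1 */
	if (spool->min_hpages != -1 && spool->used_hpages < spool->min_hpages) {
		if (spool->rsv_hpages + delta <= spool->min_hpages)
			ret = 0;
		else
			ret = spool->rsv_hpages + delta - spool->min_hpages;

		spool->rsv_hpages += delta;
		if (spool->rsv_hpages > spool->min_hpages)
			spool->rsv_hpages = spool->min_hpages;
	}

	/* drops spool->lock, freeing the subpool if it became unused */
	unlock_or_release_subpool(spool, flags);
	return ret;
}
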
Mounts with min_hpages != -1 are left unchanged for now. v2's
approach (hugepage_subpool_put_pages() + h->resv_huge_pages++ to
back a restored rsv_hpages slot) double-counts global backing under
concurrent free_huge_folio() and creates phantom reservations under
concurrent hugetlb_unreserve_pages(). Safe cleanup of that quadrant
needs a coordinated fix across multiple call sites.

Reproduced on a size=20M hugetlbfs mount with the faulting task in a
hugetlb cgroup whose limit is exceeded. An unpatched kernel leaks 6/8
hugepages of subpool quota; with this patch the leak is 0/8. Verified
under QEMU.

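A reproducer along these lines (hypothetical sketch, not the original
test program; assumes 2 MiB hugepages, a file on the size=20M mount,
and the task confined in a cgroup v2 hugetlb cgroup whose
hugetlb.2MB.max is smaller than the 8 pages touched):

/* repro.c - fault more hugepages than the cgroup limit allows. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define HPAGE_SIZE	(2UL << 20)	/* assumes 2 MiB hugepages */
#define NR_PAGES	8

int main(int argc, char **argv)
{
	size_t len = NR_PAGES * HPAGE_SIZE;
	char *p;
	int fd;

	if (argc < 2)
		return 1;

	fd = open(argv[1], O_CREAT | O_RDWR, 0600);
	if (fd < 0 || ftruncate(fd, len)) {
		perror("open/ftruncate");
		return 1;
	}

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/*
	 * Each write faults one hugepage; once the cgroup limit is hit
	 * the task dies with SIGBUS. With the bug, every failed fault
	 * leaks one used_hpages slot, visible as non-zero "Used" in df
	 * output for the mount after this task is gone.
	 */
	for (size_t off = 0; off < len; off += HPAGE_SIZE)
		p[off] = 1;

	return 0;
}
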
Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Zhao Li <enderaoelyther@gmail.com>
---
Changes in v3:
- Replace v2's hugepage_subpool_put_pages() + h->resv_huge_pages++ on
  the gbl_chg > 0 branch with a direct used_hpages-- under spool->lock.
- Restrict the cleanup to (max_hpages != -1, min_hpages == -1), where
  the direct decrement is the exact inverse of the speculative bump.

Changes in v2:
- Skip the gbl_chg > 0 cleanup when max_hpages is unset.
- Add hugepage_subpool_put_pages() + h->resv_huge_pages++ on the
  gbl_chg > 0 branch.

 mm/hugetlb.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f24bf49be047e..cfdeaf6394c5b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3025,13 +3025,24 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, pages_per_huge_page(h),
 						    h_cg);
 out_subpool_put:
-	/*
-	 * put page to subpool iff the quota of subpool's rsv_hpages is used
-	 * during hugepage_subpool_get_pages.
-	 */
-	if (map_chg && !gbl_chg) {
-		gbl_reserve = hugepage_subpool_put_pages(spool, 1);
-		hugetlb_acct_memory(h, -gbl_reserve);
+	if (map_chg) {
+		if (!gbl_chg) {
+			/* Full inverse when subpool_get_pages() consumed rsv_hpages. */
+			gbl_reserve = hugepage_subpool_put_pages(spool, 1);
+			hugetlb_acct_memory(h, -gbl_reserve);
+		} else if (gbl_chg > 0 && spool && spool->min_hpages == -1 &&
+			   spool->max_hpages != -1) {
+			unsigned long flags;
+
+			/*
+			 * For max-only subpools, subpool_get_pages() took only a
+			 * speculative used_hpages slot. Drop that slot directly.
+			 */
+			spin_lock_irqsave(&spool->lock, flags);
+			if (spool->used_hpages > 0)
+				spool->used_hpages--;
+			unlock_or_release_subpool(spool, flags);
+		}
 	}
 
 out_end_reservation:
--
2.50.1 (Apple Git-155)