All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm/hugetlb: fix subpool accounting after cgroup charge failure
@ 2026-04-27 14:52 Catherine
  2026-04-27 15:12 ` Andrew Morton
  2026-04-28  3:07 ` [PATCH v2] " Zhao Li
  0 siblings, 2 replies; 10+ messages in thread
From: Catherine @ 2026-04-27 14:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Muchun Song, Oscar Salvador, David Hildenbrand, linux-mm,
	linux-kernel, Catherine

alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg
is set.  For subpools with max_hpages, that increments used_hpages even
when the returned gbl_chg is positive.

If a later hugetlb cgroup charge fails, the cleanup currently calls
hugepage_subpool_put_pages() only for !gbl_chg.  The gbl_chg > 0 path
therefore leaks one used_hpages charge per failure.

Always undo the subpool charge after a successful subpool get.  Keep the
global reservation accounting under !gbl_chg, because only that path
consumed a reservation from the subpool.

Signed-off-by: Catherine <enderaoelyther@gmail.com>
---
 mm/hugetlb.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f24bf49be..b3ad024a0 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3026,12 +3026,14 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 						    h_cg);
 out_subpool_put:
 	/*
-	 * put page to subpool iff the quota of subpool's rsv_hpages is used
-	 * during hugepage_subpool_get_pages.
+	 * map_chg means hugepage_subpool_get_pages() succeeded above.
+	 * Always undo the subpool quota charge; only drop global reservation
+	 * accounting if the subpool consumed a reservation.
 	 */
-	if (map_chg && !gbl_chg) {
+	if (map_chg) {
 		gbl_reserve = hugepage_subpool_put_pages(spool, 1);
-		hugetlb_acct_memory(h, -gbl_reserve);
+		if (!gbl_chg)
+			hugetlb_acct_memory(h, -gbl_reserve);
 	}
 
 
-- 
2.50.1 (Apple Git-155)



^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-27 14:52 [PATCH] mm/hugetlb: fix subpool accounting after cgroup charge failure Catherine
@ 2026-04-27 15:12 ` Andrew Morton
  2026-04-27 15:19   ` Catherine
  2026-04-28  3:07 ` [PATCH v2] " Zhao Li
  1 sibling, 1 reply; 10+ messages in thread
From: Andrew Morton @ 2026-04-27 15:12 UTC (permalink / raw)
  To: Catherine
  Cc: Muchun Song, Oscar Salvador, David Hildenbrand, linux-mm,
	linux-kernel

On Mon, 27 Apr 2026 22:52:48 +0800 Catherine <enderaoelyther@gmail.com> wrote:

> alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg
> is set.  For subpools with max_hpages, that increments used_hpages even
> when the returned gbl_chg is positive.
> 
> If a later hugetlb cgroup charge fails, the cleanup currently calls
> hugepage_subpool_put_pages() only for !gbl_chg.  The gbl_chg > 0 path
> therefore leaks one used_hpages charge per failure.
> 
> Always undo the subpool charge after a successful subpool get.  Keep the
> global reservation accounting under !gbl_chg, because only that path
> consumed a reservation from the subpool.

Thanks.

We do prefer full, real names for kernel alterations.  Can you please
provide that?



^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-27 15:12 ` Andrew Morton
@ 2026-04-27 15:19   ` Catherine
  2026-04-27 21:12     ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Catherine @ 2026-04-27 15:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Catherine, Muchun Song, Oscar Salvador, David Hildenbrand,
	linux-mm, linux-kernel

Hi Andrew,

My real name is Zhao Li. Please use Zhao Li <enderaoelyther@gmail.com>
for authorship and Signed-off-by.

Thanks,
Catherine


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-27 15:19   ` Catherine
@ 2026-04-27 21:12     ` Andrew Morton
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Morton @ 2026-04-27 21:12 UTC (permalink / raw)
  To: Catherine
  Cc: Muchun Song, Oscar Salvador, David Hildenbrand, linux-mm,
	linux-kernel

On Mon, 27 Apr 2026 23:19:35 +0800 Catherine <enderaoelyther@gmail.com> wrote:

> Hi Andrew,
> 
> My real name is Zhao Li. Please use Zhao Li <enderaoelyther@gmail.com>
> for authorship and Signed-off-by.

Thanks.

AI review might have found an accounting imbalance:
	https://sashiko.dev/#/patchset/20260427145247.84157-2-enderaoelyther@gmail.com


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-27 14:52 [PATCH] mm/hugetlb: fix subpool accounting after cgroup charge failure Catherine
  2026-04-27 15:12 ` Andrew Morton
@ 2026-04-28  3:07 ` Zhao Li
  2026-04-28  9:08   ` Oscar Salvador
  2026-04-28 11:30   ` [PATCH v3] mm/hugetlb: fix max-only subpool accounting on alloc_hugetlb_folio failure Zhao Li
  1 sibling, 2 replies; 10+ messages in thread
From: Zhao Li @ 2026-04-28  3:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Muchun Song, Oscar Salvador, David Hildenbrand, linux-mm,
	linux-kernel

alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg
is set.  For subpools with max_hpages, that increments used_hpages.
If the later hugetlb cgroup charge fails, the unwind must undo that
charge even when gbl_chg > 0.

hugepage_subpool_put_pages() can also restore rsv_hpages if concurrent
frees move used_hpages below min_hpages between the get and put.  When
that happens on the gbl_chg > 0 path, restore the matching global
reservation as well.

Skip the gbl_chg > 0 put when max_hpages is unset.  For a min_size-only
subpool, get_pages() did not change subpool state and put_pages() would
create a false reservation.

Signed-off-by: Zhao Li <enderaoelyther@gmail.com>
---
Changes in v2:
- Handle rsv_hpages restoration when racing frees cross min_hpages.
- Skip gbl_chg > 0 put_pages() when max_hpages is unset.

 mm/hugetlb.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f24bf49be047e..4065d66fdcb5c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3026,12 +3026,23 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 						    h_cg);
 out_subpool_put:
 	/*
-	 * put page to subpool iff the quota of subpool's rsv_hpages is used
-	 * during hugepage_subpool_get_pages.
+	 * map_chg means hugepage_subpool_get_pages() succeeded above.
+	 * If max_hpages accounting was touched, undo it.  If racing frees
+	 * moved the subpool below min_hpages, the put path may restore a
+	 * subpool reservation.  Restore the matching global reservation too.
 	 */
-	if (map_chg && !gbl_chg) {
-		gbl_reserve = hugepage_subpool_put_pages(spool, 1);
-		hugetlb_acct_memory(h, -gbl_reserve);
+	if (map_chg) {
+		if (!gbl_chg) {
+			gbl_reserve = hugepage_subpool_put_pages(spool, 1);
+			hugetlb_acct_memory(h, -gbl_reserve);
+		} else if (spool && spool->max_hpages != -1) {
+			gbl_reserve = hugepage_subpool_put_pages(spool, 1);
+			if (!gbl_reserve) {
+				spin_lock_irq(&hugetlb_lock);
+				h->resv_huge_pages++;
+				spin_unlock_irq(&hugetlb_lock);
+			}
+		}
 	}


--
2.50.1 (Apple Git-155)


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-28  3:07 ` [PATCH v2] " Zhao Li
@ 2026-04-28  9:08   ` Oscar Salvador
  2026-04-28 11:30     ` Lance Yang
  2026-04-28 11:41     ` Zhao Li
  2026-04-28 11:30   ` [PATCH v3] mm/hugetlb: fix max-only subpool accounting on alloc_hugetlb_folio failure Zhao Li
  1 sibling, 2 replies; 10+ messages in thread
From: Oscar Salvador @ 2026-04-28  9:08 UTC (permalink / raw)
  To: Zhao Li
  Cc: Andrew Morton, Muchun Song, David Hildenbrand, linux-mm,
	linux-kernel

On Tue, Apr 28, 2026 at 11:07:13AM +0800, Zhao Li wrote:
> alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg
> is set.  For subpools with max_hpages, that increments used_hpages.
> If the later hugetlb cgroup charge fails, the unwind must undo that
> charge even when gbl_chg > 0.

I found that last sentence misleading, because we do not really care
about hugetlb cgroup charge/uncharge (besides that being of the reasons
we end up on error path) but rather the fact that we fiddle with
subpool->used_hpages and we need to undo that when we rollback.

> hugepage_subpool_put_pages() can also restore rsv_hpages if concurrent
> frees move used_hpages below min_hpages between the get and put.  When
> that happens on the gbl_chg > 0 path, restore the matching global
> reservation as well.

Well, that does not quite explain the problem I think, at least not clear enough?
So the problem at hand (IIUC) is that

1) if we took a global reservation
2) and concurrent hugepage_subpool_put_pages operations made
   used_hpages be below min_hpages, and so subpool's reservation
   was incremented, but we need to increment the global one as well
   otherwise the next time we pull a page from the spool,
   global resv_huge_pages will be inbalanced


> Skip the gbl_chg > 0 put when max_hpages is unset.  For a min_size-only
> subpool, get_pages() did not change subpool state and put_pages() would
> create a false reservation.
> 
> Signed-off-by: Zhao Li <enderaoelyther@gmail.com>
> ---
> Changes in v2:
> - Handle rsv_hpages restoration when racing frees cross min_hpages.
> - Skip gbl_chg > 0 put_pages() when max_hpages is unset.
> 
>  mm/hugetlb.c | 21 ++++++++++++++++-----
>  1 file changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f24bf49be047e..4065d66fdcb5c 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3026,12 +3026,23 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>  						    h_cg);
>  out_subpool_put:
>  	/*
> -	 * put page to subpool iff the quota of subpool's rsv_hpages is used
> -	 * during hugepage_subpool_get_pages.
> +	 * map_chg means hugepage_subpool_get_pages() succeeded above.
> +	 * If max_hpages accounting was touched, undo it.  If racing frees
> +	 * moved the subpool below min_hpages, the put path may restore a
> +	 * subpool reservation.  Restore the matching global reservation too.

I would split the comment in two parts and place them within the block
they belong, otherwise it sounds confusing.
And maybe elaborate a little bit more.

Subpools, reservations and hugetlb make a very head-spinning situation, so let
us make our life easier.


-- 
Oscar Salvador
SUSE Labs


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3] mm/hugetlb: fix max-only subpool accounting on alloc_hugetlb_folio failure
  2026-04-28  3:07 ` [PATCH v2] " Zhao Li
  2026-04-28  9:08   ` Oscar Salvador
@ 2026-04-28 11:30   ` Zhao Li
  1 sibling, 0 replies; 10+ messages in thread
From: Zhao Li @ 2026-04-28 11:30 UTC (permalink / raw)
  To: Andrew Morton
  Cc: mawupeng1, Zhao Li, Muchun Song, Oscar Salvador,
	David Hildenbrand, linux-mm, linux-kernel, stable

alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg
is set.  For a subpool with max_hpages != -1, that bumps used_hpages
regardless of whether it returns gbl_chg = 0 (rsv slot consumed) or
gbl_chg > 0 (used_hpages slot only).  If the allocation later fails
before a folio is returned, the unwind must undo the used_hpages
bump.  The old cleanup only ran for !gbl_chg, leaking used_hpages on
the gbl_chg > 0 path.

For gbl_chg > 0 on max-only subpools (max_hpages != -1, min_hpages
== -1), hugepage_subpool_get_pages() took only a speculative
used_hpages slot.  Drop that slot directly under spool->lock.  In
that configuration hugepage_subpool_put_pages() cannot restore
rsv_hpages, so the direct decrement is the exact inverse and is
race-free against concurrent puts.  This matches the used_hpages-only
part of hugetlb_reserve_pages()'s out_put_pages cleanup, but
restricts it to the max-only case where no rsv_hpages restoration is
possible.

Mounts with min_hpages != -1 are left unchanged for now.  v2's
approach (hugepage_subpool_put_pages() + h->resv_huge_pages++ to
back a restored rsv_hpages slot) double-counts global backing under
concurrent free_huge_folio() and creates phantom reservations under
concurrent hugetlb_unreserve_pages().  Safe cleanup of that quadrant
needs a coordinated fix across multiple call sites.

Reproduced on size=20M hugetlbfs with the faulting task in a hugetlb
cgroup whose limit is exceeded.  Vanilla leaks 6/8 hugepages of
subpool quota; this patch leaks 0/8.  Verified under QEMU.

Fixes: a833a693a490 ("mm: hugetlb: fix incorrect fallback for subpool")
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Zhao Li <enderaoelyther@gmail.com>
---
Changes in v3:
- Replace v2's hugepage_subpool_put_pages() + h->resv_huge_pages++ on
  the gbl_chg > 0 branch with a direct used_hpages-- under spool->lock.
- Restrict the cleanup to (max_hpages != -1, min_hpages == -1) where
  the direct decrement is the exact inverse of the speculative bump.

Changes in v2:
- Skip the gbl_chg > 0 cleanup when max_hpages is unset.
- Add hugepage_subpool_put_pages() + h->resv_huge_pages++ on the
  gbl_chg > 0 branch.

 mm/hugetlb.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f24bf49be047e..cfdeaf6394c5b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3025,13 +3025,24 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 		hugetlb_cgroup_uncharge_cgroup_rsvd(idx, pages_per_huge_page(h),
 						    h_cg);
 out_subpool_put:
-	/*
-	 * put page to subpool iff the quota of subpool's rsv_hpages is used
-	 * during hugepage_subpool_get_pages.
-	 */
-	if (map_chg && !gbl_chg) {
-		gbl_reserve = hugepage_subpool_put_pages(spool, 1);
-		hugetlb_acct_memory(h, -gbl_reserve);
+	if (map_chg) {
+		if (!gbl_chg) {
+			/* Full inverse when subpool_get_pages() consumed rsv_hpages. */
+			gbl_reserve = hugepage_subpool_put_pages(spool, 1);
+			hugetlb_acct_memory(h, -gbl_reserve);
+		} else if (gbl_chg > 0 && spool && spool->min_hpages == -1 &&
+			   spool->max_hpages != -1) {
+			unsigned long flags;
+
+			/*
+			 * For max-only subpools, subpool_get_pages() took only a
+			 * speculative used_hpages slot. Drop that slot directly.
+			 */
+			spin_lock_irqsave(&spool->lock, flags);
+			if (spool->used_hpages > 0)
+				spool->used_hpages--;
+			unlock_or_release_subpool(spool, flags);
+		}
 	}


--
2.50.1 (Apple Git-155)


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-28  9:08   ` Oscar Salvador
@ 2026-04-28 11:30     ` Lance Yang
  2026-04-28 11:41       ` Zhao Li
  2026-04-28 11:41     ` Zhao Li
  1 sibling, 1 reply; 10+ messages in thread
From: Lance Yang @ 2026-04-28 11:30 UTC (permalink / raw)
  To: osalvador, enderaoelyther
  Cc: akpm, muchun.song, david, linux-mm, linux-kernel, Lance Yang


On Tue, Apr 28, 2026 at 11:08:04AM +0200, Oscar Salvador wrote:
>On Tue, Apr 28, 2026 at 11:07:13AM +0800, Zhao Li wrote:
>> alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg
>> is set.  For subpools with max_hpages, that increments used_hpages.
>> If the later hugetlb cgroup charge fails, the unwind must undo that
>> charge even when gbl_chg > 0.
>
>I found that last sentence misleading, because we do not really care
>about hugetlb cgroup charge/uncharge (besides that being of the reasons
>we end up on error path) but rather the fact that we fiddle with
>subpool->used_hpages and we need to undo that when we rollback.
>
>> hugepage_subpool_put_pages() can also restore rsv_hpages if concurrent
>> frees move used_hpages below min_hpages between the get and put.  When
>> that happens on the gbl_chg > 0 path, restore the matching global
>> reservation as well.
>
>Well, that does not quite explain the problem I think, at least not clear enough?
>So the problem at hand (IIUC) is that
>
>1) if we took a global reservation
>2) and concurrent hugepage_subpool_put_pages operations made
>   used_hpages be below min_hpages, and so subpool's reservation
>   was incremented, but we need to increment the global one as well
>   otherwise the next time we pull a page from the spool,
>   global resv_huge_pages will be inbalanced
>
>
>> Skip the gbl_chg > 0 put when max_hpages is unset.  For a min_size-only
>> subpool, get_pages() did not change subpool state and put_pages() would
>> create a false reservation.
>> 
>> Signed-off-by: Zhao Li <enderaoelyther@gmail.com>
>> ---
>> Changes in v2:
>> - Handle rsv_hpages restoration when racing frees cross min_hpages.
>> - Skip gbl_chg > 0 put_pages() when max_hpages is unset.
>> 
>>  mm/hugetlb.c | 21 ++++++++++++++++-----
>>  1 file changed, 16 insertions(+), 5 deletions(-)
>> 
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index f24bf49be047e..4065d66fdcb5c 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3026,12 +3026,23 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>>  						    h_cg);
>>  out_subpool_put:
>>  	/*
>> -	 * put page to subpool iff the quota of subpool's rsv_hpages is used
>> -	 * during hugepage_subpool_get_pages.
>> +	 * map_chg means hugepage_subpool_get_pages() succeeded above.
>> +	 * If max_hpages accounting was touched, undo it.  If racing frees
>> +	 * moved the subpool below min_hpages, the put path may restore a
>> +	 * subpool reservation.  Restore the matching global reservation too.
>
>I would split the comment in two parts and place them within the block
>they belong, otherwise it sounds confusing.
>And maybe elaborate a little bit more.
>
>Subpools, reservations and hugetlb make a very head-spinning situation, so let
>us make our life easier.

Yep, the comment is confusing to me as well ...

+	if (map_chg) {
+		if (!gbl_chg) {
+			gbl_reserve = hugepage_subpool_put_pages(spool, 1);
+			hugetlb_acct_memory(h, -gbl_reserve);
+		} else if (spool && spool->max_hpages != -1) {
+			gbl_reserve = hugepage_subpool_put_pages(spool, 1);
+			if (!gbl_reserve) {
+				spin_lock_irq(&hugetlb_lock);
+				h->resv_huge_pages++;
+				spin_unlock_irq(&hugetlb_lock);
+			}
+		}
 	}

IIUC, there are three cases:

1) !gbl_chg

hugepage_subpool_get_pages() consumed one reservation from
spool->rsv_hpages.  So the error path needs to call
hugepage_subpool_put_pages() and then drop the matching global
reservation with hugetlb_acct_memory().

2) gbl_chg > 0 && spool->max_hpages != -1

hugepage_subpool_get_pages() did not consume spool->rsv_hpages, but it
did increment spool->used_hpages after the max_hpages check passed.  So
the error path still needs to call hugepage_subpool_put_pages() to undo
that.

If hugepage_subpool_put_pages() returns 0 here, it restored one
reservation in spool->rsv_hpages, so we also need to increment
h->resv_huge_pages.

3) gbl_chg > 0 && spool->max_hpages == -1

hugepage_subpool_get_pages() did not change spool->rsv_hpages or
spool->used_hpages, so the error path should not call
hugepage_subpool_put_pages(), otherwise it could create a false
reservation in spool->rsv_hpages.

Hopefully I didn't miss anything :)
Lance


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-28  9:08   ` Oscar Salvador
  2026-04-28 11:30     ` Lance Yang
@ 2026-04-28 11:41     ` Zhao Li
  1 sibling, 0 replies; 10+ messages in thread
From: Zhao Li @ 2026-04-28 11:41 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: Zhao Li, Andrew Morton, Muchun Song, David Hildenbrand,
	Lance Yang, linux-mm, linux-kernel

On Tue, Apr 28, 2026 at 11:08:04AM +0200, Oscar Salvador wrote:
> I found that last sentence misleading, because we do not really
> care about hugetlb cgroup charge/uncharge (besides that being of
> the reasons we end up on error path) but rather the fact that we
> fiddle with subpool->used_hpages and we need to undo that when we
> rollback.

Agreed - reframed in v3.  The commit body now states the bug as
the unwind missing the used_hpages rollback, without pinning it to
the cgroup-charge case, and the subject is narrowed to "fix
max-only subpool accounting on alloc_hugetlb_folio failure".

> Well, that does not quite explain the problem I think, at least
> not clear enough?  [...]

Fair - that explanation got tangled because v2's design itself was
trying to compensate for racing min crossings.  v3 sidesteps it
entirely: the gbl_chg > 0 cleanup is now restricted to
(max_hpages != -1, min_hpages == -1).  In that configuration
hugepage_subpool_put_pages()'s min-restoration branch is dead, so a
direct used_hpages-- under spool->lock is the exact inverse of the
speculative bump - no h->resv_huge_pages++ needed, no rsv_hpages
publication, no racing-put reasoning.

Mounts with min_hpages != -1 are left at v1 behaviour for now.
That quadrant has an inherited race that also exists at
hugetlb_reserve_pages()'s out_put_pages cleanup, so a coordinated
fix belongs in a separate RFC rather than this stable backport.

> I would split the comment in two parts and place them within the
> block they belong, otherwise it sounds confusing.
>
> Subpools, reservations and hugetlb make a very head-spinning
> situation, so let us make our life easier.

Done - one short comment per branch placed inside the relevant
code block in v3.  Hopefully easier to follow now.

v3:
  https://lore.kernel.org/linux-mm/20260428113037.88766-2-enderaoelyther@gmail.com/

Thanks for the review.

--
Zhao Li


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2] mm/hugetlb: fix subpool accounting after cgroup charge failure
  2026-04-28 11:30     ` Lance Yang
@ 2026-04-28 11:41       ` Zhao Li
  0 siblings, 0 replies; 10+ messages in thread
From: Zhao Li @ 2026-04-28 11:41 UTC (permalink / raw)
  To: Lance Yang
  Cc: Zhao Li, Oscar Salvador, Andrew Morton, Muchun Song,
	David Hildenbrand, linux-mm, linux-kernel

On Tue, Apr 28, 2026 at 07:30:59PM +0800, Lance Yang wrote:
> IIUC, there are three cases:
> [...]
> 2) gbl_chg > 0 && spool->max_hpages != -1
> [...]
> If hugepage_subpool_put_pages() returns 0 here, it restored one
> reservation in spool->rsv_hpages, so we also need to increment
> h->resv_huge_pages.

Thanks for working through the three cases - that's a clean
breakdown and matches what v2 was trying to do.  We ran into trouble
on case 2 specifically: the h->resv_huge_pages++ on the
put-returned-0 path looked right in isolation but turns out to be
unsafe once you put it next to concurrent hugetlb_unreserve_pages()
or free_huge_folio().  Two reachable orderings break it:

 * free_huge_folio() with HPageRestoreReserve set already does
   h->resv_huge_pages++ on its own.  v2's bump on the gbl_chg > 0
   cleanup double-counts against that.

 * hugetlb_unreserve_pages() does hugetlb_acct_memory(h, -X) which
   subtracts from h->resv_huge_pages and may also return surplus
   backing.  v2's bump then leaves rsv_hpages backed by no
   h->resv_huge_pages - a phantom reservation that the next
   subpool_get_pages() consumes without real backing.

v3 ended up going a different way: the gbl_chg > 0 cleanup is now
restricted to (max_hpages != -1, min_hpages == -1).  In that
configuration hugepage_subpool_put_pages()'s min-restoration branch
is dead, so a direct used_hpages-- under spool->lock is the exact
inverse of the speculative bump - no put_pages(), no
h->resv_huge_pages++, no concurrent-races to reason about.

Your case 3 (max_hpages == -1) is unchanged: cleanup is a no-op,
because get_pages() didn't touch any subpool field.

Mounts with min_hpages != -1 are left at v1 behaviour for now.  That
quadrant has an inherited race that also exists at
hugetlb_reserve_pages()'s out_put_pages cleanup; a coordinated fix
belongs in a separate RFC rather than this stable backport.

v3:
  https://lore.kernel.org/linux-mm/20260428113037.88766-2-enderaoelyther@gmail.com/

Thanks for the review.

--
Zhao Li


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-04-28 11:42 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-27 14:52 [PATCH] mm/hugetlb: fix subpool accounting after cgroup charge failure Catherine
2026-04-27 15:12 ` Andrew Morton
2026-04-27 15:19   ` Catherine
2026-04-27 21:12     ` Andrew Morton
2026-04-28  3:07 ` [PATCH v2] " Zhao Li
2026-04-28  9:08   ` Oscar Salvador
2026-04-28 11:30     ` Lance Yang
2026-04-28 11:41       ` Zhao Li
2026-04-28 11:41     ` Zhao Li
2026-04-28 11:30   ` [PATCH v3] mm/hugetlb: fix max-only subpool accounting on alloc_hugetlb_folio failure Zhao Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.