From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-181.mta0.migadu.com (out-181.mta0.migadu.com [91.218.175.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C87B2EBBA4 for ; Tue, 28 Apr 2026 11:31:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.181 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777375876; cv=none; b=G/Ej/CIEZyMnnEf3JzDUmQTWnouNPFQqYgutHXss91I1o8DTzmeMaNrGSkiY5jt6ozjhuwM1UYLC34vsLqDlWDSqEcMX2Vvill4hmGG9SP9Ng/QAwQwvJ+iv4AqM9VKkmOhjBxea4WDfd162yNaRCHaSXG5K6yv0GyUuo4zqt20= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777375876; c=relaxed/simple; bh=1J6+X9Rb6QbULlGnb0wA5flnss/x77Io3Oku/8gihjA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=b4UT9DjJHvLvL9WoHzFXfX6qCAMcFepdATD7YfVlchhUZTa5nmLrf7Fdn9X5n8nGNAReJSvx00V1PasbaEL8fVBmfSZoUtunxfgEY1Hq45aEh2mfyqgXZgL0Wnw0ERuBkKr/pjF47XEVmbv7FvBpKKPWzkyN8VU0D9Yf8eAJ1TM= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=HPftHpxL; arc=none smtp.client-ip=91.218.175.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="HPftHpxL" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1777375871; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qH/WqjqHdF4QWd0fPGDq25R88RjqC3sslC7AHulH6HA=; b=HPftHpxL/Y4gPHOWy5kMQ4IpbMPDXMqd+GI3aFkKwB24OtY60ar681/C8KdGxJ4j94IWYS UXOyYXEhefeSNQICXx1BOxbtQ9dUOiaYS8vitO6IQeQkLdjNutC3u7ISewK9AfZsqCFgCq 7kvsY31A8Yo21kgcUEUbShKFr/5KQQ4= From: Lance Yang To: osalvador@suse.de, enderaoelyther@gmail.com Cc: akpm@linux-foundation.org, muchun.song@linux.dev, david@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lance Yang Subject: Re: [PATCH v2] mm/hugetlb: fix subpool accounting after cgroup charge failure Date: Tue, 28 Apr 2026 19:30:59 +0800 Message-Id: <20260428113059.79001-1-lance.yang@linux.dev> In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Migadu-Flow: FLOW_OUT On Tue, Apr 28, 2026 at 11:08:04AM +0200, Oscar Salvador wrote: >On Tue, Apr 28, 2026 at 11:07:13AM +0800, Zhao Li wrote: >> alloc_hugetlb_folio() calls hugepage_subpool_get_pages() when map_chg >> is set. For subpools with max_hpages, that increments used_hpages. >> If the later hugetlb cgroup charge fails, the unwind must undo that >> charge even when gbl_chg > 0. > >I found that last sentence misleading, because we do not really care >about hugetlb cgroup charge/uncharge (besides that being of the reasons >we end up on error path) but rather the fact that we fiddle with >subpool->used_hpages and we need to undo that when we rollback. > >> hugepage_subpool_put_pages() can also restore rsv_hpages if concurrent >> frees move used_hpages below min_hpages between the get and put. When >> that happens on the gbl_chg > 0 path, restore the matching global >> reservation as well. > >Well, that does not quite explain the problem I think, at least not clear enough? >So the problem at hand (IIUC) is that > >1) if we took a global reservation >2) and concurrent hugepage_subpool_put_pages operations made > used_hpages be below min_hpages, and so subpool's reservation > was incremented, but we need to increment the global one as well > otherwise the next time we pull a page from the spool, > global resv_huge_pages will be inbalanced > > >> Skip the gbl_chg > 0 put when max_hpages is unset. For a min_size-only >> subpool, get_pages() did not change subpool state and put_pages() would >> create a false reservation. >> >> Signed-off-by: Zhao Li >> --- >> Changes in v2: >> - Handle rsv_hpages restoration when racing frees cross min_hpages. >> - Skip gbl_chg > 0 put_pages() when max_hpages is unset. >> >> mm/hugetlb.c | 21 ++++++++++++++++----- >> 1 file changed, 16 insertions(+), 5 deletions(-) >> >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >> index f24bf49be047e..4065d66fdcb5c 100644 >> --- a/mm/hugetlb.c >> +++ b/mm/hugetlb.c >> @@ -3026,12 +3026,23 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, >> h_cg); >> out_subpool_put: >> /* >> - * put page to subpool iff the quota of subpool's rsv_hpages is used >> - * during hugepage_subpool_get_pages. >> + * map_chg means hugepage_subpool_get_pages() succeeded above. >> + * If max_hpages accounting was touched, undo it. If racing frees >> + * moved the subpool below min_hpages, the put path may restore a >> + * subpool reservation. Restore the matching global reservation too. > >I would split the comment in two parts and place them within the block >they belong, otherwise it sounds confusing. >And maybe elaborate a little bit more. > >Subpools, reservations and hugetlb make a very head-spinning situation, so let >us make our life easier. Yep, the comment is confusing to me as well ... + if (map_chg) { + if (!gbl_chg) { + gbl_reserve = hugepage_subpool_put_pages(spool, 1); + hugetlb_acct_memory(h, -gbl_reserve); + } else if (spool && spool->max_hpages != -1) { + gbl_reserve = hugepage_subpool_put_pages(spool, 1); + if (!gbl_reserve) { + spin_lock_irq(&hugetlb_lock); + h->resv_huge_pages++; + spin_unlock_irq(&hugetlb_lock); + } + } } IIUC, there are three cases: 1) !gbl_chg hugepage_subpool_get_pages() consumed one reservation from spool->rsv_hpages. So the error path needs to call hugepage_subpool_put_pages() and then drop the matching global reservation with hugetlb_acct_memory(). 2) gbl_chg > 0 && spool->max_hpages != -1 hugepage_subpool_get_pages() did not consume spool->rsv_hpages, but it did increment spool->used_hpages after the max_hpages check passed. So the error path still needs to call hugepage_subpool_put_pages() to undo that. If hugepage_subpool_put_pages() returns 0 here, it restored one reservation in spool->rsv_hpages, so we also need to increment h->resv_huge_pages. 3) gbl_chg > 0 && spool->max_hpages == -1 hugepage_subpool_get_pages() did not change spool->rsv_hpages or spool->used_hpages, so the error path should not call hugepage_subpool_put_pages(), otherwise it could create a false reservation in spool->rsv_hpages. Hopefully I didn't miss anything :) Lance