From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from out-186.mta0.migadu.com (out-186.mta0.migadu.com [91.218.175.186]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 64FA13DA5B1 for ; Mon, 13 Apr 2026 14:16:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.186 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776089815; cv=none; b=b+9X5Bmi30apXloqLcTkzYLghRW4VfJ4aMzndg7NWLCaNgF+8TI575FZHCZ1vTS0dY7UzEOjEjePLthC91KMJP/vVVf0h8vLbf3hSRFSjm43YOrPt9AVvFwkYDH+f95Zi8tSfe4YXRnMjv8qAHDvPCy5n8QZP1UT3DfVyJSW0Ns= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776089815; c=relaxed/simple; bh=xdpq9uW+B5fBMOHJLILpN9peGqzyiSZHYxX2qd8h5xo=; h=Content-Type:From:Mime-Version:Subject:Date:Message-Id:References: Cc:In-Reply-To:To; b=V8q64dlUoE0rR21SvtvyUNtXGjsTGuMLHppsKltv40ccW4t3M/09C6dxhitijs1JJGQf84k7OxA7rBpFlwPxDpJCxni4lJ1KSgD0MX+lU18Dc8vpeLLXtLTOWFEInJdIGLcNJrF+qEBbiyIxHDXnGWUOUsDv26qKBLCe0p2BXWk= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=Tat+hmEl; arc=none smtp.client-ip=91.218.175.186 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="Tat+hmEl" Content-Type: text/plain; charset=utf-8 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1776089811; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5Kzkhn3LFWO3+QSE/fQj/hH3qJPFehnDOa4OjhTu82w=; b=Tat+hmEl4LkYQIHCK1EKoRWrUNb7DtBDa9KMzVcByj1HdimbvKHuSURjjlcj/voNVEuEJu 17NRVMv2anbHCIs+VBh9J0mCkoI+wiTqWkYKSat0cu6ljmDFzRXwLJdVwBPFJUkJ1yTj83 hm/N7D4bqI1p4sipyGvCWKX/p/tPtMI= Content-Transfer-Encoding: quoted-printable X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Muchun Song Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 (1.0) Subject: Re: [PATCH 01/49] mm/sparse: fix vmemmap accounting imbalance on memory hotplug error Date: Mon, 13 Apr 2026 22:16:20 +0800 Message-Id: <48CF5603-D8E1-4C86-8554-E8BA03D195FB@linux.dev> References: <9642F4D9-8F1B-47A4-90BE-C72BB8DE9A11@linux.dev> Cc: Muchun Song , Andrew Morton , David Hildenbrand , Oscar Salvador , Michael Ellerman , Madhavan Srinivasan , Lorenzo Stoakes , Liam R Howlett , Vlastimil Babka , Suren Baghdasaryan , Michal Hocko , Nicholas Piggin , Christophe Leroy , aneesh.kumar@linux.ibm.com, joao.m.martins@oracle.com, linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org In-Reply-To: <9642F4D9-8F1B-47A4-90BE-C72BB8DE9A11@linux.dev> To: Mike Rapoport X-Migadu-Flow: FLOW_OUT > On Apr 13, 2026, at 22:05, Muchun Song wrote: >=20 > =EF=BB=BF >=20 >> On Apr 13, 2026, at 21:36, Mike Rapoport wrote: >>=20 >> =EF=BB=BFHi Muchun, >=20 > Hi, >=20 >>=20 >> On Mon, Apr 13, 2026 at 08:47:45PM +0800, Muchun Song wrote: >>>>>> On Apr 13, 2026, at 20:05, Mike Rapoport wrote: >>>>>>>>>>=20 >>>>>>>>>> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c >>>>>>>>>> index 6eadb9d116e4..ee27d0c0efe2 100644 >>>>>>>>>> --- a/mm/sparse-vmemmap.c >>>>>>>>>> +++ b/mm/sparse-vmemmap.c >>>>>>>>>> @@ -822,11 +822,11 @@ static struct page * __meminit section_acti= vate(int nid, unsigned long pfn, >>>>>>>>>> return pfn_to_page(pfn); >>>>>>>>>>=20 >>>>>>>>>> memmap =3D populate_section_memmap(pfn, nr_pages, nid, altmap, pg= map); >>>>>>>>>> + memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), P= AGE_SIZE)); >>>>>>>>>=20 >>>>>>>>> This logically belongs to success path in populate_section_memmap(= ). If we >>>>>>>>> update the counter there, we won't need to temporarily increase it= at all. >>>>>>>>=20 >>>>>>>> Not strictly related to this patchset, but it seems, we can have a s= ingle >>>>>>>> memmap_boot_pages_add() in memmap_alloc() rather than to update the= counter >>>>>>>> in memmap_alloc() callers. >>>>>>>=20 >>>>>>> It will indeed become simpler and is a good cleanup direction, but t= here >>>>>>> is a slight change in semantics: the page tables used for vmemmap pa= ge >>>>>>> mapping will also be counted in memmap_boot_pages_add(). This might n= ot >>>>>>> be an issue (after all, the size of the page tables is very small co= mpared >>>>>>> to struct pages, right?). >>>>>>>=20 >>>>>>> Additionally, I still lean toward making no changes to this patch, b= ecause >>>>>>> this is a pure bugfix patch =E2=80=94 of course, it is meant to faci= litate backporting >>>>>>> for those who need it. The cleanup would involve many more changes, s= o I >>>>>>> prefer to do that in a separate patch. What do you think? >>>>>=20 >>>>> For this patch and easy backporting I still think that cleaner to have= the >>>>> counter incremented in populate_section_memmap() rather immediately af= ter >>>>> it. >>>=20 >>> Hi Mike, >>>=20 >>> Alright, let=E2=80=99s revisit your solution. After we=E2=80=99ve moved t= he counter into the >>> populate_section_memmap(), we still need to increase the counter tempora= rily >>> (but in populate_section_memmap()) even if we fail to populate. That=E2=80= =99s >>> because section_deactivate() reduces the counter without exception, isn=E2= =80=99t it? >>> Just want to make sure we are on the same page on the meaning of =E2=80=9C= temporarily >>> increase=E2=80=9D. Maybe you do not mean =E2=80=9Ctemporarily=E2=80=9D i= n this case. >>=20 >> I suggest to increase the counter only if we succeeded to populate: >>=20 >> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c >> index 6eadb9d116e4..247fd54f1003 100644 >> --- a/mm/sparse-vmemmap.c >> +++ b/mm/sparse-vmemmap.c >> @@ -656,7 +656,13 @@ static struct page * __meminit populate_section_memm= ap(unsigned long pfn, >> unsigned long nr_pages, int nid, struct vmem_altmap *altmap, >> struct dev_pagemap *pgmap) >> { >> - return __populate_section_memmap(pfn, nr_pages, nid, altmap, pgmap);= >> + struct page *p =3D __populate_section_memmap(pfn, nr_pages, nid, alt= map, >> + pgmap); >> + >> + if (p) >> + memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PA= GE_SIZE)); >=20 > We don=E2=80=99t increase the counter on failure, then >=20 >> + >> + return p; >> } >>=20 >> static void depopulate_section_memmap(unsigned long pfn, unsigned long nr= _pages, >> @@ -826,7 +832,6 @@ static struct page * __meminit section_activate(int n= id, unsigned long pfn, >> section_deactivate(pfn, nr_pages, altmap); >=20 > here section_deactivate() is called, which decrease the counter unconditio= nally, > the issue still exists. We didn't fix anything. >=20 > Thanks. >=20 >> return ERR_PTR(-ENOMEM); >> } >> - memmap_pages_add(DIV_ROUND_UP(nr_pages * sizeof(struct page), PAGE_S= IZE)); >>=20 >> return memmap; >> } >>=20 >>=20 >> Then we'll better follow "all or nothing" principle and won't have >> exceptional cases in section_deactivate(). To follow "all or nothing" principle here, I think we should not call section_deactivate() to do the cleanup in section_activate(). After all, if section_activate() didn't succeed, how can we use section_deactivate() to release the resources? What do you think? Thanks. >>=20 >>> Thanks, >>> Muchun. >>=20 >> -- >> Sincerely yours, >> Mike.