From: Muchun Song
Subject: Re: [PATCHv4 07/14] mm/sparse: Check memmap alignment for compound_info_has_mask()
Date: Thu, 22 Jan 2026 22:02:24 +0800
Message-Id: <554FD2AA-16B5-498B-9F79-296798194DF7@linux.dev>
To: Kiryl Shutsemau
Cc: Andrew Morton, David Hildenbrand, Matthew Wilcox, Usama Arif,
 Frank van der Linden, Oscar Salvador, Mike Rapoport, Vlastimil Babka,
 Lorenzo Stoakes, Zi Yan, Baoquan He, Michal Hocko, Johannes Weiner,
 Jonathan Corbet, kernel-team@meta.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jan 22, 2026, at 20:43, Kiryl Shutsemau wrote:
>
> On Thu, Jan 22, 2026 at 07:42:47PM +0800, Muchun Song wrote:
>>
>>
>>>> On Jan 22, 2026, at 19:33, Muchun Song wrote:
>>>
>>>
>>>
>>>> On Jan 22, 2026, at 19:28, Kiryl Shutsemau wrote:
>>>>
>>>> On Thu, Jan 22, 2026 at 11:10:26AM +0800, Muchun Song wrote:
>>>>>
>>>>>
>>>>>> On Jan 22, 2026, at 00:22, Kiryl Shutsemau wrote:
>>>>>>
>>>>>> If page->compound_info encodes a mask, it is expected that memmap to be
>>>>>> naturally aligned to the maximum folio size.
>>>>>>
>>>>>> Add a warning if it is not.
>>>>>>
>>>>>> A warning is sufficient as MAX_FOLIO_ORDER is very rarely used, so the
>>>>>> kernel is still likely to be functional if this strict check fails.
>>>>>>
>>>>>> Signed-off-by: Kiryl Shutsemau
>>>>>> ---
>>>>>>  include/linux/mmzone.h | 1 +
>>>>>>  mm/sparse.c | 5 +++++
>>>>>>  2 files changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>>>> index 390ce11b3765..7e4f69b9d760 100644
>>>>>> --- a/include/linux/mmzone.h
>>>>>> +++ b/include/linux/mmzone.h
>>>>>> @@ -91,6 +91,7 @@
>>>>>>  #endif
>>>>>>
>>>>>>  #define MAX_FOLIO_NR_PAGES (1UL << MAX_FOLIO_ORDER)
>>>>>> +#define MAX_FOLIO_SIZE (PAGE_SIZE << MAX_FOLIO_ORDER)
>>>>>>
>>>>>>  enum migratetype {
>>>>>>      MIGRATE_UNMOVABLE,
>>>>>> diff --git a/mm/sparse.c b/mm/sparse.c
>>>>>> index 17c50a6415c2..5f41a3edcc24 100644
>>>>>> --- a/mm/sparse.c
>>>>>> +++ b/mm/sparse.c
>>>>>> @@ -600,6 +600,11 @@ void __init sparse_init(void)
>>>>>>      BUILD_BUG_ON(!is_power_of_2(sizeof(struct mem_section)));
>>>>>>      memblocks_present();
>>>>>>
>>>>>> +    if (compound_info_has_mask()) {
>>>>>> +        WARN_ON(!IS_ALIGNED((unsigned long)pfn_to_page(0),
>>>>>> +                MAX_FOLIO_SIZE / sizeof(struct page)));
>>>>>
>>>>> I still have concerns about this. If certain architectures or configurations,
>>>>> especially when KASLR is enabled, do not meet the requirements during the
>>>>> boot stage, only specific folios larger than a certain size might end up with
>>>>> incorrect struct page entries as the system runs. How can we detect issues
>>>>> arising from either updating the struct page or making incorrect logical
>>>>> judgments based on information retrieved from the struct page?
>>>>>
>>>>> After all, when we see this warning, we don't know when or if a problem will
>>>>> occur in the future. It's like a time bomb in the system, isn't it? Therefore,
>>>>> I would like to add a warning check to the memory allocation place, for
>>>>> example:
>>>>>
>>>>> WARN_ON(!IS_ALIGNED((unsigned long)&folio->page, folio_size / sizeof(struct page)));
>>>>
>>>> I don't think it is needed. Any compound page usage would trigger the
>>>> problem. It should happen pretty early.
>>>
>>> Why would you think it would be discovered early? If the alignment of struct page
>>> can only meet the needs of 4M pages (i.e., the largest pages that buddy can
>>> allocate), how can you be sure that there will be a similar path using CMA
>>> early on if the system allocates through CMA in the future (after all, CMA
>>> is used much less than buddy)?
>
> True.
>
>> Suppose we are more aggressive. If the alignment requirement of struct page
>> cannot meet the needs of 2GB pages (which is an uncommon memory allocation
>> requirement), then users might not care about such a warning message after
>> the system boots. And if there is no allocation of pages greater than or
>> equal to 2GB for a period of time in the future, the system will have no
>> problems. But once some path allocates pages greater than or equal to 2GB,
>> the system will go into chaos. And by that time, the system log may no
>> longer have this warning message. Is that not the case?
>
> It is.
>
> I expect the warning to be reported early if we have configurations that
> do not satisfy the alignment requirement even in absence of the crash.

If you’re saying the issue would only be caught during testing, keep in mind
that with KASLR enabled the warning is triggered at run time, so you can’t
assume it will never appear in production. And I don't think the administrator
will notice a warning in practice.

>
> Adding a check to the allocation path if way too high price for a
> theoretical problem.

It is not theoretical. It’s a real problem. We should make the system work
properly in production. And I don’t think the price is too high: as I said,
I would add the check only to CMA, not to the buddy allocator, and CMA
allocation is not supposed to be on a hot path. (Please carefully read the
reply plan I gave at the very beginning.)

>
> --
> Kiryl Shutsemau / Kirill A. Shutemov
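
A minimal userspace sketch of the per-allocation alignment check discussed in this thread, for illustration only. It assumes 4 KiB base pages and a 64-byte struct page (hypothetical values; the real ones depend on the kernel configuration), and the `sketch_` names are made up here and are not kernel APIs. It treats "naturally aligned" as alignment to nr_pages * sizeof(struct page), which under these assumed sizes gives the same value as the folio_size / sizeof(struct page) expression quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration only: bytes per struct page. */
#define SKETCH_STRUCT_PAGE_SIZE 64u

/* Bytes of memmap covered by one folio of the given order (a power of two). */
static inline uint64_t sketch_memmap_span(unsigned int order)
{
	return ((uint64_t)1 << order) * SKETCH_STRUCT_PAGE_SIZE;
}

/*
 * True if the address of the folio's head struct page is aligned to the
 * memmap span of the folio. memmap_base stands in for pfn_to_page(0).
 */
static inline bool sketch_head_is_aligned(uint64_t memmap_base,
					  uint64_t head_pfn,
					  unsigned int order)
{
	uint64_t head_addr = memmap_base + head_pfn * SKETCH_STRUCT_PAGE_SIZE;

	return (head_addr & (sketch_memmap_span(order) - 1)) == 0;
}

/* Small self-check showing the case described in the thread. */
static inline void sketch_self_check(void)
{
	/* Base aligned to 64 KiB only: enough for order 10 (4 MiB folios). */
	assert(sketch_head_is_aligned(0x10000, 1024, 10));
	/* Not enough for order 19 (2 GiB folios), which need 32 MiB. */
	assert(!sketch_head_is_aligned(0x10000, 0, 19));
	/* A 32 MiB-aligned base satisfies order 19. */
	assert(sketch_head_is_aligned(0x2000000, (uint64_t)1 << 19, 19));
}
```

Under these assumptions, a memmap base aligned to only 64 KiB passes the check for every order-10 folio and fails it for order-19 folios, so a check at boot and a check at allocation time would fire at very different moments.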