All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kiryl Shutsemau <kas@kernel.org>
To: Muchun Song <muchun.song@linux.dev>,
	 Matthew Wilcox <willy@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>,
	Usama Arif <usamaarif642@gmail.com>,
	 Frank van der Linden <fvdl@google.com>,
	Oscar Salvador <osalvador@suse.de>,
	 Mike Rapoport <rppt@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	 Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Zi Yan <ziy@nvidia.com>, Baoquan He <bhe@redhat.com>,
	 Michal Hocko <mhocko@suse.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	 Jonathan Corbet <corbet@lwn.net>,
	kernel-team@meta.com, linux-mm@kvack.org,
	 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Subject: Re: [PATCHv4 07/14] mm/sparse: Check memmap alignment for compound_info_has_mask()
Date: Fri, 23 Jan 2026 12:07:47 +0000	[thread overview]
Message-ID: <aXNkSpO0WsvPz6h4@thinkstation> (raw)
In-Reply-To: <BFECEA67-90EC-444A-87A3-A220168B1B67@linux.dev>

On Fri, Jan 23, 2026 at 10:32:28AM +0800, Muchun Song wrote:
> 
> 
> > On Jan 23, 2026, at 01:59, Kiryl Shutsemau <kas@kernel.org> wrote:
> > 
> > On Thu, Jan 22, 2026 at 10:02:24PM +0800, Muchun Song wrote:
> >> 
> >> 
> >>> On Jan 22, 2026, at 20:43, Kiryl Shutsemau <kas@kernel.org> wrote:
> >>> 
> >>> On Thu, Jan 22, 2026 at 07:42:47PM +0800, Muchun Song wrote:
> >>>> 
> >>>> 
> >>>>>> On Jan 22, 2026, at 19:33, Muchun Song <muchun.song@linux.dev> wrote:
> >>>>> 
> >>>>> 
> >>>>> 
> >>>>>> On Jan 22, 2026, at 19:28, Kiryl Shutsemau <kas@kernel.org> wrote:
> >>>>>> 
> >>>>>> On Thu, Jan 22, 2026 at 11:10:26AM +0800, Muchun Song wrote:
> >>>>>>> 
> >>>>>>> 
> >>>>>>>> On Jan 22, 2026, at 00:22, Kiryl Shutsemau <kas@kernel.org> wrote:
> >>>>>>>> 
> >>>>>>>> If page->compound_info encodes a mask, it is expected that memmap to be
> >>>>>>>> naturally aligned to the maximum folio size.
> >>>>>>>> 
> >>>>>>>> Add a warning if it is not.
> >>>>>>>> 
> >>>>>>>> A warning is sufficient as MAX_FOLIO_ORDER is very rarely used, so the
> >>>>>>>> kernel is still likely to be functional if this strict check fails.
> >>>>>>>> 
> >>>>>>>> Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> >>>>>>>> ---
> >>>>>>>> include/linux/mmzone.h | 1 +
> >>>>>>>> mm/sparse.c            | 5 +++++
> >>>>>>>> 2 files changed, 6 insertions(+)
> >>>>>>>> 
> >>>>>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >>>>>>>> index 390ce11b3765..7e4f69b9d760 100644
> >>>>>>>> --- a/include/linux/mmzone.h
> >>>>>>>> +++ b/include/linux/mmzone.h
> >>>>>>>> @@ -91,6 +91,7 @@
> >>>>>>>> #endif
> >>>>>>>> 
> >>>>>>>> #define MAX_FOLIO_NR_PAGES (1UL << MAX_FOLIO_ORDER)
> >>>>>>>> +#define MAX_FOLIO_SIZE (PAGE_SIZE << MAX_FOLIO_ORDER)
> >>>>>>>> 
> >>>>>>>> enum migratetype {
> >>>>>>>> MIGRATE_UNMOVABLE,
> >>>>>>>> diff --git a/mm/sparse.c b/mm/sparse.c
> >>>>>>>> index 17c50a6415c2..5f41a3edcc24 100644
> >>>>>>>> --- a/mm/sparse.c
> >>>>>>>> +++ b/mm/sparse.c
> >>>>>>>> @@ -600,6 +600,11 @@ void __init sparse_init(void)
> >>>>>>>> BUILD_BUG_ON(!is_power_of_2(sizeof(struct mem_section)));
> >>>>>>>> memblocks_present();
> >>>>>>>> 
> >>>>>>>> +  if (compound_info_has_mask()) {
> >>>>>>>> +  WARN_ON(!IS_ALIGNED((unsigned long)pfn_to_page(0),
> >>>>>>>> +     MAX_FOLIO_SIZE / sizeof(struct page)));
> >>>>>>> 
> >>>>>>> I still have concerns about this. If certain architectures or configurations,
> >>>>>>> especially when KASLR is enabled, do not meet the requirements during the
> >>>>>>> boot stage, only specific folios larger than a certain size might end up with
> >>>>>>> incorrect struct page entries as the system runs. How can we detect issues
> >>>>>>> arising from either updating the struct page or making incorrect logical
> >>>>>>> judgments based on information retrieved from the struct page?
> >>>>>>> 
> >>>>>>> After all, when we see this warning, we don't know when or if a problem will
> >>>>>>> occur in the future. It's like a time bomb in the system, isn't it? Therefore,
> >>>>>>> I would like to add a warning check to the memory allocation place, for
> >>>>>>> example:
> >>>>>>> 
> >>>>>>> WARN_ON(!IS_ALIGNED((unsigned long)&folio->page, folio_size / sizeof(struct page)));
> >>>>>> 
> >>>>>> I don't think it is needed. Any compound page usage would trigger the
> >>>>>> problem. It should happen pretty early.
> >>>>> 
> >>>>> Why would you think it would be discovered early? If the alignment of struct page
> >>>>> can only meet the needs of 4M pages (i.e., the largest pages that buddy can
> >>>>> allocate), how can you be sure that there will be a similar path using CMA
> >>>>> early on if the system allocates through CMA in the future (after all, CMA
> >>>>> is used much less than buddy)?
> >>> 
> >>> True.
> >>> 
> >>>> Suppose we are more aggressive. If the alignment requirement of struct page
> >>>> cannot meet the needs of 2GB pages (which is an uncommon memory allocation
> >>>> requirement), then users might not care about such a warning message after
> >>>> the system boots. And if there is no allocation of pages greater than or
> >>>> equal to 2GB for a period of time in the future, the system will have no
> >>>> problems. But once some path allocates pages greater than or equal to 2GB,
> >>>> the system will go into chaos. And by that time, the system log may no
> >>>> longer have this warning message. Is that not the case?
> >>> 
> >>> It is.
> >>> 
> >>> I expect the warning to be reported early if we have configurations that
> >>> do not satisfy the alignment requirement even in absence of the crash.
> >> 
> >> If you’re saying the issue was only caught during
> >> testing, keep in mind that with KASLR enabled the
> >> warning is triggered at run-time; you can’t assume it
> >> will never appear in production.
> > 
> > Let's look at what architectures actually do with vmemmap.
> > 
> > On 64-bit machines, we want vmemmap to be naturally aligned to
> > accommodate 16GiB pages.
> > 
> > Assuming 64 byte struct page, it requires 256 MiB alignment for 4K
> > PAGE_SIZE, 64MiB for 16K PAGE_SIZE and 16MiB for 64K PAGE_SIZE.
> > 
> > Only 3 architectures support HVO (select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP):
> > loongarch, riscv and x86. We should make the feature conditional to HVO
> > to limit exposure.
> > 
> > I am not sure why arm64 is not in the club.
> > 
> > x86 aligns vmemmap to 1G - OK.
> > 
> > loongarch aligns vmemmap to PMD_SIZE does not fit us with 4K and 16K
> > PAGE_SIZE. It should be easily fixable. No KALSR.
> > 
> > riscv aligns vmemmap to section size (128MiB) which is not enough.
> > Again, easily fixable.
> > 
> 
> OK. After we fix all problems, I think changing WARN_ON to BUG_ON is fine.

David was explicitly against the BUG_ON() in the patch.

David, is it still the case?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

  reply	other threads:[~2026-01-23 12:07 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-21 16:22 [PATCHv4 00/14] mm: Eliminate fake head pages from vmemmap optimization Kiryl Shutsemau
2026-01-21 16:22 ` [PATCHv4 01/14] mm: Move MAX_FOLIO_ORDER definition to mmzone.h Kiryl Shutsemau
2026-01-21 16:29   ` Zi Yan
2026-01-22  2:24   ` Muchun Song
2026-01-21 16:22 ` [PATCHv4 02/14] mm: Change the interface of prep_compound_tail() Kiryl Shutsemau
2026-01-21 16:32   ` Zi Yan
2026-01-21 16:22 ` [PATCHv4 03/14] mm: Rename the 'compound_head' field in the 'struct page' to 'compound_info' Kiryl Shutsemau
2026-01-21 16:34   ` Zi Yan
2026-01-21 16:22 ` [PATCHv4 04/14] mm: Move set/clear_compound_head() next to compound_head() Kiryl Shutsemau
2026-01-21 16:35   ` Zi Yan
2026-01-21 16:22 ` [PATCHv4 05/14] mm: Rework compound_head() for power-of-2 sizeof(struct page) Kiryl Shutsemau
2026-01-21 17:12   ` Zi Yan
2026-01-22 11:29     ` Kiryl Shutsemau
2026-01-22 11:52       ` Muchun Song
2026-01-21 16:22 ` [PATCHv4 06/14] mm: Make page_zonenum() use head page Kiryl Shutsemau
2026-01-21 16:28   ` Zi Yan
2026-01-21 16:22 ` [PATCHv4 07/14] mm/sparse: Check memmap alignment for compound_info_has_mask() Kiryl Shutsemau
2026-01-21 17:58   ` Zi Yan
2026-01-22 11:22     ` Kiryl Shutsemau
2026-01-22  3:10   ` Muchun Song
2026-01-22 11:28     ` Kiryl Shutsemau
2026-01-22 11:33       ` Muchun Song
2026-01-22 11:42         ` Muchun Song
2026-01-22 12:42           ` Kiryl Shutsemau
2026-01-22 14:02             ` Muchun Song
2026-01-22 17:59               ` Kiryl Shutsemau
2026-01-23  2:32                 ` Muchun Song
2026-01-23 12:07                   ` Kiryl Shutsemau [this message]
2026-01-21 16:22 ` [PATCHv4 08/14] mm/hugetlb: Refactor code around vmemmap_walk Kiryl Shutsemau
2026-01-22  8:08   ` Muchun Song
2026-01-21 16:22 ` [PATCHv4 09/14] mm/hugetlb: Remove fake head pages Kiryl Shutsemau
2026-01-22  7:00   ` Muchun Song
2026-01-27 14:51     ` Kiryl Shutsemau
2026-01-28  2:43       ` Muchun Song
2026-01-28 12:59         ` Kiryl Shutsemau
2026-01-29  3:04           ` Muchun Song
2026-01-21 16:22 ` [PATCHv4 10/14] mm: Drop fake head checks Kiryl Shutsemau
2026-01-21 18:16   ` Zi Yan
2026-01-22 12:48     ` Kiryl Shutsemau
2026-01-21 16:22 ` [PATCHv4 11/14] hugetlb: Remove VMEMMAP_SYNCHRONIZE_RCU Kiryl Shutsemau
2026-01-21 16:22 ` [PATCHv4 12/14] mm/hugetlb: Remove hugetlb_optimize_vmemmap_key static key Kiryl Shutsemau
2026-01-21 16:22 ` [PATCHv4 13/14] mm: Remove the branch from compound_head() Kiryl Shutsemau
2026-01-21 18:21   ` Zi Yan
2026-01-21 16:22 ` [PATCHv4 14/14] hugetlb: Update vmemmap_dedup.rst Kiryl Shutsemau
2026-01-22  2:22   ` Muchun Song
2026-01-21 18:44 ` [PATCHv4 00/14] mm: Eliminate fake head pages from vmemmap optimization Vlastimil Babka
2026-01-21 20:31   ` Zi Yan
2026-01-22 11:21     ` Kiryl Shutsemau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aXNkSpO0WsvPz6h4@thinkstation \
    --to=kas@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=corbet@lwn.net \
    --cc=david@kernel.org \
    --cc=fvdl@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=kernel-team@meta.com \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=osalvador@suse.de \
    --cc=rppt@kernel.org \
    --cc=usamaarif642@gmail.com \
    --cc=vbabka@suse.cz \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.