From: "Oscar Salvador (SUSE)" <osalvador@kernel.org>
To: Karsten Desler <kdesler@soohrt.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [REGRESSION] x86/hugetlb: AMD F15h VA alignment offset breaks MAP_HUGETLB alignment
Date: Wed, 27 May 2026 17:53:06 +0200
Message-ID: <ahcTYufV7P5Cmotq@localhost.localdomain>
In-Reply-To: <20260527143643.GO31091@soohrt.org>

On Wed, May 27, 2026 at 04:36:43PM +0200, Karsten Desler wrote:
> Hi,
> 
> I found a reproducible hugetlb regression on an AMD Family 15h system.
> 
> On some boots, mmap(MAP_HUGETLB) returns a virtual address that is not aligned
> to the hugepage size. The mapping is nevertheless installed as a hugetlb VMA.
> When the process exits, the kernel later BUGs in __unmap_hugepage_range().
> 
> 6.18.33 x86_64, AMD Opteron 6238, 2M hugepages

Thanks, Karsten, for reporting this.

Ooops, that would be me.

> Example bad mapping captured from /proc/$pid/maps:
> 
>   7fc67f604000-7fc67f804000 rw-p 00000000 00:0f 12340 /anon_hugepage (deleted)
> 
> The address has offset 0x4000 within a 2 MiB hugepage.
> 
> smaps confirms it is really hugetlb:
> 
>   KernelPageSize:     2048 kB
>   MMUPageSize:        2048 kB
>   Private_Hugetlb:    2048 kB
>   VmFlags: rd wr mr mw me de ht
> 
> Minimal reproducer:
> 
>   echo 1000 > /proc/sys/vm/nr_hugepages
> 
>   mmap(NULL, 1229824, PROT_READ|PROT_WRITE,
>        MAP_PRIVATE|MAP_ANONYMOUS|MAP_POPULATE|MAP_HUGETLB, -1, 0)
> 
> On bad boots this returns e.g.:
> 
>   mmap returned 0x7fc67f604000 aligned=no offset=16384
> 
> and exiting the process triggers:
> 
>   Kernel BUG at __unmap_hugepage_range+0x5ef/0x640
>   RIP: __unmap_hugepage_range+0x5ef/0x640
>   Fixing recursive fault but reboot is needed!
> 
> The following is AI work, sorry if that's total BS but at the very least,
> I can reproduce the kernel BUG and booting with
>   align_va_addr=off
> works around the issue.
> 
> This is boot-dependent. Some boots work, some fail. The reason appears
> to be the per-boot AMD F15h VA alignment offset.

I have to confess that I completely overlooked that scenario, so let me
apologize.
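
For anyone who wants to check their own machine, the mmap() call quoted
above could be wrapped into a small C helper roughly like the one below.
This is only a sketch: the names check_hugetlb_alignment() and
hugepage_offset() are made up for illustration, and the 2 MiB hugepage
size is assumed from the report rather than queried at runtime.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

/* Assumption: 2 MiB hugepages, as in the report. */
#define HPAGE_SIZE (2UL * 1024 * 1024)

static_assert((HPAGE_SIZE & (HPAGE_SIZE - 1)) == 0,
              "hugepage size must be a power of two");

/* Offset of addr within its hugepage; 0 means hugepage-aligned. */
static unsigned long hugepage_offset(const void *addr)
{
    return (unsigned long)((uintptr_t)addr & (HPAGE_SIZE - 1));
}

/*
 * Hypothetical helper: issue the mmap() from the report and check the
 * returned address. Returns 0 if aligned, 1 if misaligned, -1 if mmap()
 * failed (for example because no hugepages are reserved).
 */
static int check_hugetlb_alignment(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE | MAP_HUGETLB,
                   -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return -1;
    }

    unsigned long off = hugepage_offset(p);
    printf("mmap returned %p aligned=%s offset=%lu\n",
           p, off ? "no" : "yes", off);

    /* The mapping is left in place and torn down at process exit. */
    return off != 0;
}
```

Calling check_hugetlb_alignment(1229824) from main() after reserving
hugepages via /proc/sys/vm/nr_hugepages mirrors the report. Note that,
per the report, tearing down a misaligned mapping on an affected kernel
triggers the BUG, so only run this on a machine you can reboot.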

> The old x86 hugetlb path in arch/x86/mm/hugetlbpage.c only set:
> 
>   info.align_mask = PAGE_MASK & ~huge_page_mask(h);
> 
> It did not add the AMD F15h align offset.
> 
> After the v6.13-rc1 hugetlb mmap rework, hugetlb mappings go through
> arch_get_unmapped_area*(), and x86 currently does:
> 
>   if (filp) {
>           info.align_mask = get_align_mask(filp);
>           info.align_offset += get_align_bits();
>   }

Ok, I see.
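
To spell out the arithmetic, here is a simplified userspace model of how
an align_mask/align_offset pair shapes the address picked by the gap
search. It is a sketch, not the kernel code: align_gap() and
hugetlb_align_mask() are illustrative names, only the bottom-up form of
the rounding is modelled, and 4 KiB base pages with 2 MiB hugepages are
assumed.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SIZE 0x1000ULL     /* assumed 4 KiB base pages */
#define HPAGE_SIZE     0x200000ULL   /* assumed 2 MiB hugepages */

/* Userspace equivalent of PAGE_MASK & ~huge_page_mask(h): bits 12..20. */
static uint64_t hugetlb_align_mask(void)
{
    return (HPAGE_SIZE - 1) & ~(BASE_PAGE_SIZE - 1);
}

/*
 * Round a page-aligned candidate address up to the next address whose
 * bits under align_mask match those of align_offset.
 */
static uint64_t align_gap(uint64_t gap, uint64_t align_mask,
                          uint64_t align_offset)
{
    assert((gap & (BASE_PAGE_SIZE - 1)) == 0);
    return gap + ((align_offset - gap) & align_mask);
}
```

With align_offset == 0 the result is always a multiple of 2 MiB. With a
nonzero offset such as the 0x4000 from the report, every result carries
that offset inside its hugepage, which matches the 7fc67f604000 mapping
shown in /proc/$pid/maps.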

> 
> For hugetlb, get_align_mask(filp) correctly returns the hugepage alignment
> mask, but get_align_bits() can still return the AMD F15h per-boot offset,
> e.g. 0x4000. That produces a non-hugepage-aligned hugetlb VMA.
> 
> Likely introduced by the v6.13-rc1 series:
> 
>   1317a5e7f7b1 arch/x86: teach arch_get_unmapped_area_vmflags to handle hugetlb mappings
>   7bd3f1e1a9ae mm: make hugetlb mappings go through mm_get_unmapped_area_vmflags
>   cc92882ee218 mm: drop hugetlb_get_unmapped_area{_*} functions

Yes, that was part of a refactoring I did some time ago.

I will fix it up later today/early tomorrow.

Would you be available for a quick test once I have the patch?

 

-- 
Oscar Salvador
SUSE Labs

