From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Lorenzo Stoakes <ljs@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>, Jann Horn <jannh@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH mm-hotfixes] mm/hugetlb: avoid false positive lockdep assertion
Date: Wed, 13 May 2026 12:15:23 +0200
Message-ID: <291cd4df-7c52-426e-a8cc-b0cf77654c52@kernel.org>
In-Reply-To: <20260513085658.45264-1-ljs@kernel.org>

On 5/13/26 10:56, Lorenzo Stoakes wrote:
> Commit 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not
> before") changed the locking model around hugetlbfs PMD unsharing on VMA
> split, but did not update the function which asserts the locks,
> hugetlb_vma_assert_locked().
> 
> This function asserts that either the hugetlb VMA lock is held (if a shared
> mapping) or that the reservation map lock is held (if private).
> 
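
To make that rule concrete, a rough userspace sketch of it could look like the
following. This is not the kernel code: struct toy_vma, its fields and both
toy_* functions are made up for illustration, and assert() merely stands in
for a lockdep assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hugetlb VMA; not a kernel structure. */
struct toy_vma {
	bool shared;		/* shared mapping? otherwise private */
	bool vma_lock_held;	/* models the hugetlb VMA lock */
	bool resv_lock_held;	/* models the reservation map lock */
};

/* Returns true if the lock required by the rule above is held. */
static bool toy_vma_locks_ok(const struct toy_vma *vma)
{
	if (vma->shared)
		return vma->vma_lock_held;
	return vma->resv_lock_held;
}

/* Asserting variant: aborts if the required lock is not held. */
static void toy_vma_assert_locked(const struct toy_vma *vma)
{
	assert(toy_vma_locks_ok(vma));
}
```

In this model a shared mapping that holds only the reservation map lock fails
the check, as does a private mapping that holds only the VMA lock.
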
> If you get an unfortunate race between something which results in one of
> these locks being released and a hugetlb split and you have CONFIG_LOCKDEP

"hugetlb split": I assume you used that terminology because of hugetlb_split(),
which is all just rather nasty #justhugetlbthings

"hugetlb VMA split" is probably easier to understand.

> enabled, you can therefore see a false positive assertion arise when there
> is in fact no issue.
> 
> Since this change introduced a new take_locks parameter to
> hugetlb_unshare_pmds(), which, when set to false, indicates that locking is
> sufficient, simply pass this to the unsharing logic and predicate the
> lock assertions on this.
> 
> This is safe, as we already asserted the file rmap lock and the VMA write
> lock prior to this (implying exclusive mmap write lock), so we cannot be
> raced by either rmap or page fault page table walkers which the asserted
> locks are intended to protect against (we don't mind GUP-fast).
> 
> Separate out huge_pmd_unshare() into __huge_pmd_unshare() to add a
> check_locks parameter, and update hugetlb_unshare_pmds() to pass this
> parameter to it.
> 
> This leaves all other callers of huge_pmd_unshare() still correctly
> asserting the locks.
> 
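
For anyone following along, the shape of that split is roughly the following.
This is only a userspace toy under stated assumptions, not the actual patch:
every toy_* name and field is made up, assert() stands in for the lockdep
assertion, and the return values are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hugetlb VMA; not a kernel structure. */
struct toy_vma {
	bool lock_held;		/* models the lock the assertion checks */
	bool pmd_shared;	/* models a shared PMD page table */
};

/*
 * Inner helper: asserts the lock only when the caller asks for it.
 * Returns 1 if a shared PMD was unshared, 0 if there was nothing to do.
 */
static int toy_pmd_unshare_inner(struct toy_vma *vma, bool check_locks)
{
	if (check_locks)
		assert(vma->lock_held);
	if (!vma->pmd_shared)
		return 0;
	vma->pmd_shared = false;
	return 1;
}

/* All other callers go through this wrapper and keep the assertion. */
static int toy_pmd_unshare(struct toy_vma *vma)
{
	return toy_pmd_unshare_inner(vma, true);
}

/*
 * Models the VMA split path: with take_locks == false the caller already
 * holds stronger locks, so the lock is neither taken nor asserted.
 */
static void toy_unshare_pmds(struct toy_vma *vma, bool take_locks)
{
	if (take_locks)
		vma->lock_held = true;
	toy_pmd_unshare_inner(vma, take_locks);
	if (take_locks)
		vma->lock_held = false;
}
```

Calling toy_unshare_pmds(&vma, false) with the lock not held unshares without
tripping the assert, while toy_pmd_unshare() still insists on the lock.
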
> The below reproducer will trigger the assert in a kernel with
> CONFIG_LOCKDEP enabled by racing process teardown (which will release the
> hugetlb lock) against a hugetlb split.
> 
> void execute_one(void)
> {
> 	void *ptr;
> 	pid_t pid;
> 
> 	/*
> 	 * Create a hugetlb mapping spanning a PUD entry.
> 	 *
> 	 * We force the hugetlb page allocation with populate and
> 	 * noreserve.
> 	 *
> 	 * |---------------------|
> 	 * |                     |
> 	 * |---------------------|
> 	 * 0                 PUD boundary
> 	 */
> 	ptr = mmap(0, PUD_SIZE, PROT_READ | PROT_WRITE,
> 		   MAP_FIXED | MAP_SHARED | MAP_ANON |
> 		   MAP_NORESERVE | MAP_HUGETLB | MAP_POPULATE,
> 		   -1, 0);
> 	if (ptr == MAP_FAILED) {
> 		perror("mmap");
> 		exit(EXIT_FAILURE);
> 	}
> 
> 	/*
> 	 * Fork but with a bogus stack pointer so we try to execute code in
> 	 * a non-VM_EXEC VMA, causing segfault + teardown via exit_mmap().
> 	 *
> 	 * The clone will cause PMD page table sharing between the
> 	 * processes first via:
> 	 * copy_process() -> ... -> huge_pte_alloc() -> huge_pmd_share()
> 	 *
> 	 * Then tear down and release the hugetlb 'VMA' lock via:
> 	 * exit_mmap() -> ... -> vma_close() -> hugetlb_vma_lock_free()
> 	 */
> 	pid = syscall(__NR_clone, 0, 2 * PMD_SIZE, 0, 0, 0);
> 	if (pid < 0) {
> 		perror("clone");
> 		exit(EXIT_FAILURE);
> 	}
> 	if (pid == 0) {
> 		/* Pop stack... */
> 		return;
> 	}
> 
> 	/*
> 	 * We are the parent process.
> 	 *
> 	 * Race the child process's teardown with a PMD unshare.
> 	 *
> 	 * We do this by triggering:
> 	 *
> 	 * __split_vma() -> hugetlb_split() -> hugetlb_unshare_pmds()
> 	 *
> 	 * Which, importantly, doesn't hold the hugetlb VMA lock (nor can
> 	 * it), meaning we assert in hugetlb_vma_assert_locked().
> 	 *
> 	 *            .
> 	 * |----------.----------|
> 	 * |          .          |
> 	 * |----------.----------|
> 	 * 0          .     PUD boundary
> 	 */
> 	mmap(0, PUD_SIZE / 2, PROT_READ | PROT_WRITE,
> 	     MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
> }
> 
> int main(void)
> {
> 	int i;
> 
> 	/* Kick off fork children. */
> 	for (i = 0; i < NUM_FORKS; i++) {
> 		pid_t pid = fork();
> 
> 		if (pid < 0) {
> 			perror("fork");
> 			exit(EXIT_FAILURE);
> 		}
> 
> 		/* Fork children do their work and exit. */
> 		if (!pid) {
> 			int j;
> 
> 			for (j = 0; j < NUM_ITERS; j++)
> 				execute_one();
> 			return EXIT_SUCCESS;
> 		}
> 	}
> 
> 	/* If we succeeded, wait on children. */
> 	for (i = 0; i < NUM_FORKS; i++)
> 		wait(NULL);
> 
> 	return EXIT_SUCCESS;
> }
> 
> Fixes: 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not before")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
> ---

LGTM, all rather nasty with "take_locks" parameters ...

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David

Thread overview: 5+ messages
2026-05-13  8:56 [PATCH mm-hotfixes] mm/hugetlb: avoid false positive lockdep assertion Lorenzo Stoakes
2026-05-13 10:15 ` David Hildenbrand (Arm) [this message]
2026-05-13 11:02   ` Lorenzo Stoakes
2026-05-13 11:30 ` Oscar Salvador
2026-05-14  9:48 ` Lorenzo Stoakes