Linux-mm Archive on lore.kernel.org
From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Lorenzo Stoakes <ljs@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Muchun Song <muchun.song@linux.dev>,
	Oscar Salvador <osalvador@suse.de>, Jann Horn <jannh@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH mm-hotfixes] mm/hugetlb: avoid false positive lockdep assertion
Date: Wed, 13 May 2026 12:15:23 +0200
Message-ID: <291cd4df-7c52-426e-a8cc-b0cf77654c52@kernel.org>
In-Reply-To: <20260513085658.45264-1-ljs@kernel.org>

On 5/13/26 10:56, Lorenzo Stoakes wrote:
> Commit 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not
> before") changed the locking model around hugetlbfs PMD unsharing on VMA
> split, but did not update the function which asserts the locks,
> hugetlb_vma_assert_locked().
> 
> This function asserts that either the hugetlb VMA lock is held (if a shared
> mapping) or that the reservation map lock is held (if private).
> 
> If you get an unfortunate race between something which results in one of
> these locks being released and a hugetlb split and you have CONFIG_LOCKDEP

"hugetlb split": I assume you used that terminology because of
hugetlb_split(), which is all just rather nasty #justhugetlbthings

"hugetlb VMA split" is probably easier to understand.

> enabled, you can therefore see a false positive assertion arise when there
> is in fact no issue.
> 
> Since this change introduced a new take_locks parameter to
> hugetlb_unshare_pmds(), which, when set to false, indicates that locking is
> sufficient, simply pass this to the unsharing logic and predicate the
> lock assertions on this.
> 
> This is safe, as we already asserted the file rmap lock and the VMA write
> lock prior to this (implying exclusive mmap write lock), so we cannot be
> raced by either rmap or page fault page table walkers which the asserted
> locks are intended to protect against (we don't mind GUP-fast).
> 
> Separate out huge_pmd_unshare() into __huge_pmd_unshare() to add a
> check_locks parameter, and update hugetlb_unshare_pmds() to pass this
> parameter to it.
> 
> This leaves all other callers of huge_pmd_unshare() still correctly
> asserting the locks.
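
For readers following along, the wrapper-plus-flag pattern described above
can be modelled in userspace roughly as below. This is only a sketch, not
the kernel patch: struct model_vma, model_assert_locked(), do_model_unshare(),
model_unshare(), model_unshare_on_split() and model_demo() are all
hypothetical names, and a plain bool stands in for the hugetlb VMA lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a hugetlb VMA; not a kernel structure. */
struct model_vma {
	bool locked;      /* models "the hugetlb VMA lock is held" */
	int shared_refs;  /* > 1 models a shared PMD page table */
};

static int assertions_run;

/* Models the lock assertion; counts how often it is evaluated. */
static void model_assert_locked(const struct model_vma *vma)
{
	assertions_run++;
	assert(vma->locked);
}

/* Inner helper: the assertion is predicated on check_locks. */
static int do_model_unshare(struct model_vma *vma, bool check_locks)
{
	if (check_locks)
		model_assert_locked(vma);

	if (vma->shared_refs <= 1)
		return 0;	/* nothing shared, nothing to do */

	vma->shared_refs--;
	return 1;
}

/* Ordinary callers keep asserting the lock unconditionally. */
static int model_unshare(struct model_vma *vma)
{
	return do_model_unshare(vma, true);
}

/*
 * Split path: when take_locks is false the caller's other locks are
 * assumed sufficient, so the lock is neither taken nor asserted.
 */
static void model_unshare_on_split(struct model_vma *vma, bool take_locks)
{
	if (take_locks)
		vma->locked = true;

	do_model_unshare(vma, take_locks);

	if (take_locks)
		vma->locked = false;
}

/* Exercises both paths; returns how many assertions were evaluated. */
static int model_demo(void)
{
	struct model_vma vma = { .locked = false, .shared_refs = 2 };

	assertions_run = 0;

	/* Split without the lock: no assertion is evaluated. */
	model_unshare_on_split(&vma, false);

	/* Ordinary path with the lock held: the assertion runs and passes. */
	vma.shared_refs = 2;
	vma.locked = true;
	model_unshare(&vma);
	vma.locked = false;

	return assertions_run;
}
```

In this model only the ordinary path evaluates the assertion, which mirrors
the stated intent that all other callers still assert the locks.
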
> 
> The below reproducer will trigger the assert in a kernel with
> CONFIG_LOCKDEP enabled by racing process teardown (which will release the
> hugetlb lock) against a hugetlb split.
> 
> void execute_one(void)
> {
> 	void *ptr;
> 	pid_t pid;
> 
> 	/*
> 	 * Create a hugetlb mapping spanning a PUD entry.
> 	 *
> 	 * We force the hugetlb page allocation with populate and
> 	 * noreserve.
> 	 *
> 	 * |---------------------|
> 	 * |                     |
> 	 * |---------------------|
> 	 * 0                 PUD boundary
> 	 */
> 	ptr = mmap(0, PUD_SIZE, PROT_READ | PROT_WRITE,
> 		   MAP_FIXED | MAP_SHARED | MAP_ANON |
> 		   MAP_NORESERVE | MAP_HUGETLB | MAP_POPULATE,
> 		   -1, 0);
> 	if (ptr == MAP_FAILED) {
> 		perror("mmap");
> 		exit(EXIT_FAILURE);
> 	}
> 
> 	/*
> 	 * Fork but with a bogus stack pointer so we try to execute code in
> 	 * a non-VM_EXEC VMA, causing segfault + teardown via exit_mmap().
> 	 *
> 	 * The clone will cause PMD page table sharing between the
> 	 * processes first via:
> 	 * copy_process() -> ... -> huge_pte_alloc() -> huge_pmd_share()
> 	 *
> 	 * Then tear down and release the hugetlb 'VMA' lock via:
> 	 * exit_mmap() -> ... -> vma_close() -> hugetlb_vma_lock_free()
> 	 */
> 	pid = syscall(__NR_clone, 0, 2 * PMD_SIZE, 0, 0, 0);
> 	if (pid < 0) {
> 		perror("clone");
> 		exit(EXIT_FAILURE);
> 	} if (pid == 0) {
> 		/* Pop stack... */
> 		return;
> 	}
> 
> 	/*
> 	 * We are the parent process.
> 	 *
> 	 * Race the child process's teardown with a PMD unshare.
> 	 *
> 	 * We do this by triggering:
> 	 *
> 	 * __split_vma() -> hugetlb_split() -> hugetlb_unshare_pmds()
> 	 *
> 	 * Which, importantly, doesn't hold the hugetlb VMA lock (nor can
> 	 * it), meaning we assert in hugetlb_vma_assert_locked().
> 	 *
> 	 *            .
> 	 * |----------.----------|
> 	 * |          .          |
> 	 * |----------.----------|
> 	 * 0          .     PUD boundary
> 	 */
> 	mmap(0, PUD_SIZE / 2, PROT_READ | PROT_WRITE,
> 	     MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
> }
> 
> int main(void)
> {
> 	int i;
> 
> 	/* Kick off fork children. */
> 	for (i = 0; i < NUM_FORKS; i++) {
> 		pid_t pid = fork();
> 
> 		if (pid < 0) {
> 			perror("fork");
> 			exit(EXIT_FAILURE);
> 		}
> 
> 		/* Fork children do their work and exit. */
> 		if (!pid) {
> 			int j;
> 
> 			for (j = 0; j < NUM_ITERS; j++)
> 				execute_one();
> 			return EXIT_SUCCESS;
> 		}
> 	}
> 
> 	/* If we succeeded, wait on children. */
> 	for (i = 0; i < NUM_FORKS; i++)
> 		wait(NULL);
> 
> 	return EXIT_SUCCESS;
> }
> 
> Fixes: 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not before")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
> ---

LGTM, all rather nasty with "take_locks" parameters ...

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David

