Date: Wed, 13 May 2026 13:30:35 +0200
From: Oscar Salvador
To: Lorenzo Stoakes
Cc: Andrew Morton, Muchun Song, David Hildenbrand, Jann Horn,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH mm-hotfixes] mm/hugetlb: avoid false positive lockdep assertion
References: <20260513085658.45264-1-ljs@kernel.org>
In-Reply-To: <20260513085658.45264-1-ljs@kernel.org>

On Wed, May 13, 2026 at 09:56:58AM +0100, Lorenzo Stoakes wrote:
> Commit 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not
> before") changed the locking model around hugetlbfs PMD unsharing on VMA
> split, but did not update the function which asserts the locks,
> hugetlb_vma_assert_locked().
>
> This function asserts that either the hugetlb VMA lock is held (if a shared
> mapping) or that the reservation map lock is held (if private).
>
> If you get an unfortunate race between something which results in one of
> these locks being released and a hugetlb split and you have CONFIG_LOCKDEP
> enabled, you can therefore see a false positive assertion arise when there
> is in fact no issue.
>
> Since this change introduced a new take_locks parameter to
> hugetlb_unshare_pmds(), which, when set to false, indicates that locking is
> sufficient, simply pass this to the unsharing logic and predicate the
> lock assertions on this.
>
> This is safe, as we already asserted the file rmap lock and the VMA write
> lock prior to this (implying exclusive mmap write lock), so we cannot be
> raced by either rmap or page fault page table walkers which the asserted
> locks are intended to protect against (we don't mind GUP-fast).
>
> Separate out huge_pmd_unshare() into __huge_pmd_unshare() to add a
> check_locks parameter, and update hugetlb_unshare_pmds() to pass this
> parameter to it.
>
> This leaves all other callers of huge_pmd_unshare() still correctly
> asserting the locks.
>
> The below reproducer will trigger the assert in a kernel with
> CONFIG_LOCKDEP enabled by racing process teardown (which will release the
> hugetlb lock) against a hugetlb split.
>
> void execute_one(void)
> {
> 	void *ptr;
> 	pid_t pid;
>
> 	/*
> 	 * Create a hugetlb mapping spanning a PUD entry.
> 	 *
> 	 * We force the hugetlb page allocation with populate and
> 	 * noreserve.
> 	 *
> 	 * |---------------------|
> 	 * |                     |
> 	 * |---------------------|
> 	 * 0                PUD boundary
> 	 */
> 	ptr = mmap(0, PUD_SIZE, PROT_READ | PROT_WRITE,
> 		   MAP_FIXED | MAP_SHARED | MAP_ANON |
> 		   MAP_NORESERVE | MAP_HUGETLB | MAP_POPULATE,
> 		   -1, 0);
> 	if (ptr == MAP_FAILED) {
> 		perror("mmap");
> 		exit(EXIT_FAILURE);
> 	}
>
> 	/*
> 	 * Fork but with a bogus stack pointer so we try to execute code in
> 	 * a non-VM_EXEC VMA, causing segfault + teardown via exit_mmap().
> 	 *
> 	 * The clone will cause PMD page table sharing between the
> 	 * processes first via:
> 	 * copy_process() -> ... -> huge_pte_alloc() -> huge_pmd_share()
> 	 *
> 	 * Then tear down and release the hugetlb 'VMA' lock via:
> 	 * exit_mmap() -> ... -> vma_close() -> hugetlb_vma_lock_free()
> 	 */
> 	pid = syscall(__NR_clone, 0, 2 * PMD_SIZE, 0, 0, 0);
> 	if (pid < 0) {
> 		perror("clone");
> 		exit(EXIT_FAILURE);
> 	} if (pid == 0) {
> 		/* Pop stack... */
> 		return;
> 	}
>
> 	/*
> 	 * We are the parent process.
> 	 *
> 	 * Race the child process's teardown with a PMD unshare.
> 	 *
> 	 * We do this by triggering:
> 	 *
> 	 * __split_vma() -> hugetlb_split() -> hugetlb_unshare_pmds()
> 	 *
> 	 * Which, importantly, doesn't hold the hugetlb VMA lock (nor can
> 	 * it), meaning we assert in hugetlb_vma_assert_locked().
> 	 *
> 	 *            .
> 	 * |----------.----------|
> 	 * |          .          |
> 	 * |----------.----------|
> 	 * 0          .     PUD boundary
> 	 */
> 	mmap(0, PUD_SIZE / 2, PROT_READ | PROT_WRITE,
> 	     MAP_FIXED | MAP_ANON | MAP_PRIVATE, -1, 0);
> }
>
> int main(void)
> {
> 	int i;
>
> 	/* Kick off fork children. */
> 	for (i = 0; i < NUM_FORKS; i++) {
> 		pid_t pid = fork();
>
> 		if (pid < 0) {
> 			perror("fork");
> 			exit(EXIT_FAILURE);
> 		}
>
> 		/* Fork children do their work and exit. */
> 		if (!pid) {
> 			int j;
>
> 			for (j = 0; j < NUM_ITERS; j++)
> 				execute_one();
> 			return EXIT_SUCCESS;
> 		}
> 	}
>
> 	/* If we succeeded, wait on children. */
> 	for (i = 0; i < NUM_FORKS; i++)
> 		wait(NULL);
>
> 	return EXIT_SUCCESS;
> }
>
> Fixes: 081056dc00a2 ("mm/hugetlb: unshare page tables during VMA split, not before")
> Cc:
> Signed-off-by: Lorenzo Stoakes

I had to re-read the flow a few times because it is getting a bit confusing
but here we are :-)

Acked-by: Oscar Salvador

Thanks!

--
Oscar Salvador
SUSE Labs