From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1067D280CCF; Tue, 27 May 2025 17:56:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748368584; cv=none; b=mVqvrY3bhIPsD3hCJpO12ePeVMlJ9FFUUSxbsGuVRoOuDidRS+zmsJPdbkuPYQC8F9zZ9ObskOitcoNbz4/8EXvRAv5YV+KZkb1VxGhGajmQvvB6VAuycDATSpExIoDlNO2DnP3GSLaE/8tdcmvvULHa0byhMShuhGiZtuv8QRM= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748368584; c=relaxed/simple; bh=23F5uHw7YfyR8VgiheFDar6XN5/ZpMMDV2Nk0cSKXCw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Uyh5esWG+336Czqd2gcELOLEMITC7V4IU8hlHpQ/22SOKcV+fBCMg0eohsddj4CMqlXQuXA/e0GTmHNlXgFkjLLCQNFuha5br1p37xTqx8OFacZ3/D8MfnwRFZ1i4GCEWV9r9iSi8iJM5sV3pC4Yru6H640uydgGQ1O4XZTV+CQ= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=SkRWa9lb; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="SkRWa9lb" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9573EC4CEEA; Tue, 27 May 2025 17:56:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1748368583; bh=23F5uHw7YfyR8VgiheFDar6XN5/ZpMMDV2Nk0cSKXCw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SkRWa9lbfQtdFAzl10tU4DI93TuJWq5C39pFM9dbWBU12Juq2VWrDXNVs3ke4OZ/r cvyo2PDQakNygsi/xOqWwFuWfPWlmG+bjDUKtqMWycDl8o4Uc9FrFuEgJzpzXTYk1k uoEiORvcPG/Kqb/1NLIyB5LEx+XYc8Crik4RVS1Q= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Ge Yang , Muchun Song , Oscar Salvador , Baolin Wang , Barry Song <21cnbao@gmail.com>, David Hildenbrand , Andrew Morton Subject: [PATCH 6.14 755/783] mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios Date: Tue, 27 May 2025 18:29:12 +0200 Message-ID: <20250527162543.871882181@linuxfoundation.org> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250527162513.035720581@linuxfoundation.org> References: <20250527162513.035720581@linuxfoundation.org> User-Agent: quilt/0.68 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 6.14-stable review patch. If anyone has any objections, please let me know. ------------------ From: Ge Yang commit 113ed54ad276c352ee5ce109bdcf0df118a43bda upstream. A kernel crash was observed when replacing free hugetlb folios: BUG: kernel NULL pointer dereference, address: 0000000000000028 PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 28 UID: 0 PID: 29639 Comm: test_cma.sh Tainted 6.15.0-rc6-zp #41 PREEMPT(voluntary) RIP: 0010:alloc_and_dissolve_hugetlb_folio+0x1d/0x1f0 RSP: 0018:ffffc9000b30fa90 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000342cca RCX: ffffea0043000000 RDX: ffffc9000b30fb08 RSI: ffffea0043000000 RDI: 0000000000000000 RBP: ffffc9000b30fb20 R08: 0000000000001000 R09: 0000000000000000 R10: ffff88886f92eb00 R11: 0000000000000000 R12: ffffea0043000000 R13: 0000000000000000 R14: 00000000010c0200 R15: 0000000000000004 FS: 00007fcda5f14740(0000) GS:ffff8888ec1d8000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000028 CR3: 0000000391402000 CR4: 0000000000350ef0 Call Trace: replace_free_hugepage_folios+0xb6/0x100 alloc_contig_range_noprof+0x18a/0x590 ? srso_return_thunk+0x5/0x5f ? down_read+0x12/0xa0 ? srso_return_thunk+0x5/0x5f cma_range_alloc.constprop.0+0x131/0x290 __cma_alloc+0xcf/0x2c0 cma_alloc_write+0x43/0xb0 simple_attr_write_xsigned.constprop.0.isra.0+0xb2/0x110 debugfs_attr_write+0x46/0x70 full_proxy_write+0x62/0xa0 vfs_write+0xf8/0x420 ? srso_return_thunk+0x5/0x5f ? filp_flush+0x86/0xa0 ? srso_return_thunk+0x5/0x5f ? filp_close+0x1f/0x30 ? srso_return_thunk+0x5/0x5f ? do_dup2+0xaf/0x160 ? srso_return_thunk+0x5/0x5f ksys_write+0x65/0xe0 do_syscall_64+0x64/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e There is a potential race between __update_and_free_hugetlb_folio() and replace_free_hugepage_folios(): CPU1 CPU2 __update_and_free_hugetlb_folio replace_free_hugepage_folios folio_test_hugetlb(folio) -- It's still hugetlb folio. __folio_clear_hugetlb(folio) hugetlb_free_folio(folio) h = folio_hstate(folio) -- Here, h is NULL pointer When the above race condition occurs, folio_hstate(folio) returns NULL, and subsequent access to this NULL pointer will cause the system to crash. To resolve this issue, execute folio_hstate(folio) under the protection of the hugetlb_lock lock, ensuring that folio_hstate(folio) does not return NULL. Link: https://lkml.kernel.org/r/1747884137-26685-1-git-send-email-yangge1116@126.com Fixes: 04f13d241b8b ("mm: replace free hugepage folios after migration") Signed-off-by: Ge Yang Reviewed-by: Muchun Song Reviewed-by: Oscar Salvador Cc: Baolin Wang Cc: Barry Song <21cnbao@gmail.com> Cc: David Hildenbrand Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman --- mm/hugetlb.c | 8 ++++++++ 1 file changed, 8 insertions(+) --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2926,12 +2926,20 @@ int replace_free_hugepage_folios(unsigne while (start_pfn < end_pfn) { folio = pfn_folio(start_pfn); + + /* + * The folio might have been dissolved from under our feet, so make sure + * to carefully check the state under the lock. + */ + spin_lock_irq(&hugetlb_lock); if (folio_test_hugetlb(folio)) { h = folio_hstate(folio); } else { + spin_unlock_irq(&hugetlb_lock); start_pfn++; continue; } + spin_unlock_irq(&hugetlb_lock); if (!folio_ref_count(folio)) { ret = alloc_and_dissolve_hugetlb_folio(h, folio,