From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E905F4418FF; Tue, 16 Jun 2026 15:26:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781623585; cv=none; b=BfMXlomPJW5zk86AfPmQhfxHv7oZNS8jRCnAlBawHAN76kaqFIO42BugIDKg3hELiERQfeIqksd+PTUzP+lFV/ftcCfjOkk/qFrXgGx7lUYokrGvglPMJc9Yzh82kLhR5vv5ujOPi/1T9FEVtvFg3maBwjJmBI5Rbee75rxFGv4= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1781623585; c=relaxed/simple; bh=rUlRbMnj8tR0NHwypHC20ujjF2LyD/12WEVkzQEhr+Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ou8rGl8UR1oxQRf3C0szof5PRl4dONF/33lTkTNsnppvkrW8KXrPobILsMXT0DlZ8cr47NfDrWOPj8qTy2zS4p+4vuTdVcgWHKA7RgyrYoR7liI2leFjGbJcvYjfBHAsqQpXyN8Xbza8ohXkHBNkqoqHRD8/iHKxby0VGzsiBoA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b=aPp84kHn; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linuxfoundation.org header.i=@linuxfoundation.org header.b="aPp84kHn" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2F4401F00A3D; Tue, 16 Jun 2026 15:26:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linuxfoundation.org; s=korg; t=1781623583; bh=31kMuXgR7jSMgny7Q854u3gLgpxULUjK4FYXnbD8x3M=; h=From:To:Cc:Subject:Date:In-Reply-To:References; b=aPp84kHnbX18Tc4k6eFMC1oSF/Au7PHGgk3YrPsYnfa/VUw8xIXvm2R/PxbiC/bWq XWVe6VK3DpFzBee8s98xQEbnHfjjE9OfIcr0BFFlaWUF0WVr8BF9mlTIaCQ7AHQwV8 TSLgWLHhrDKbX6a4yUOn3R+LEpjD6fsh2KmbaKDk= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Wupeng Ma , "Oscar Salvador (SUSE)" , Muchun Song , Kefeng Wang , Miaohe Lin , David Hildenbrand , Liam Howlett , Lorenzo Stoakes , Michal Hocko , Mike Rapoport , Naoya Horiguchi , Suren Baghdasaryan , Vlastimil Babka , Andrew Morton Subject: [PATCH 7.0 175/378] mm/memory-failure: fix hugetlb_lock AA deadlock in get_huge_page_for_hwpoison Date: Tue, 16 Jun 2026 20:26:46 +0530 Message-ID: <20260616145119.572036585@linuxfoundation.org> X-Mailer: git-send-email 2.54.0 In-Reply-To: <20260616145109.744539446@linuxfoundation.org> References: <20260616145109.744539446@linuxfoundation.org> User-Agent: quilt/0.69 X-stable: review X-Patchwork-Hint: ignore Precedence: bulk X-Mailing-List: patches@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 7.0-stable review patch. If anyone has any objections, please let me know. ------------------ From: Wupeng Ma commit 3c2d42b8ee345b17a4ba56b0f6492d1ff4c1178e upstream. Two concurrent madvise(MADV_HWPOISON) calls on the same hugetlb page can trigger a recursive spinlock self-deadlock (AA deadlock) on hugetlb_lock when racing with a concurrent unmap: thread#0 thread#1 -------- -------- madvise(folio, MADV_HWPOISON) -> poisons the folio successfully madvise(folio, MADV_HWPOISON) unmap(folio) try_memory_failure_hugetlb get_huge_page_for_hwpoison spin_lock_irq(&hugetlb_lock) <- held __get_huge_page_for_hwpoison hugetlb_update_hwpoison() -> MF_HUGETLB_FOLIO_PRE_POISONED goto out: folio_put() refcount: 1 -> 0 free_huge_folio() spin_lock_irqsave(&hugetlb_lock) -> AA DEADLOCK! The out: path in __get_huge_page_for_hwpoison() calls folio_put() to drop the GUP reference while the hugetlb_lock is still held by the hugetlb.c wrapper get_huge_page_for_hwpoison(). If concurrent unmap has released the page table mapping reference, folio_put() drops the folio refcount to zero, triggering free_huge_folio() which attempts to re-acquire the non-recursive hugetlb_lock. Fix this by moving hugetlb_lock acquisition from the hugetlb.c wrapper into get_huge_page_for_hwpoison(). Place spin_unlock_irq() before the folio_put() at the out: label so the folio is always released outside the lock. [akpm@linux-foundation.org: fix race, rename label per Miaohe] Link: https://sashiko.dev/#/patchset/20260522010305.4099834-1-mawupeng1@huawei.com Link: https://lore.kernel.org/f39f405e-4b4b-8f79-70fe-a2b5b62114eb@huawei.com Link: https://lore.kernel.org/20260522010305.4099834-1-mawupeng1@huawei.com Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()") Signed-off-by: Wupeng Ma Acked-by: Oscar Salvador (SUSE) Acked-by: Muchun Song Reviewed-by: Kefeng Wang Acked-by: Miaohe Lin Cc: David Hildenbrand Cc: Liam Howlett Cc: Lorenzo Stoakes Cc: Michal Hocko Cc: Mike Rapoport Cc: Naoya Horiguchi Cc: Suren Baghdasaryan Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman --- include/linux/hugetlb.h | 8 -------- include/linux/mm.h | 8 -------- mm/hugetlb.c | 11 ----------- mm/memory-failure.c | 19 ++++++++++--------- 4 files changed, 10 insertions(+), 36 deletions(-) --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -153,8 +153,6 @@ long hugetlb_unreserve_pages(struct inod long freed); bool folio_isolate_hugetlb(struct folio *folio, struct list_head *list); int get_hwpoison_hugetlb_folio(struct folio *folio, bool *hugetlb, bool unpoison); -int get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared); void folio_putback_hugetlb(struct folio *folio); void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int reason); void hugetlb_fix_reserve_counts(struct inode *inode); @@ -420,12 +418,6 @@ static inline int get_hwpoison_hugetlb_f { return 0; } - -static inline int get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared) -{ - return 0; -} static inline void folio_putback_hugetlb(struct folio *folio) { --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4601,8 +4601,6 @@ extern int soft_offline_page(unsigned lo */ extern const struct attribute_group memory_failure_attr_group; extern void memory_failure_queue(unsigned long pfn, int flags); -extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared); void num_poisoned_pages_inc(unsigned long pfn); void num_poisoned_pages_sub(unsigned long pfn, long i); #else @@ -4610,12 +4608,6 @@ static inline void memory_failure_queue( { } -static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared) -{ - return 0; -} - static inline void num_poisoned_pages_inc(unsigned long pfn) { } --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -7160,17 +7160,6 @@ int get_hwpoison_hugetlb_folio(struct fo return ret; } -int get_huge_page_for_hwpoison(unsigned long pfn, int flags, - bool *migratable_cleared) -{ - int ret; - - spin_lock_irq(&hugetlb_lock); - ret = __get_huge_page_for_hwpoison(pfn, flags, migratable_cleared); - spin_unlock_irq(&hugetlb_lock); - return ret; -} - /** * folio_putback_hugetlb - unisolate a hugetlb folio * @folio: the isolated hugetlb folio --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -1966,20 +1966,19 @@ void folio_clear_hugetlb_hwpoison(struct folio_free_raw_hwp(folio, true); } -/* - * Called from hugetlb code with hugetlb_lock held. - */ -int __get_huge_page_for_hwpoison(unsigned long pfn, int flags, +static int get_huge_page_for_hwpoison(unsigned long pfn, int flags, bool *migratable_cleared) { struct page *page = pfn_to_page(pfn); - struct folio *folio = page_folio(page); + struct folio *folio; bool count_increased = false; int ret, rc; + spin_lock_irq(&hugetlb_lock); + folio = page_folio(page); if (!folio_test_hugetlb(folio)) { ret = MF_HUGETLB_NON_HUGEPAGE; - goto out; + goto out_unlock; } else if (flags & MF_COUNT_INCREASED) { ret = MF_HUGETLB_IN_USED; count_increased = true; @@ -1995,13 +1994,13 @@ int __get_huge_page_for_hwpoison(unsigne } else { ret = MF_HUGETLB_RETRY; if (!(flags & MF_NO_RETRY)) - goto out; + goto out_unlock; } rc = hugetlb_update_hwpoison(folio, page); if (rc >= MF_HUGETLB_FOLIO_PRE_POISONED) { ret = rc; - goto out; + goto out_unlock; } /* @@ -2013,8 +2012,10 @@ int __get_huge_page_for_hwpoison(unsigne *migratable_cleared = true; } + spin_unlock_irq(&hugetlb_lock); return ret; -out: +out_unlock: + spin_unlock_irq(&hugetlb_lock); if (count_increased) folio_put(folio); return ret;