* + fix-page-corruption-caused-by-racy-check-in-__free_pages.patch added to mm-hotfixes-unstable branch
@ 2023-02-09 22:01 Andrew Morton
0 siblings, 0 replies; only message in thread
From: Andrew Morton @ 2023-02-09 22:01 UTC (permalink / raw)
To: mm-commits, willy, stable, david.chen, akpm
The patch titled
Subject: mm/page_alloc.c: fix page corruption caused by racy check in __free_pages
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
fix-page-corruption-caused-by-racy-check-in-__free_pages.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/fix-page-corruption-caused-by-racy-check-in-__free_pages.patch
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: David Chen <david.chen@nutanix.com>
Subject: mm/page_alloc.c: fix page corruption caused by racy check in __free_pages
Date: Thu, 9 Feb 2023 17:48:28 +0000
When we upgraded our kernel, we started seeing some page corruption like
the following consistently:
BUG: Bad page state in process ganesha.nfsd pfn:1304ca
page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 in=
dex:0x0 pfn:0x1304ca
flags: 0x17ffffc0000000()
raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000
raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000
page dumped because: nonzero mapcount
CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P B O =
5.10.158-1.nutanix.20221209.el7.x86_64 #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Referenc=
e Platform, BIOS 6.00 04/05/2016
Call Trace:
dump_stack+0x74/0x96
bad_page.cold+0x63/0x94
check_new_page_bad+0x6d/0x80
rmqueue+0x46e/0x970
get_page_from_freelist+0xcb/0x3f0
? _cond_resched+0x19/0x40
__alloc_pages_nodemask+0x164/0x300
alloc_pages_current+0x87/0xf0
skb_page_frag_refill+0x84/0x110
...
Sometimes, it would also show up as corruption in the free list pointer and
cause crashes.
After bisecting the issue, we found the issue started from e320d3012d25:
if (put_page_testzero(page))
free_the_page(page, order);
else if (!PageHead(page))
while (order-- > 0)
free_the_page(page + (1 << order), order);
So the problem is the check PageHead is racy because at this point we
already dropped our reference to the page. So even if we came in with
compound page, the page can already be freed and PageHead can return false
and we will end up freeing all the tail pages causing double free.
Link: https://lkml.kernel.org/r/Message-ID:
Fixes: e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages")
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
mm/page_alloc.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/mm/page_alloc.c~fix-page-corruption-caused-by-racy-check-in-__free_pages
+++ a/mm/page_alloc.c
@@ -5631,9 +5631,12 @@ EXPORT_SYMBOL(get_zeroed_page);
*/
void __free_pages(struct page *page, unsigned int order)
{
+ /* get PageHead before we drop reference */
+ int head =3D PageHead(page);
+
if (put_page_testzero(page))
free_the_page(page, order);
- else if (!PageHead(page))
+ else if (!head)
while (order-- > 0)
free_the_page(page + (1 << order), order);
}
_
Patches currently in -mm which might be from david.chen@nutanix.com are
fix-page-corruption-caused-by-racy-check-in-__free_pages.patch
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2023-02-09 22:01 UTC | newest]
Thread overview: (only message) (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-02-09 22:01 + fix-page-corruption-caused-by-racy-check-in-__free_pages.patch added to mm-hotfixes-unstable branch Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).