From: Nam Cao <namcao@linutronix.de>
To: "Björn Töpel" <bjorn@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>,
Andreas Dilger <adilger@dilger.ca>,
Al Viro <viro@zeniv.linux.org.uk>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Jan Kara <jack@suse.cz>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
linux-riscv@lists.infradead.org, Theodore Ts'o <tytso@mit.edu>,
Ext4 Developers List <linux-ext4@vger.kernel.org>,
Conor Dooley <conor@kernel.org>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Anders Roxell <anders.roxell@linaro.org>
Subject: Re: riscv32 EXT4 splat, 6.8 regression?
Date: Tue, 16 Apr 2024 08:44:17 +0200 [thread overview]
Message-ID: <20240416084417.569356d3@namcao> (raw)
In-Reply-To: <87le5e393x.fsf@all.your.base.are.belong.to.us>
On 2024-04-15 Björn Töpel wrote:
> Thanks for getting back! Spent some more time one it today.
>
> It seems that the buddy allocator *can* return a page with a VA that can
> wrap (0xfffff000 -- pointed out by Nam and myself).
>
> Further, it seems like riscv32 indeed inserts a page like that to the
> buddy allocator, when the memblock is free'd:
>
> | [<c024961c>] __free_one_page+0x2a4/0x3ea
> | [<c024a448>] __free_pages_ok+0x158/0x3cc
> | [<c024b1a4>] __free_pages_core+0xe8/0x12c
> | [<c0c1435a>] memblock_free_pages+0x1a/0x22
> | [<c0c17676>] memblock_free_all+0x1ee/0x278
> | [<c0c050b0>] mem_init+0x10/0xa4
> | [<c0c1447c>] mm_core_init+0x11a/0x2da
> | [<c0c00bb6>] start_kernel+0x3c4/0x6de
>
> Here, a page with VA 0xfffff000 is a added to the freelist. We were just
> lucky (unlucky?) that page was used for the page cache.
I just educated myself about memory mapping last night, so the below
may be complete nonsense. Take it with a grain of salt.
In riscv's setup_bootmem(), we have this line:
max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
I think this is the root cause: max_low_pfn indicates the last page
to be mapped. Problem is: nothing prevents PFN_DOWN(phys_ram_end) from
getting mapped to the last page (0xfffff000). If max_low_pfn is mapped
to the last page, we get the reported problem.
There seems to be some code to make sure the last page is not used
(the call to memblock_set_current_limit() right above this line). It is
unclear to me why this still lets the problem slip through.
The fix is simple: never let max_low_pfn gets mapped to the last page.
The below patch fixes the problem for me. But I am not entirely sure if
this is the correct fix, further investigation needed.
Best regards,
Nam
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index fa34cf55037b..17cab0a52726 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -251,7 +251,8 @@ static void __init setup_bootmem(void)
}
min_low_pfn = PFN_UP(phys_ram_base);
- max_low_pfn = max_pfn = PFN_DOWN(phys_ram_end);
+ max_low_pfn = PFN_DOWN(memblock_get_current_limit());
+ max_pfn = PFN_DOWN(phys_ram_end);
high_memory = (void *)(__va(PFN_PHYS(max_low_pfn)));
dma32_phys_limit = min(4UL * SZ_1G, (unsigned long)PFN_PHYS(max_low_pfn));
next prev parent reply other threads:[~2024-04-16 6:44 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-12 14:57 riscv32 EXT4 splat, 6.8 regression? Björn Töpel
2024-04-12 15:43 ` Theodore Ts'o
2024-04-12 16:59 ` Björn Töpel
2024-04-13 4:35 ` Theodore Ts'o
2024-04-13 10:01 ` Conor Dooley
2024-04-13 14:43 ` Nam Cao
2024-04-14 0:24 ` Theodore Ts'o
2024-04-14 1:46 ` Andreas Dilger
2024-04-14 2:04 ` Theodore Ts'o
2024-04-14 2:18 ` Al Viro
2024-04-14 2:15 ` Al Viro
2024-04-14 4:16 ` Andreas Dilger
2024-04-14 14:08 ` Björn Töpel
2024-04-15 13:04 ` Christian Brauner
2024-04-15 16:04 ` Björn Töpel
2024-04-16 6:44 ` Nam Cao [this message]
2024-04-16 8:25 ` Christian Brauner
2024-04-16 11:02 ` Björn Töpel
2024-04-16 14:24 ` Mike Rapoport
2024-04-16 15:17 ` Nam Cao
2024-04-16 15:30 ` Nam Cao
2024-04-16 15:56 ` Björn Töpel
2024-04-16 16:19 ` Nam Cao
2024-04-16 16:31 ` Mike Rapoport
2024-04-16 17:00 ` Matthew Wilcox
2024-04-16 18:34 ` Mike Rapoport
2024-04-16 22:36 ` Nam Cao
2024-04-17 15:31 ` Theodore Ts'o
2024-04-17 18:06 ` Nam Cao
2024-04-17 19:34 ` Mike Rapoport
2024-04-17 22:09 ` Andreas Dilger
2024-04-18 9:17 ` Nam Cao
2024-04-16 18:05 ` Björn Töpel
2024-04-16 18:09 ` Nam Cao
2024-04-16 16:19 ` Mike Rapoport
2024-04-16 16:31 ` Matthew Wilcox
2024-04-16 18:18 ` Mike Rapoport
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240416084417.569356d3@namcao \
--to=namcao@linutronix.de \
--cc=adilger@dilger.ca \
--cc=anders.roxell@linaro.org \
--cc=bjorn@kernel.org \
--cc=brauner@kernel.org \
--cc=conor@kernel.org \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).