From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69CCDC5475B for ; Fri, 1 Mar 2024 10:23:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:References:Cc:To:From: Subject:MIME-Version:Date:Message-ID:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=E+Q9boxXyji7AsDQRMQde/BIr8vesU1oNO0H/5s5SjM=; b=gWOhKtwkSG8VAh hZRryhb1pG8/kcOc9XydVvdSNOo0NV3ShNcPw1TeeLHH4y1RcrEEjhyBPcA3W2AOwDU2ZFrzu2HSN 3i2wD2UJCOeExklCXHehXvls61uVQP7g093AV+vx0Wblf7xHBRtvtBgUkJwq+KxulDhutyRO3hEGq V43r96n5FHY4uyQWX5eSbYPrc5GlaNuLM56BPU2lj/B50qTMzRKjPlgQYY4leLqIc2GnKpX1GIqYo qMffQN//mdehQrPUQ/GhlGwkUkpEwL0PtwxArANpLswkp/GT7OO7q7F1DR9DXCndz+uB55SqlCDGx DuxN3ol8KxHR0HGKB/MA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rg02a-0000000HLH0-3ud3; Fri, 01 Mar 2024 10:22:56 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1rg02X-0000000HLFt-45bV for linux-arm-kernel@lists.infradead.org; Fri, 01 Mar 2024 10:22:56 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7876B1FB; Fri, 1 Mar 2024 02:23:28 -0800 (PST) Received: from [10.57.68.58] (unknown [10.57.68.58]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 9BE513F762; Fri, 1 Mar 2024 02:22:47 -0800 (PST) Message-ID: <13b41ec7-8e98-45e4-a364-2b2e0ad3c2d6@arm.com> Date: Fri, 1 Mar 2024 10:22:47 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v8 36/43] arm64: mm: Add support for folding PUDs at runtime Content-Language: en-GB From: Ryan Roberts To: Ard Biesheuvel Cc: Nathan Chancellor , Matthew Wilcox , Ard Biesheuvel , linux-arm-kernel@lists.infradead.org, Catalin Marinas , Will Deacon , Marc Zyngier , Mark Rutland , Anshuman Khandual , Kees Cook , Aishwarya TCV , Mark Brown References: <20240214122845.2033971-45-ardb+git@google.com> <20240214122845.2033971-81-ardb+git@google.com> <20240229230159.GA1001503@dev-arch.thelio-3990X> <133a91b8-bc8b-42b3-afcd-595d7f45db84@arm.com> In-Reply-To: X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240301_022254_152746_5E8CB5ED X-CRM114-Status: GOOD ( 31.53 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 01/03/2024 09:47, Ryan Roberts wrote: > On 01/03/2024 09:37, Ard Biesheuvel wrote: >> On Fri, 1 Mar 2024 at 10:10, Ard Biesheuvel wrote: >>> >>> On Fri, 1 Mar 2024 at 09:54, Ryan Roberts wrote: >>>> >>>> + Matthew >>>> >>>> >>>> On 29/02/2024 23:01, Nathan Chancellor wrote: >>>>> On Thu, Feb 29, 2024 at 02:17:52PM +0000, Ryan Roberts wrote: >>>>>> Hi Ard, >>>>>> >>>>>> On 14/02/2024 12:29, Ard Biesheuvel wrote: >>>>>>> From: Ard Biesheuvel >>>>>>> >>>>>>> In order to support LPA2 on 16k pages in a way that permits non-LPA2 >>>>>>> systems to run the same kernel image, we have to be able to fall back to >>>>>>> at most 48 bits of virtual addressing. >>>>>>> >>>>>>> Falling back to 48 bits would result in a level 0 with only 2 entries, >>>>>>> which is suboptimal in terms of TLB utilization. So instead, let's fall >>>>>>> back to 47 bits in that case. This means we need to be able to fold PUDs >>>>>>> dynamically, similar to how we fold P4Ds for 48 bit virtual addressing >>>>>>> on LPA2 with 4k pages. >>>>>> >>>>>> I'm seeing a panic during boot in today's linux-next (20240229) and bisect seems pretty confident that this commit is the offender. That said, its the merge commit that shows up as the problem commit: >>>>>> >>>>>> 26843fe8fa72 Merge branch 'for-next/core' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux >>>>>> >>>>>> but when testing the arm64's for-next/core, the problem doesn't exist. So I rebased the branch into linux-next and bisected again. That time, it fingers this patch. So I guess there is some interaction between this and other changes in next? >>>>> <...> >>>>>> [ 0.161062] debug_vm_pgtable: [debug_vm_pgtable ]: Validating architecture page table helpers >>>>>> [ 0.161416] BUG: Bad page state in process swapper/0 pfn:18a65d >>>>>> [ 0.161634] page does not match folio >>>>>> [ 0.161753] page: refcount:0 mapcount:-512 mapping:0000000000000000 index:0x0 pfn:0x18a65d >>>>>> [ 0.162046] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008 >>>>>> [ 0.162332] Mem abort info: >>>>>> [ 0.162427] ESR = 0x0000000096000004 >>>>>> [ 0.162559] EC = 0x25: DABT (current EL), IL = 32 bits >>>>>> [ 0.162723] SET = 0, FnV = 0 >>>>>> [ 0.162827] EA = 0, S1PTW = 0 >>>>>> [ 0.162933] FSC = 0x04: level 0 translation fault >>>>>> [ 0.163089] Data abort info: >>>>>> [ 0.163189] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 >>>>>> [ 0.163370] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 >>>>>> [ 0.163539] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 >>>>>> [ 0.163719] [0000000000000008] user address but active_mm is swapper >>>>>> [ 0.163934] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP >>>>>> [ 0.164143] Modules linked in: >>>>>> [ 0.164251] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.8.0-rc6-00966-gde701dc1f7f8 #25 >>>>>> [ 0.164516] Hardware name: linux,dummy-virt (DT) >>>>>> [ 0.164704] pstate: 21400005 (nzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) >>>>>> [ 0.165052] pc : get_pfnblock_flags_mask+0x3c/0x68 >>>>>> [ 0.165281] lr : __dump_page+0x1a0/0x408 >>>>>> [ 0.165504] sp : ffff80008007b8f0 >>>>>> [ 0.165715] x29: ffff80008007b8f0 x28: 0000000000ffffc0 x27: 0000000000000000 >>>>>> [ 0.166047] x26: ffff80008007b950 x25: 0000000000000000 x24: 00000000fffffdff >>>>>> [ 0.166358] x23: ffffba8a417ba000 x22: 000000000018a65d x21: ffffba8a41601bf8 >>>>>> [ 0.166701] x20: ffff80008007b950 x19: ffff80008007b950 x18: 0000000000000006 >>>>>> [ 0.167036] x17: 78303a7865646e69 x16: 2030303030303030 x15: 0720072007200720 >>>>>> [ 0.167365] x14: 0720072007200720 x13: 0720072007200720 x12: 0720072007200720 >>>>>> [ 0.167693] x11: 0720072007200720 x10: ffffba8a4269c038 x9 : ffffba8a3fb0d0b8 >>>>>> [ 0.168017] x8 : 00000000ffffefff x7 : ffffba8a4269c038 x6 : 80000000fffff000 >>>>>> [ 0.168346] x5 : 000003fffff81de4 x4 : 0001fffffc0ef230 x3 : 0000000000000000 >>>>>> [ 0.168699] x2 : 0000000000000007 x1 : fffffe0779181ee5 x0 : 00000000001fffff >>>>>> [ 0.169041] Call trace: >>>>>> [ 0.169164] get_pfnblock_flags_mask+0x3c/0x68 >>>>>> [ 0.169413] dump_page+0x2c/0x70 >>>>>> [ 0.169565] bad_page+0x84/0x130 >>>>>> [ 0.169734] free_page_is_bad_report+0xa0/0xb8 >>>>>> [ 0.169958] free_unref_page_prepare+0x350/0x428 >>>>>> [ 0.170132] free_unref_page+0x50/0x1f0 >>>>>> [ 0.170278] __free_pages+0x11c/0x160 >>>>>> [ 0.170417] free_pages.part.0+0x6c/0x88 >>>>>> [ 0.170576] free_pages+0x1c/0x38 >>>>>> [ 0.170703] destroy_args+0x1c8/0x330 >>>>>> [ 0.170890] debug_vm_pgtable+0xae8/0x10f8 >>>>>> [ 0.171059] do_one_initcall+0x60/0x2c0 >>>>>> [ 0.171222] kernel_init_freeable+0x1ec/0x3d8 >>>>>> [ 0.171406] kernel_init+0x28/0x1f0 >>>>>> [ 0.171557] ret_from_fork+0x10/0x20 >>>>>> [ 0.171712] Code: d37b1884 f100007f 8b040064 9a831083 (f9400460) >>>>>> [ 0.171963] ---[ end trace 0000000000000000 ]--- >>>>>> [ 0.172156] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b >>>>>> [ 0.172383] SMP: stopping secondary CPUs >>>>>> [ 0.172649] Kernel Offset: 0x3a89bf800000 from 0xffff800080000000 >>>>>> [ 0.173923] PHYS_OFFSET: 0xfffff76180000000 >>>>>> [ 0.174585] CPU features: 0x0,00000000,2004454a,13867723 >>>>>> [ 0.175707] Memory Limit: none >>>>>> [ 0.176261] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- >>>>> >>>>> I did a second bisection by merging https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/stage1-lpa2 >>>>> on top of the merges before for-next/core and eventually landed on: >>>>> >>>>> d67cd9f23139ddfd7e0ef1e18474c16445188433 is the first bad commit >>>>> commit d67cd9f23139ddfd7e0ef1e18474c16445188433 >>>>> Author: Matthew Wilcox (Oracle) >>>>> Date: Tue Feb 27 19:23:31 2024 +0000 >>>>> >>>>> mm: add __dump_folio() >>>>> >>>>> Turn __dump_page() into a wrapper around __dump_folio(). Snapshot the >>>>> page & folio into a stack variable so we don't hit BUG_ON() if an >>>>> allocation is freed under us and what was a folio pointer becomes a >>>>> pointer to a tail page. >>>>> >>>>> Link: https://lkml.kernel.org/r/20240227192337.757313-5-willy@infradead.org >>>>> Signed-off-by: Matthew Wilcox (Oracle) >>>>> Signed-off-by: Andrew Morton >>>> >>>> So is that suggesting that Ard's patch is doing something that the old >>>> __dump_page() was ok with but the new version doesn't like? I don't think so, >>>> because the bad page detection has already happened before we get to __dump_page(). >>>> >>> >>> Yes, there are clearly two different issues at play here. The NULL >>> dereference might be an issue in the __dump_page() patch, but going >>> down that code path in the first place seems like it might be a >>> problem with mine. >>> >>> The mapcount of -512 looks interesting as well. >>> >>>> So I'm not really sure how this patch is involved? I'm hoping that either Ard or >>>> Matthew may be able to take a look and advise. >>>> >>> >> >> The crash does not reproduce for me, but the warning can be fixed by >> >> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h >> index aeba2cf15a25..78f30b782889 100644 >> --- a/arch/arm64/include/asm/pgalloc.h >> +++ b/arch/arm64/include/asm/pgalloc.h >> @@ -61,7 +61,7 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud) >> if (!pgtable_l4_enabled()) >> return; >> BUG_ON((unsigned long)pud & (PAGE_SIZE-1)); >> - free_page((unsigned long)pud); >> + __pud_free(mm, pud); >> } >> #else >> static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot) >> >> I'll send this out as a patch shortly. > > Great thanks! I'm pretty sure I've found the bug in Matthew's patch - it is > copying the struct page to the stack to avoid a potential race, but later some > macros are hiding a page_to_pfn(). Since the page's address is on the stack, I > reckon that's giving a bogus pfn. Just confirming and writing it up and will > send against the original patch shortly. > OK confirmed. When I fix Matthew's patch, the panic gets converted to a warning, and when I add the fix for your patch, there is no warning at all. See write up for Matthew's bug here: https://lore.kernel.org/linux-mm/6de0d026-cd8d-4152-97ca-d33d2a4e2e84@arm.com/ Thanks, Ryan _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel