Date: Fri, 10 Apr 2026 12:18:11 +0100
From: "Russell King (Oracle)"
To: Brian Ruley
Cc: Will Deacon, Steve Capper, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/arm: pgtable: remove young bit check for pte_valid_user
References: <20260409125446.981747-1-brian.ruley@gehealthcare.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 10, 2026 at 02:01:41PM +0300, Brian Ruley wrote:
> Thank you for the clarification, this is very educational for me.
> I understand your scepticism, and I can't explain what's going on based
> on what you replied. However, I do honestly believe there is a problem
> here.
> I'll share the exact testing details and the instrumentation
> we added that convinced us to reach out at the end. One idea we also
> had was that cache aliasing could be happening here.
>
> To clarify any potential misunderstanding, we've observed the
> following:
>
> - Sporadic SIGILL and SIGSEGV under memory pressure
> - Scales with core count, i.e., quad core more likely to reproduce
>   than dual core. We haven't observed an issue on single core.
> - Coredumps show valid instructions at the faulting PC.
>   The CPU executed something different from what's in memory.
>   This pointed us to stale I-cache.
> - Instrumentation indicates a correlation.
>   A per-CPU ring buffer tracking exec page migrations was dumped on
>   SIGILL. The faulting PC matched a recently migrated page.
> - We started seeing this after upgrading 6.1->6.12->6.18. We bisected
>   two commits which had an impact, but we weren't convinced that
>   either was the root cause: 5dfab109d5193e6c224d96cabf90e9cc2c039884
>   and 6faea3422e3b4e8de44a55aa3e6e843320da66d2.
> - Failed processes include systemd, tar, bash, ...
> - Debug options, e.g., page poisoning, seem to hide the bug
>
> > So you're saying that stress-ng doesn't reproduce this bug but
> > triggers the OOM-killer... confused.
>
> Apologies for the confusion. I meant that with `stress-ng' we created
> the memory pressure and OOM might have played a role in exposing the
> "bug" as we (at the time) believed that anything that would trigger
> memory free/reclaims and page migration was the key. One note I'll add
> is that in our test we invoked stress-ng for 2 minutes (--timeout 2m)
> and after each we would reboot the device. We had observed that reboots
> seemed to have a discernible effect on the occurrence in earlier testing
> so we kept that in. I'm beginning to doubt if it had an effect now,
> and unfortunately it's all anecdotal.
>
> One more thing, even if you don't accept the patch, is this patch
> harmful in any way or is it just sub-optimal?
>
> I'll send the instrumentation patch as a follow-up, might be there's a
> flaw in it.

I'll try it - I have Cortex A9 systems (some of which I rely on...)

Please can you also try to track the history of what happens for the
PTEs corresponding to the old and new PFN?

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!