Date: Mon, 18 Apr 2022 10:16:58 +0100
Message-ID: <87y202agz9.wl-maz@kernel.org>
From: Marc Zyngier
To: Shanker Donthineni
Cc: Catalin Marinas, Will Deacon, Mark Rutland, Ard Biesheuvel,
    Vikram Sethi, Thierry Reding, Anshuman Khandual
Subject: Re: [PATCH] arm64: head: Fix cache inconsistency of the identity-mapped region
In-Reply-To: <20220415170504.3781878-1-sdonthineni@nvidia.com>
References: <20220415170504.3781878-1-sdonthineni@nvidia.com>

Hi Shanker,

On Fri, 15 Apr 2022 18:05:03 +0100,
Shanker Donthineni wrote:
>
> The secondary cores boot is stuck due to data abort while executing the
> instruction 'ldr x8, =__secondary_switched'. The RELA value of this
> instruction was updated by a primary boot core from __relocate_kernel()
> but those memory updates are not visible to CPUs after calling
> switch_to_vhe() causing problem.
>
> The cacheable/shareable attributes of the identity-mapped regions are
> different while CPU executing in EL1 (MMU enabled) and for a short period
> of time in hyp-stub (EL2-MMU disabled). As per the ARM-ARM specification
> (DDI0487G_b), this is not allowed.
>
> G5.10.3 Cache maintenance requirement:
>  "If the change affects the cacheability attributes of the area of memory,
>   including any change between Write-Through and Write-Back attributes,
>   software must ensure that any cached copies of affected locations are
>   removed from the caches, typically by cleaning and invalidating the
>   locations from the levels of cache that might hold copies of the locations
>   affected by the attribute change."
>
> Clean+invalidate the identity-mapped region till PoC before switching to
> VHE world to fix the cache inconsistency.
>
> Problem analysis with disassembly (vmlinux):
>  1) Both __primary_switch() and enter_vhe() are part of the identity region
>  2) RELA entries and enter_vhe() are sharing the same cache line ffff800010970480
>  3) Memory ffff800010970484-ffff800010970498 is updated with EL1-MMU enabled
>  4) CPU fetches instructions of enter_vhe() with EL2-MMU disabled
>    - Non-coherent access causing the cache line ffff800010970480 drop

Non-coherent? You mean non-cacheable, right? At this stage, we only
have a single CPU, so I'm not sure coherency is the problem here. When
you say 'drop', is that an eviction? Replaced by what? By the previous
version of the cache line, containing the stale value?

It is also unclear to me how the instruction fetches directly
influence what happens *on the other CPUs*. Is this line kept at a
level beyond the PoU? Are we talking of a system cache here? It would
really help if you could describe your cache topology.

>  5) Secondary core executes 'ldr x8, __secondary_switched'
>    - Getting data abort because of the incorrect value at ffff800010970488

My interpretation of the above is as follows:

- CPU0 performs the RELA update with the MMU on

- A switch to EL2 with the MMU off results in the cache line sitting
  beyond the PoU and containing the RELA update to be replaced with
  the *stale* version (the fetch happening on the i-side).

- CPU1 (with its MMU on) fetches the stale data from the cache

Is this correct?

What is unclear to me is why the eviction occurs. Yes, this is of
course allowed, and the code is wrong for assuming any odd behaviour.
But then, why only clean the idmap?

I have the feeling that by this standard, we should see this a lot
more often. Or are we just lucky that we don't have any other examples
of data and instructions sharing the same cache line and accessed with
different cacheability attributes?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
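
For reference, the cache-line sharing claimed in points 2) and 3) of the
quoted analysis can be checked with a little address arithmetic. The
following is only a minimal sketch, assuming 64-byte cache lines (the
thread does not state the line size) and reading the line address in the
report as ffff800010970480. The helper names cache_line_base() and
range_in_line() are hypothetical and are not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the thread does not give the real size. */
#define CACHE_LINE_SIZE 64ULL

/* Base address of the cache line containing addr (size is a power of two). */
static inline uint64_t cache_line_base(uint64_t addr)
{
	return addr & ~(CACHE_LINE_SIZE - 1);
}

/*
 * True if both ends of the inclusive range [start, end] fall in the cache
 * line starting at line_addr. Expects start <= end and an aligned line_addr.
 */
static inline bool range_in_line(uint64_t start, uint64_t end,
				 uint64_t line_addr)
{
	return cache_line_base(start) == line_addr &&
	       cache_line_base(end) == line_addr;
}
```

Under that assumption, the updated range ffff800010970484-ffff800010970498
and the faulting literal at ffff800010970488 both map to the line at
ffff800010970480, which is consistent with the RELA-patched data and the
enter_vhe() instructions living in one line.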