Date: Thu, 12 Dec 2024 15:48:47 +0000
Message-ID: <86h678sy00.wl-maz@kernel.org>
From: Marc Zyngier
To: Ryan Roberts
Cc: Mikołaj Lenczewski, catalin.marinas@arm.com, will@kernel.org,
	corbet@lwn.net, oliver.upton@linux.dev, joey.gouly@arm.com,
	suzuki.poulose@arm.com, yuzenghui@huawei.com,
	linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev
Subject: Re: [RESEND RFC PATCH v1 2/5] arm64: Add BBM Level 2 cpu feature
In-Reply-To: <2b1cc228-a8d5-4383-ab25-abbbcccd2e2c@arm.com>
References: <20241211160218.41404-1-miko.lenczewski@arm.com>
	<20241211160218.41404-3-miko.lenczewski@arm.com>
	<87cyhxs3xq.wl-maz@kernel.org>
	<084c5ada-51af-4c1a-b50a-4401e62ddbd6@arm.com>
	<86ikrprn7w.wl-maz@kernel.org>
	<2b1cc228-a8d5-4383-ab25-abbbcccd2e2c@arm.com>
X-Mailing-List: linux-doc@vger.kernel.org

On Thu, 12 Dec 2024 15:05:24 +0000,
Ryan Roberts wrote:
>
> On 12/12/2024 14:26, Marc Zyngier wrote:
> > On Thu, 12 Dec 2024 10:55:45 +0000,
> > Ryan Roberts wrote:
> >>
> >> On 12/12/2024 08:25, Marc Zyngier wrote:
> >>>> +
> >>>> +	local_flush_tlb_all();
> >>>
> >>> The elephant in the room: if TLBs are in such a sorry state, what
> >>> guarantees we can make it this far?
> >>
> >> I'll leave Miko to respond to your other comments, but I wanted to address this
> >> one, since it's pretty fundamental. We went around this loop internally and
> >> concluded that what we are doing is architecturally sound.
> >>
> >> The expectation is that a conflict abort can only be generated as a result of
> >> the change in patch 4 (and patch 5). That change makes it possible for the TLB
> >> to end up with a multihit. But crucially that can only happen for user space
> >> memory because that change only operates on user memory. And while the TLB may
> >> detect the conflict at any time, the conflict abort is only permitted to be
> >> reported when an architectural access is prevented by the conflict. So we never
> >> do anything that would allow a conflict for a kernel memory access and a user
> >> memory conflict abort can never be triggered as a result of accessing kernel memory.
> >>
> >> Copy/pasting comment from AlexC on the topic, which explains it better than I can:
> >>
> >> """
> >> The intent is certainly that in cases where the hardware detects a TLB conflict
> >> abort, it is only permitted to report it (by generating an exception) if it
> >> applies to an access that is being attempted architecturally. ... that property
> >> can be built from the following two properties:
> >>
> >> 1. The TLB conflict can only be reported as an Instruction Abort or a Data Abort
> >>
> >> 2. Those two exception types must be reported synchronously and precisely.
> >> """
> >
> > I totally agree with this. The issue is that nothing says that the
> > abort is in any way related to userspace.
> >
> >>>
> >>> I honestly don't think you can reliably handle a TLB Conflict abort in
> >>> the same translation regime as the original fault, given that we don't
> >>> know the scope of that fault. You are probably making an educated
> >>> guess that it is good enough on the CPUs you know of, but I don't see
> >>> anything in the architecture that indicates the "blast radius" of a
> >>> TLB conflict.
> >>
> >> OK, so I'm claiming that the blast radius is limited to the region of memory
> >> that we are operating on in contpte_collapse() in patch 4. Perhaps we need to go
> >> re-read the ARM and come back with the specific statements that led us to that
> >> conclusion?
>
> From the ARM:
> """
> RFCPSG: If level 1 or level 2 is supported and the Contiguous bit in a set of
> Block descriptors or Page descriptors is changed, then a TLB conflict abort can
> be generated because multiple translation table entries might exist within a TLB
> that translates the same IA.
> """
>
> Although I guess it's not totally explicit, I've interpreted that as saying
> that conflicting TLB entries can only arise for the IA range for which the
> contiguous bits have been modified in the translation tables.

Right, that's reassuring, thanks for digging that one up.

> Given we are only fiddling with the contiguous bits for user space mappings in
> this way, that's why I'm asserting we will only get a conflict abort for user
> space mappings... assuming the absence of kernel bugs, anyway...

For now. But if you dare scan the list, you'll find a lot of people
willing to do far more than just that, including changing the shape of
the linear map.

>
> >
> > But we don't know for sure what caused this conflict by the time we
> > arrive in the handler. It could equally be because we have a glaring
> > bug somewhere on the kernel side, even if you are *now* only concerned
> > with userspace.
>
> OK I see what you are saying; previously a conflict abort would have led to
> calling do_bad(), which returns 1, which causes do_mem_abort() to either kill
> the kernel or the process depending on the origin of the abort. (although if it
> came from kernel due to bug, we're just hoping that the conflict doesn't affect
> the path through the handler). With this change, we always assume we can fix it
> with the TLBI.
>
> How about this change to ensure we still die for issues originating from the kernel?
>
> 	if (!user_mode(regs) || !system_supports_bbml2())
> 		return do_bad(far, esr, regs);

That wouldn't catch a TLB conflict on get_user(), would it?

> > If anything, this should absolutely check for FAR_EL1 and assert that
> > this is indeed caused by such change.
>
> I'm not really sure how we would check this reliably? Without patch 5, the
> problem is somewhat constrained; we could have as many changes in flight as
> there are CPUs so we could keep a list of all the {mm_struct, VA-range} that are
> being modified. But if patch 5 is confirmed to be architecturally sound, then
> there is no "terminating tlbi" so there is no bound on the set of {mm_struct,
> VA-range}'s that could legitimately cause a conflict abort.

I didn't mean to imply that we should identify the exact cause of the
abort. I was hoping to simply check that FAR_EL1 reports a userspace
VA. Why wouldn't that work?

	M.

-- 
Without deviation from the norm, progress is not possible.
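
For illustration only, here is a minimal standalone sketch of the difference
between the two checks discussed above. It is a userspace model, not kernel
code and not part of the patch. It assumes that bit 55 of the faulting address
distinguishes a TTBR0 (userspace) VA from a TTBR1 (kernel) VA. All names in it
(classify_by_mode, classify_by_far, far_is_user_va, the enum) are hypothetical,
and the bbml2_supported parameter merely stands in for system_supports_bbml2().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome of a TLB conflict abort in this model. */
enum conflict_action {
	CONFLICT_FATAL,           /* fall back to the do_bad()-style path */
	CONFLICT_FLUSH_AND_RETRY  /* flush the local TLB and retry the access */
};

/* Assumption: bit 55 clear means the address is translated via TTBR0 (user). */
static bool far_is_user_va(uint64_t far)
{
	return (far & (UINT64_C(1) << 55)) == 0;
}

/*
 * Mode-based check, modelled on the user_mode(regs) proposal: any abort
 * taken from kernel mode is fatal, even if the faulting address is a
 * user VA (as with a get_user()-style access).
 */
static enum conflict_action classify_by_mode(bool from_user_mode,
					     bool bbml2_supported)
{
	if (!from_user_mode || !bbml2_supported)
		return CONFLICT_FATAL;
	return CONFLICT_FLUSH_AND_RETRY;
}

/*
 * Address-based check: only the faulting address matters, so a kernel-mode
 * access to a user VA is still treated as recoverable.
 */
static enum conflict_action classify_by_far(uint64_t far, bool bbml2_supported)
{
	if (!bbml2_supported || !far_is_user_va(far))
		return CONFLICT_FATAL;
	return CONFLICT_FLUSH_AND_RETRY;
}
```

In this model a kernel-mode access that faults on a user VA is fatal under
classify_by_mode() but recoverable under classify_by_far(), while a fault on a
kernel VA is fatal under classify_by_far() regardless of BBML2 support.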