Date: Tue, 16 Jun 2026 07:13:07 +0100
From: Mark Rutland
To: Will Deacon
Cc: Linu Cherian, Catalin Marinas, Ryan Roberts, Kevin Brodsky,
    Anshuman Khandual, Yang Shi, Huang Ying,
    linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] arm64: tlbflush: Don't broadcast if mm was only active on local cpu
References: <20260523134710.3827956-1-linu.cherian@arm.com>

On Mon, Jun 15, 2026 at 03:44:20PM +0100, Will Deacon wrote:
> On Mon, Jun 15, 2026 at 01:39:43PM +0100, Mark Rutland wrote:
> > On Sun, Jun 14, 2026 at 12:04:44PM +0100, Will Deacon wrote:
> > > On Sat, May 23, 2026 at 07:17:10PM +0530, Linu Cherian wrote:
> > >
> > > >  static inline void flush_tlb_mm(struct mm_struct *mm)
> > > >  {
> > > >  	unsigned long asid;
> > > > +	bool local;
> > > >
> > > > -	dsb(ishst);
> > > > +	local = flush_tlb_user_pre(mm, TLBF_NONE);
> > > >  	asid = __TLBI_VADDR(0, ASID(mm));
> > > > -	__tlbi(aside1is, asid);
> > > > -	__tlbi_user(aside1is, asid);
> > > > -	__tlbi_sync_s1ish(mm);
> > > > +	if (local) {
> > > > +		__tlbi(aside1, asid);
> > > > +		__tlbi_user(aside1, asid);
> > > > +		dsb(nsh);
> > > > +	} else {
> > > > +		__tlbi(aside1is, asid);
> > > > +		__tlbi_user(aside1is, asid);
> > > > +		__tlbi_sync_s1ish(mm);
> > > > +	}
> > > > +	flush_tlb_user_post(local);
> > >
> > > I think you've changed this since Ryan's original patch, but why are you
> > > only calling __tlbi_sync_s1ish() for the !local case?
> > > Doesn't that break the erratum workaround when running as a VM if the
> > > vCPU is migrated?
> >
> > The errata mitigated by __tlbi_sync_s1ish() only affect broadcast
> > maintenance (the 'ish' in the name was intended to convey that). No
> > workaround is necessary for local TLB maintenance; aside from anything
> > else, when some PE executes the DSB to complete the maintenance, that
> > DSB alone is sufficient to complete memory accesses made by that PE.
> >
> > If it would make things clearer, we could add a __tlbi_sync_s1nsh()
> > helper for the local case, which would boil down to a DSB NSH.
>
> No, I don't think that's what I'm concerned about.

I *think* you're missing the shape of the errata; more on that below.

> > Regardless of the erratum, to correctly handle a vCPU being migrated
> > from pCPU-x to pCPU-y, we rely on:
> >
> > * The host to set HCR_EL2.FB to ensure that TLB maintenance is
> >   broadcast to the ISH domain.
> >
> > * The host to set HCR_EL2.BSU to ensure the DSB is upgraded to ISH such
> >   that any guest-issued DSB NSH can complete any TLB maintenance
> >   that was upgraded to ISH.
> >
> > * The host to issue a DSB ISH on pCPU-x before the vCPU can run on
> >   pCPU-y, to complete any outstanding maintenance that was issued on
> >   pCPU-x. IIUC a DSB ISH on pCPU-y is not architecturally sufficient; it
> >   must be executed on the same CPU which issued the TLB maintenance.
> >
> > ... but as above, all of that should be independent of any of the errata
> > that require the workaround.
>
> Yes, I understand all of the above but the case I'm struggling with is
> where a vCPU runs on a system that needs the TLB invalidation to be
> performed twice. For non-broadcast invalidation (from the guest
> perspective), this patch will mean that it only performs the
> invalidation once. So if the vCPU migrates to another physical CPU, can
> that effectively undo the HCR_EL2.FB upgrade unless KVM issues TLB
> invalidation as well as a DSB on migration?
>
> Maybe I'm missing something, as it looks like upstream already elides
> the call to __tlbi_sync_s1ish() for the NOBROADCAST case.

The key thing is that these errata only affect the completion of memory
accesses, and only those accesses made by other (physical) PEs.

A single TLBI will correctly remove the actual TLB entries, and
HCR_EL2.{FB,BSU} will still ensure that TLB entries are removed from the
TLBs of other PEs.

The errata only prevent completion of memory accesses made on other
(physical) PEs, and:

* For accesses made by the vCPU which is issuing the TLBI(s):

  - Regardless of the errata, the hypervisor has to ensure that when a
    vCPU is migrated from pCPU-x to pCPU-y, any prior CMOs or TLBIs are
    completed, which requires the host to execute a DSB ISH on pCPU-x
    before the vCPU can be run on pCPU-y.

    Maybe we have a latent bug here?

  - Within the context of the vCPU thread, a DSB {NSH,ISH,OSH} will
    complete all prior accesses made by the vCPU *regardless* of any TLB
    invalidation.

* For accesses made by *other* vCPUs, either:

  - Software in the VM intended to complete concurrent accesses made by
    other vCPUs. In which case, regardless of the errata, using a local
    TLBI alone is a software bug since that's not guaranteed to affect
    other PEs.

  - Software did not intend to complete accesses made by other vCPUs.
    In which case, it's fine that they may have uncompleted accesses.

... but maybe I'm still missing your concern?

Mark.