From: Catalin Marinas <catalin.marinas@arm.com>
To: Will Deacon <will@kernel.org>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>,
linux-arm-kernel@lists.infradead.org,
Marc Zyngier <maz@kernel.org>, Oliver Upton <oupton@kernel.org>,
Lorenzo Pieralisi <lpieralisi@kernel.org>,
Sudeep Holla <sudeep.holla@kernel.org>,
James Morse <james.morse@arm.com>,
Mark Rutland <mark.rutland@arm.com>,
Mark Brown <broonie@kernel.org>,
kvmarm@lists.linux.dev
Subject: Re: [PATCH 3/4] arm64: errata: Work around early CME DVMSync acknowledgement
Date: Fri, 13 Mar 2026 15:48:57 +0000
Message-ID: <abQx6Q_FJH92cjXj@arm.com>
In-Reply-To: <abLT05Nq-9J-uBEY@willie-the-truck>

On Thu, Mar 12, 2026 at 02:55:15PM +0000, Will Deacon wrote:
> On Tue, Mar 10, 2026 at 03:35:19PM +0000, Catalin Marinas wrote:
> > On Mon, Mar 09, 2026 at 10:13:20AM +0000, Vladimir Murzin wrote:
> > > On 3/6/26 12:00, Catalin Marinas wrote:
> > > >>> @@ -1358,6 +1360,85 @@ void do_sve_acc(unsigned long esr, struct pt_regs *regs)
> > > >>> put_cpu_fpsimd_context();
> > > >>> }
> > > >>>
> > > >>> +#ifdef CONFIG_ARM64_ERRATUM_SME_DVMSYNC
> > > >>> +
> > > >>> +/*
> > > >>> + * SME/CME erratum handling
> > > >>> + */
> > > >>> +static cpumask_var_t sme_dvmsync_cpus;
> > > >>> +static cpumask_var_t sme_active_cpus;
> > > >>> +
> > > >>> +void sme_set_active(unsigned int cpu)
> > > >>> +{
> > > >>> + if (!cpus_have_final_cap(ARM64_WORKAROUND_SME_DVMSYNC))
> > > >>> + return;
> > > >>> + if (!cpumask_test_cpu(cpu, sme_dvmsync_cpus))
> > > >>> + return;
> > > >>> +
> > > >>> + if (!test_bit(ilog2(MMCF_SME_DVMSYNC), &current->mm->context.flags))
> > > >>> + set_bit(ilog2(MMCF_SME_DVMSYNC), &current->mm->context.flags);
> > > >>> +
> > > >>> + cpumask_set_cpu(cpu, sme_active_cpus);
> > > >>> +
> > > >>> + /*
> > > >>> + * Ensure subsequent (SME) memory accesses are observed after the
> > > >>> + * cpumask and the MMCF_SME_DVMSYNC flag setting.
> > > >>> + */
> > > >>> + smp_mb();
> > > >>
> > > >> I can't convince myself that a DMB is enough here, as the whole issue
> > > >> is that the SME memory accesses can be observed _after_ the TLB
> > > >> invalidation. I'd have thought we'd need a DSB to ensure that the flag
> > > >> updates are visible before the exception return.
> > > >
> > > > This is only to ensure that the sme_active_cpus mask is observed before
> > > > any SME accesses. The mask is later used to decide whether to send the
> > > > IPI. We have something like this:
> > > >
> > > > P0
> > > > STSET [sme_active_cpus]
> > > > DMB
> > > > SME access to [addr]
> > > >
> > > > P1
> > > > TLBI [addr]
> > > > DSB
> > > > LDR [sme_active_cpus]
> > > > CBZ out
> > > > Do IPI
> > > > out:
> > > >
> > > > If P1 did not observe the STSET to [sme_active_cpus], P0 should have
> > > > received and acknowledged the DVMSync before the STSET. Is your concern
> > > > that P1 can observe the subsequent SME access but not the STSET?
> > > >
> > > > No idea whether herd can model this (I only put this in TLA+ for the
> > > > main logic check but it doesn't do subtle memory ordering).
> > >
> > > JFYI, herd support for SME is still work-in-progress (specifically it misses
> > > updates in cat), yet it can model VMSA.
> > >
> > > IIUC, expectation here is that either
> > > - P1 observes sme_active_cpus, so we have to do_IPI or
> > > - P0 observes TLBI (say shutdown, so it must fault)
> > >
> > > anything else is unexpected/forbidden.
> > >
> > > AArch64 A
> > > variant=vmsa
> > > {
> > > int x=0;
> > > int active=0;
> > >
> > > 0:X1=active;
> > > 0:X3=x;
> > >
> > > 1:X0=(valid:0);
> > > 1:X1=PTE(x);
> > > 1:X2=x;
> > > 1:X3=active;
> > >
> > > }
> > > P0 | P1 ;
> > > MOV W0,#1 | STR X0,[X1] ;
> > > STR W0,[X1] (* sme_active_cpus *) | DSB ISH ;
> > > DMB SY | LSR X9,X2,#12 ;
> > > LDR W2,[X3] (* access to [addr] *) | TLBI VAAE1IS,X9 (* [addr] *) ;
> > > | DSB ISH ;
> > > | LDR W4,[X3] (* sme_active_cpus *) ;
> > >
> > > exists ~(1:X4=1 \/ fault(P0,x))
> > >
> > > Is that correct understanding? Have I missed anything?
> >
> > Yes, I think that's correct. Another tweak specific to this erratum
> > would be for P1 to do a store to x via another mapping after the
> > TLBI+DSB and the P0 load should not see it.
> >
> > Even with the CPU erratum, if the P1 DVMSync is received/acknowledged by
> > P0 before its STR to sme_active_cpus, I don't see how the subsequent SME
> > load would overtake the STR given the DMB. The erratum messed up the
> > DVMSync acknowledgement, not the barriers.
>
> I'm still finding this hard to reason about.
>
> Why can't:
>
> 1. P0 translates its SME load and puts the valid translation into its TLB
> 2. P1 runs to completion, sees sme_active_cpus as 0 and so doesn't IPI
> 3. P0 writes to sme_active_cpus and then does the SME load using the
> translation from (1)
>
> I guess it's diving into ugly corners of what the erratum actually is...

From discussing this with the microarchitects at the time, a DMB ISH was
sufficient on the ERET path. Whether they considered your scenario, I'm
not sure. This bug doesn't break memory ordering; it only causes the
DVMSync acknowledgement not to wait for the CME unit (shared by multiple
CPUs) to complete an in-flight memory access. My assumption is that step
(1) won't actually start until the STR in (3) is issued, and that this
includes the TLB lookup.
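
For illustration only, the flag handshake discussed in this thread can be
pictured as a user-space C11 sketch rather than kernel code. The names
model_sme_set_active(), model_needs_ipi() and active_mask are made up for
this example; the seq_cst fences merely stand in for smp_mb() on P0 and for
the DSB after the TLBI on P1. The C11 memory model knows nothing about TLBI
or DVMSync, so this only shows the intended flag protocol, not the erratum
behaviour:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sme_active_cpus cpumask (one bit per CPU). */
static atomic_ulong active_mask;

/* P0 side: publish "this CPU may issue SME accesses", then fence. */
static void model_sme_set_active(unsigned int cpu)
{
	atomic_fetch_or_explicit(&active_mask, 1UL << cpu,
				 memory_order_relaxed);
	/* Assumption: stands in for smp_mb() before the ERET to SME code. */
	atomic_thread_fence(memory_order_seq_cst);
}

/* P1 side: called after the (unmodelled) TLBI; decide whether to IPI. */
static bool model_needs_ipi(void)
{
	/* Assumption: stands in for the DSB that completes the TLBI. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&active_mask, memory_order_relaxed) != 0;
}
```

If model_needs_ipi() returns false, the argument relies on P0 having
acknowledged the DVMSync before its store to the mask; otherwise P1 sees
the bit and sends the IPI.
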
Anyway, I'll ask them again to be sure.
--
Catalin