Re: [PATCH v4 4/4] arm64: errata: Work around early CME DVMSync acknowledgement

From: Catalin Marinas <catalin.marinas@arm.com>
To: linux-arm-kernel@lists.infradead.org
Cc: Will Deacon <will@kernel.org>, James Morse <james.morse@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Mark Brown <broonie@kernel.org>
Subject: Re: [PATCH v4 4/4] arm64: errata: Work around early CME DVMSync acknowledgement
Date: Fri, 3 Apr 2026 12:37:12 +0100
Message-ID: <ac-maGr18CPKvh0X@arm.com>
In-Reply-To: <20260402101246.3870036-5-catalin.marinas@arm.com>

Some sashiko.dev feedback below:

On Thu, Apr 02, 2026 at 11:12:44AM +0100, Catalin Marinas wrote:
> +static inline void sme_dvmsync_add_pending(struct arch_tlbflush_unmap_batch *batch,
> +					   struct mm_struct *mm)
> +{
> +	if (!alternative_has_cap_unlikely(ARM64_WORKAROUND_4193714))
> +		return;
> +
> +	/*
> +	 * Order the mm_cpumask() read after the hardware DVMSync.
> +	 */
> +	dsb(ish);
> +	if (cpumask_empty(mm_cpumask(mm)))
> +		return;

This was already mentioned in the cover letter, but sashiko highlighted
it as well: the dsb(ish) here may add overhead. I did not notice any
difference in some hand/AI-crafted benchmarks using
madvise(MADV_PAGEOUT). In practice, this erratum affects systems with a
small number of CPUs, so the eager DVMSync should not matter.

> +void sme_enable_dvmsync(void)
> +{
> +	/*
> +	 * stop_machine() will invoke this function concurrently on all
> +	 * affected CPUs. Serialise the initialisation.
> +	 */
> +	raw_spin_lock(&sme_dvmsync_init_lock);
> +	if (!cpumask_available(sme_dvmsync_cpus) &&
> +	    !zalloc_cpumask_var(&sme_dvmsync_cpus, GFP_ATOMIC))
> +		panic("Unable to allocate cpumasks for the SME DVMSync erratum");
> +	raw_spin_unlock(&sme_dvmsync_init_lock);
> +
> +	cpumask_set_cpu(smp_processor_id(), sme_dvmsync_cpus);
> +}

I don't think sashiko is correct here. It said that zalloc_cpumask_var()
may sleep on PREEMPT_RT kernels, but I thought passing GFP_ATOMIC would
be sufficient to avoid that.

> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 489554931231..88426d8ae11c 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -26,6 +26,7 @@
>  #include <linux/reboot.h>
>  #include <linux/interrupt.h>
>  #include <linux/init.h>
> +#include <linux/cpumask.h>
>  #include <linux/cpu.h>
>  #include <linux/elfcore.h>
>  #include <linux/pm.h>
> @@ -339,8 +340,41 @@ void flush_thread(void)
>  	flush_gcs();
>  }
>  
> +#ifdef CONFIG_ARM64_ERRATUM_4193714
> +
> +static int arch_dup_tlbbatch_mask(struct task_struct *dst)
> +{
> +	if (!alternative_has_cap_unlikely(ARM64_WORKAROUND_4193714))
> +		return 0;
> +
> +	if (!zalloc_cpumask_var(&dst->tlb_ubc.arch.cpumask, GFP_KERNEL))
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +
> +static void arch_release_tlbbatch_mask(struct task_struct *tsk)
> +{
> +	if (alternative_has_cap_unlikely(ARM64_WORKAROUND_4193714))
> +		free_cpumask_var(tsk->tlb_ubc.arch.cpumask);
> +}
> +
> +#else
> +
> +static int arch_dup_tlbbatch_mask(struct task_struct *dst)
> +{
> +	return 0;
> +}
> +
> +static void arch_release_tlbbatch_mask(struct task_struct *tsk)
> +{
> +}
> +
> +#endif /* CONFIG_ARM64_ERRATUM_4193714 */
> +
>  void arch_release_task_struct(struct task_struct *tsk)
>  {
> +	arch_release_tlbbatch_mask(tsk);
>  	fpsimd_release_task(tsk);
>  }
>  
> @@ -356,6 +390,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
>  
>  	*dst = *src;
>  
> +	if (arch_dup_tlbbatch_mask(dst))
> +		return -ENOMEM;

This may indeed leak the cpumask if the caller of arch_dup_task_struct()
fails later on. dup_task_struct() calls free_task_struct() on failure
but not arch_release_task_struct().
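
To make the error path concrete, here is a minimal userspace model of
the leak, not kernel code. Every name in it (task_model, model_*) is
hypothetical and only mirrors the role of the kernel function it is
named after; the mask "allocator" is a small static pool so that the
number of outstanding masks can be counted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace model only: every name below is hypothetical and merely
 * mirrors the role of a kernel function mentioned in this thread.
 */
#define MODEL_POOL_SIZE 4

static unsigned long model_pool[MODEL_POOL_SIZE];
static int model_pool_used[MODEL_POOL_SIZE];

struct task_model {
	unsigned long *mask;	/* models tlb_ubc.arch.cpumask (OFFSTACK=y) */
};

/* Models zalloc_cpumask_var(): hand out a zeroed slot from the pool. */
static unsigned long *model_mask_alloc(void)
{
	for (int i = 0; i < MODEL_POOL_SIZE; i++) {
		if (!model_pool_used[i]) {
			model_pool_used[i] = 1;
			model_pool[i] = 0;
			return &model_pool[i];
		}
	}
	return NULL;
}

/* Models free_cpumask_var(): return the slot to the pool. */
static void model_mask_free(unsigned long *mask)
{
	if (!mask)
		return;
	assert(mask >= model_pool && mask < model_pool + MODEL_POOL_SIZE);
	model_pool_used[mask - model_pool] = 0;
}

/* Number of masks currently allocated and not yet freed. */
static int model_masks_in_use(void)
{
	int n = 0;

	for (int i = 0; i < MODEL_POOL_SIZE; i++)
		n += model_pool_used[i];
	return n;
}

/* Models the v4 arch_dup_task_struct(): eager mask allocation. */
static int model_arch_dup(struct task_model *dst, const struct task_model *src)
{
	*dst = *src;
	dst->mask = model_mask_alloc();
	return dst->mask ? 0 : -1;
}

/* Models arch_release_task_struct(): the only place the mask is freed. */
static void model_arch_release(struct task_model *tsk)
{
	model_mask_free(tsk->mask);
	tsk->mask = NULL;
}

/* Models free_task_struct(): discards the task, knows nothing of the mask. */
static void model_free_task(struct task_model *tsk)
{
	memset(tsk, 0, sizeof(*tsk));
}

/*
 * Models dup_task_struct(). A non-zero later_step_fails simulates a
 * failure after the arch hook has already allocated the mask.
 */
static int model_dup_task(struct task_model *dst, const struct task_model *src,
			  int later_step_fails)
{
	if (model_arch_dup(dst, src))
		goto free_tsk;
	if (later_step_fails)
		goto free_tsk;	/* the mask is still allocated here */
	return 0;

free_tsk:
	model_free_task(dst);	/* the pointer to the mask is lost: leak */
	return -1;
}
```

In this model, a call with later_step_fails set leaves one pool slot
permanently marked as used, while the success path followed by
model_arch_release() returns its slot as expected.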

The simplest fix is to allocate the tlbbatch mask lazily in
arch_tlbbatch_add_pending(). The downside is that we need a GFP_ATOMIC
allocation in there, but that's only theoretical: such systems are
already built with CPUMASK_OFFSTACK=n, so no allocation is necessary
anyway. The diff on top would be:

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 88426d8ae11c..88904e47c7d9 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -342,15 +342,14 @@ void flush_thread(void)
 
 #ifdef CONFIG_ARM64_ERRATUM_4193714
 
-static int arch_dup_tlbbatch_mask(struct task_struct *dst)
+static void arch_dup_tlbbatch_mask(struct task_struct *dst)
 {
-	if (!alternative_has_cap_unlikely(ARM64_WORKAROUND_4193714))
-		return 0;
-
-	if (!zalloc_cpumask_var(&dst->tlb_ubc.arch.cpumask, GFP_KERNEL))
-		return -ENOMEM;
-
-	return 0;
+	/*
+	 * Clear any inherited batch state. The cpumask is allocated lazily if
+	 * CPUMASK_OFFSTACK=y.
+	 */
+	if (alternative_has_cap_unlikely(ARM64_WORKAROUND_4193714))
+		memset(&dst->tlb_ubc.arch, 0, sizeof(dst->tlb_ubc.arch));
 }
 
 static void arch_release_tlbbatch_mask(struct task_struct *tsk)
@@ -361,9 +360,8 @@ static void arch_release_tlbbatch_mask(struct task_struct *tsk)
 
 #else
 
-static int arch_dup_tlbbatch_mask(struct task_struct *dst)
+static void arch_dup_tlbbatch_mask(struct task_struct *dst)
 {
-	return 0;
 }
 
 static void arch_release_tlbbatch_mask(struct task_struct *tsk)
@@ -390,8 +388,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 
 	*dst = *src;
 
-	if (arch_dup_tlbbatch_mask(dst))
-		return -ENOMEM;
+	arch_dup_tlbbatch_mask(dst);
 
 	/*
 	 * Drop stale reference to src's sve_state and convert dst to
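
The diff above does not include the arch_tlbbatch_add_pending() side of
the change. As a rough illustration of the lazy-allocation pattern, here
is a userspace model, not kernel code. All names (batch_model,
batch_model_*) are hypothetical, calloc() stands in for a GFP_ATOMIC
zalloc_cpumask_var(), and a single 64-bit word stands in for the
cpumask. What the real code does when the allocation fails is not
described in this message; the sketch simply reports the failure to the
caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only: all names here are hypothetical. */
#define MODEL_NR_CPUS 64

struct batch_model {
	uint64_t *cpumask;	/* NULL until first use (models OFFSTACK=y) */
};

/*
 * Models the struct copy in arch_dup_task_struct() followed by the
 * memset() in the revised arch_dup_tlbbatch_mask(): the child must not
 * keep a pointer to the parent's mask.
 */
static void batch_model_dup(struct batch_model *dst,
			    const struct batch_model *src)
{
	*dst = *src;			/* dst->cpumask aliases the parent */
	memset(dst, 0, sizeof(*dst));	/* drop the inherited state */
}

/*
 * Models the lazy allocation: allocate the mask the first time a CPU
 * is recorded. Returns false if the CPU is out of range or the
 * allocation fails (the failure policy is an assumption of this sketch).
 */
static bool batch_model_add_pending(struct batch_model *b, unsigned int cpu)
{
	assert(b != NULL);

	if (cpu >= MODEL_NR_CPUS)
		return false;

	if (!b->cpumask) {
		b->cpumask = calloc(1, sizeof(*b->cpumask));
		if (!b->cpumask)
			return false;
	}

	*b->cpumask |= UINT64_C(1) << cpu;
	return true;
}

/* Models arch_release_tlbbatch_mask(); free(NULL) is a no-op. */
static void batch_model_release(struct batch_model *b)
{
	free(b->cpumask);
	b->cpumask = NULL;
}
```

With this shape, a task that never batches a TLB flush never allocates
a mask, so an early failure while duplicating the task has nothing to
leak, and releasing a task with a NULL mask is harmless.
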

-- 
Catalin
