All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Ashish Kalra" <Ashish.Kalra@amd.com>
Cc: kvm@vger.kernel.org
Subject: Re: [PATCH v7 3/6] x86/sev: Add support to perform RMP optimizations asynchronously
Date: Mon, 08 Jun 2026 19:12:51 +0000	[thread overview]
Message-ID: <20260608191251.80E181F00893@smtp.kernel.org> (raw)
In-Reply-To: <fc14f6cb80c37d08d308460d2f13279bf3a32ccb.1780903370.git.ashish.kalra@amd.com>

Thank you for your contribution! Sashiko AI review found 5 potential issue(s) to consider:
- [High] Acquiring cpus_read_lock() while migrate_disable() is active creates an AB-BA deadlock with CPU hotplug.
- [High] Executing the expensive RMP hardware scan via synchronous IPI risks extreme IRQ latency and hard lockups.
- [High] Ignoring the IPI return value allows the fallback leader scan to silently fail, forcing all follower CPUs to execute the slow path simultaneously.
- [High] Holding `cpus_read_lock()` continuously across the entire multi-gigabyte memory scan loop causes severe system-wide stalls.
- [Medium] Repeatedly calling `INIT_DELAYED_WORK()` on the statically allocated `rmpopt_delayed_work` triggers debugobjects warnings.
--

commit e93db67a4733d3052f6ffa4972745fd9c571f883
Author: Ashish Kalra <ashish.kalra@amd.com>
Subject: x86/sev: Add support to perform RMP optimizations asynchronously

When SEV-SNP is enabled, all writes to memory are checked to ensure
integrity of SNP guest memory. This imposes performance overhead on the
whole system.

[ ... ]

> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index 482008bb07e48..b42788a66d400 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c

[ ... ]

> @@ -599,6 +620,146 @@ void snp_clear_rmpopt_configured(void)
>  	rmpopt_configured = false;
>  }
>  
> +/*
> + * RMPOPT: F2 0F 01 FC
> + *   Input:  RAX = system physical address (1GB aligned)
> + *           RCX = operation type
> + *   Output: CF set if the range was optimized
> + */

[ ... ]

> +static void rmpopt_work_handler(struct work_struct *work)
> +{
> +	cpumask_var_t follower_mask;
> +	phys_addr_t pa;
> +	int this_cpu;

[ ... ]

> +	/*
> +	 * Pin the worker to the current CPU for the leader loop so that
> +	 * this_cpu remains valid and the RMPOPT instruction executes on
> +	 * the correct CPU.
> +	 *
> +	 * Use migrate_disable() rather than get_cpu() to prevent
> +	 * migration while still allowing preemption.
> +	 */
> +	migrate_disable();
> +	this_cpu = smp_processor_id();
> +
> +	if (cpumask_test_cpu(this_cpu, follower_mask)) {

[ ... ]

> +	} else {
> +		/*
> +		 * Current CPU does not have RMPOPT_BASE MSR programmed.
> +		 * Pick an explicit leader from the cpumask to avoid #UD.
> +		 */
> +		int leader_cpu = cpumask_first(follower_mask);
> +
> +		if (WARN_ON_ONCE(leader_cpu >= nr_cpu_ids)) {
> +			migrate_enable();
> +			goto out;
> +		}
> +
> +		cpumask_clear_cpu(leader_cpu, follower_mask);
> +
> +		cpus_read_lock();

[Severity: High]
Does acquiring cpus_read_lock() while migrate_disable() is active create an
AB-BA deadlock with CPU hotplug?

The worker thread calls migrate_disable(), then attempts to acquire
cpus_read_lock() in the fallback leader path. A CPU hotplug thread acquires
the hotplug write lock and then waits for migration-disabled tasks to leave
the CPU, while the worker thread blocks on the hotplug lock waiting for the
write lock to be released.

> +		for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
> +			smp_call_function_single(leader_cpu, rmpopt_smp,
> +						 (void *)pa, true);

[Severity: High]
Will executing the expensive RMP hardware scan via synchronous IPI risk
extreme IRQ latency and hard lockups?

Both here and in the followers path, smp_call_function_single() and
on_each_cpu_mask() execute the rmpopt_smp callback in hardirq (IPI) context.
The commit message states the leader's scan is the most expensive part.
Executing a long-running hardware memory scan in hardirq context blocks all
interrupts on the target CPU for the duration of the scan, violating kernel
context rules.

[Severity: High]
Can ignoring the IPI return value allow the fallback leader scan to silently
fail?

If the leader CPU goes offline prior to the dispatch,
smp_call_function_single() returns an error (like -ENXIO) and rmpopt_smp is
never executed. Because this error is ignored, the required cached result
isn't generated. When the followers execute on_each_cpu_mask(), all primary
threads might simultaneously fall back to the slow, expensive scan path in
hardirq context, causing system-wide lockups.

> +			cond_resched();
> +		}
> +		cpus_read_unlock();
> +	}
> +
> +	migrate_enable();
> +
> +	/* Followers: run RMPOPT on remaining cores */
> +	cpus_read_lock();
> +	for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {

[Severity: High]
Does holding cpus_read_lock() continuously across the entire multi-gigabyte
memory scan loop cause severe system-wide stalls?

The loop processing up to 2TB of physical memory in 1GB chunks holds
cpus_read_lock(). While the loop calls cond_resched(), this only yields the
CPU and does not release the cpus_read_lock() rwsem. Since the operation
involves thousands of synchronous IPIs and can take multiple seconds, any
attempt by a writer (such as CPU hotplug) to acquire the lock will block,
subsequently blocking all other readers system-wide.

> +		on_each_cpu_mask(follower_mask, rmpopt_smp,
> +				 (void *)pa, true);
> +
> +		 /* Give a chance for other threads to run */
> +		cond_resched();
> +	}
> +	cpus_read_unlock();
> +
> +out:
> +	free_cpumask_var(follower_mask);
> +}
> +
>  void snp_setup_rmpopt(void)
>  {

[ ... ]

> @@ -607,11 +768,35 @@ void snp_setup_rmpopt(void)
>  		return;
>  
> +	guard(mutex)(&rmpopt_wq_mutex);

[ ... ]

>  	cpus_read_lock();
>  
>  	/*
>  	 * The RMPOPT_BASE MSR is per-core, so only one thread per core needs
> -	 * to set up the RMPOPT_BASE MSR.
> +	 * to set up the RMPOPT_BASE MSR. Likewise, only one thread per core
> +	 * needs to issue the RMPOPT instruction.

[ ... ]

>  	cpus_read_unlock();
> +
> +	INIT_DELAYED_WORK(&rmpopt_delayed_work, rmpopt_work_handler);

[Severity: Medium]
Could repeatedly calling INIT_DELAYED_WORK() on the statically allocated
rmpopt_delayed_work trigger debugobjects warnings?

The INIT_DELAYED_WORK() macro unconditionally initializes the work structure
and its underlying timer. Because rmpopt_delayed_work is statically
allocated and not dynamically freed, calling it multiple times (which occurs
if SEV-SNP is disabled and re-enabled) re-initializes an already tracked
debug object, triggering an 'ODEBUG: init active' warning.

> +
> +	rmpopt_pa_end = ALIGN(PFN_PHYS(max_pfn), SZ_1G);
> +
> +	/* Limit memory scanning to 2TB of RAM */
> +	if ((rmpopt_pa_end - rmpopt_pa_start) > SZ_2T) {
> +		pr_info("RMPOPT coverage limited to 2TB; memory above 0x%llx not optimized\n",
> +			rmpopt_pa_start + SZ_2T);
> +		rmpopt_pa_end = rmpopt_pa_start + SZ_2T;
> +	}
> +
> +	/*
> +	 * Once all per-CPU RMPOPT tables have been configured, enable RMPOPT
> +	 * optimizations on all physical memory.
> +	 */
> +	queue_delayed_work(rmpopt_wq, &rmpopt_delayed_work, 0);
>  }
>  EXPORT_SYMBOL_FOR_MODULES(snp_setup_rmpopt, "ccp");

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/cover.1780903370.git.ashish.kalra@amd.com?part=3

  reply	other threads:[~2026-06-08 19:12 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-08 18:56 [PATCH v7 0/6] Add RMPOPT support Ashish Kalra
2026-06-08 18:56 ` [PATCH v7 1/6] x86/cpufeatures: Add X86_FEATURE_RMPOPT feature flag Ashish Kalra
2026-06-08 18:56 ` [PATCH v7 2/6] x86/sev: Initialize RMPOPT configuration MSRs Ashish Kalra
2026-06-08 19:17   ` sashiko-bot
2026-06-08 18:56 ` [PATCH v7 3/6] x86/sev: Add support to perform RMP optimizations asynchronously Ashish Kalra
2026-06-08 19:12   ` sashiko-bot [this message]
2026-06-08 18:57 ` [PATCH v7 4/6] x86/sev: Add interface to re-enable RMP optimizations Ashish Kalra
2026-06-08 18:57 ` [PATCH v7 5/6] KVM: SEV: Perform RMP optimizations on SNP guest shutdown Ashish Kalra
2026-06-08 19:12   ` sashiko-bot
2026-06-08 18:57 ` [PATCH v7 6/6] x86/sev: Add debugfs support for RMPOPT Ashish Kalra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260608191251.80E181F00893@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=Ashish.Kalra@amd.com \
    --cc=kvm@vger.kernel.org \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.