public inbox for linux-kernel@vger.kernel.org
From: "Kalra, Ashish" <ashish.kalra@amd.com>
To: Ackerley Tng <ackerleytng@google.com>,
	tglx@kernel.org, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	seanjc@google.com, peterz@infradead.org, thomas.lendacky@amd.com,
	herbert@gondor.apana.org.au, davem@davemloft.net,
	ardb@kernel.org
Cc: pbonzini@redhat.com, aik@amd.com, Michael.Roth@amd.com,
	KPrateek.Nayak@amd.com, Tycho.Andersen@amd.com,
	Nathan.Fontenot@amd.com, jackyli@google.com, pgonda@google.com,
	rientjes@google.com, jacobhxu@google.com, xin@zytor.com,
	pawan.kumar.gupta@linux.intel.com, babu.moger@amd.com,
	dyoung@redhat.com, nikunj@amd.com, john.allen@amd.com,
	darwi@linutronix.de, linux-kernel@vger.kernel.org,
	linux-crypto@vger.kernel.org, kvm@vger.kernel.org,
	linux-coco@lists.linux.dev
Subject: Re: [PATCH v4 4/7] x86/sev: Add support to perform RMP optimizations asynchronously
Date: Tue, 5 May 2026 15:30:34 -0500	[thread overview]
Message-ID: <c6235a3a-3ede-4f76-b5d6-67cee78fcca5@amd.com> (raw)
In-Reply-To: <CAEvNRgFRJNRyUf3T9TTWr9-xt76E=Z28vSKsdZ46QK3UAEd8dA@mail.gmail.com>

Hello Ackerley,

On 5/1/2026 1:57 PM, Ackerley Tng wrote:
> Ashish Kalra <Ashish.Kalra@amd.com> writes:
> 
>>
>> [...snip...]
>>
>> +/*
>> + * 'val' is a system physical address.
>> + */
>> +static void rmpopt_smp(void *val)
>> +{
>> +	u64 rax = ALIGN_DOWN((u64)val, SZ_1G);
>> +	u64 rcx = RMPOPT_FUNC_VERIFY_AND_REPORT_STATUS;
>> +
>> +	__rmpopt(rax, rcx);
>> +}
>> +
>> +static void rmpopt(u64 pa)
>> +{
>> +	u64 rax = ALIGN_DOWN(pa, SZ_1G);
>> +	u64 rcx = RMPOPT_FUNC_VERIFY_AND_REPORT_STATUS;
>> +
>> +	__rmpopt(rax, rcx);
>> +}
>> +
> 
> Could rmpopt_smp() call rmpopt() to remove duplicate code?

Yes. 
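Something like the following (untested sketch against this patch), with rmpopt_smp() simply forwarding to rmpopt(), which already does the 1GB alignment and issues the RMPOPT instruction:

```c
/*
 * 'val' is a system physical address.
 */
static void rmpopt_smp(void *val)
{
	rmpopt((u64)val);
}
```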

> 
>> +/*
>> + * RMPOPT optimizations skip RMP checks at 1GB granularity if this
>> + * range of memory does not contain any SNP guest memory.
>> + */
>> +static void rmpopt_work_handler(struct work_struct *work)
>> +{
>> +	bool current_cpu_cleared = false;
>> +	phys_addr_t pa;
>> +
>> +	pr_info("Attempt RMP optimizations on physical address range @1GB alignment [0x%016llx - 0x%016llx]\n",
>> +		rmpopt_pa_start, rmpopt_pa_end);
>> +
>> +	/*
>> +	 * RMPOPT scans the RMP table, stores the result of the scan in the
>> +	 * reserved processor memory. The RMP scan is the most expensive
>> +	 * part. If a second RMPOPT occurs, it can skip the expensive scan
>> +	 * if they can see a cached result in the reserved processor memory.
>> +	 *
>> +	 * Do RMPOPT on one CPU alone. Then, follow that up with RMPOPT
>> +	 * on every other primary thread. This potentially allows the
>> +	 * followers to use the "cached" scan results to avoid repeating
>> +	 * full scans.
> 
> Out of curiosity, how does this caching work? Is it possible to do it
> once and then synchronize the cache to the other CPUs?

The first CPU does the full RMP scan and stores the result of the scan in reserved processor memory.
The other CPUs can then skip the scan if they see a cached result in the reserved processor memory.
So I believe the other CPUs would *still* have to issue the RMPOPT instruction, but they will
avoid the full RMP scan and use the cached results instead.

> 
>> +	 */
>> +
>> +	if (cpumask_test_cpu(smp_processor_id(), &rmpopt_cpumask)) {
>> +		cpumask_clear_cpu(smp_processor_id(), &rmpopt_cpumask);
>> +		current_cpu_cleared = true;
>> +	}
>> +
>> +	/* current CPU */
>> +	for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G)
>> +		rmpopt(pa);
>> +
>> +	for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
>> +		on_each_cpu_mask(&rmpopt_cpumask, rmpopt_smp,
>> +				 (void *)pa, true);
>> +
>> +		 /* Give a chance for other threads to run */
>> +		cond_resched();
>> +
>> +	}
>> +
>> +	if (current_cpu_cleared)
>> +		cpumask_set_cpu(smp_processor_id(), &rmpopt_cpumask);
> 
> Sashiko [1] pointed this out: after cond_resched(), this code might be
> on a different cpu so smp_processor_id() would return a different cpu,
> that would mess up the global cpumask.
> 
> Perhaps it's better to store the id on a stack? Or actually, what if we
> give on_each_cpu_mask a copy of rmpopt_cpumask with the current cpu
> unset?
> 
> [1] https://sashiko.dev/#/patchset/cover.1775874970.git.ashish.kalra%40amd.com
>

Yes, I think it makes sense to store the CPU id on the stack.

Additionally, I will move the computation of the cpumask into this workitem,
so the cpumask won't be global.
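Roughly along these lines (untested sketch; rmpopt_pa_start/rmpopt_pa_end and the
one-thread-per-core rmpopt_cpumask are as in this patch, and using a cpumask_var_t
in this path is an assumption on my part):

```c
static void rmpopt_work_handler(struct work_struct *work)
{
	/* Snapshot the CPU id once, on the stack. */
	int this_cpu = raw_smp_processor_id();
	cpumask_var_t mask;
	phys_addr_t pa;

	if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
		return;

	/*
	 * Local copy of the one-thread-per-core mask with the snapshotted
	 * CPU excluded, so there is no global state to restore after
	 * cond_resched() may have migrated this work to another CPU.
	 */
	cpumask_copy(mask, &rmpopt_cpumask);
	cpumask_clear_cpu(this_cpu, mask);

	/* Current CPU does the full scan and populates the cached results. */
	for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G)
		rmpopt(pa);

	for (pa = rmpopt_pa_start; pa < rmpopt_pa_end; pa += SZ_1G) {
		on_each_cpu_mask(mask, rmpopt_smp, (void *)pa, true);

		/* Give a chance for other threads to run. */
		cond_resched();
	}

	free_cpumask_var(mask);
}
```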
>> +}
>> +
>>  void snp_setup_rmpopt(void)
>>  {
>>  	u64 rmpopt_base;
>> @@ -568,9 +656,20 @@ void snp_setup_rmpopt(void)
>>  	if (!cpu_feature_enabled(X86_FEATURE_RMPOPT))
>>  		return;
>>
>> +	/*
>> +	 * Create an RMPOPT-specific workqueue to avoid scheduling
>> +	 * RMPOPT workitem on the global system workqueue.
>> +	 */
>> +	rmpopt_wq = alloc_workqueue("rmpopt_wq", WQ_UNBOUND, 1);
>> +	if (!rmpopt_wq) {
>> +		setup_clear_cpu_cap(X86_FEATURE_RMPOPT);
>> +		return;
>> +	}
>> +
>>  	/*
>>  	 * RMPOPT_BASE MSR is per-core, so only one thread per core needs to
>> -	 * setup RMPOPT_BASE MSR.
>> +	 * setup RMPOPT_BASE MSR. Additionally only one thread per core needs
>> +	 * to issue the RMPOPT instruction.
>>  	 */
>>
>>  	for_each_online_cpu(cpu) {
>> @@ -590,6 +689,20 @@ void snp_setup_rmpopt(void)
>>  	 * up to 2 TB of system RAM on all CPUs.
>>  	 */
>>  	wrmsrq_on_cpus(&rmpopt_cpumask, MSR_AMD64_RMPOPT_BASE, rmpopt_base);
>> +
>> +	INIT_DELAYED_WORK(&rmpopt_delayed_work, rmpopt_work_handler);
>> +
>> +	rmpopt_pa_end = ALIGN(PFN_PHYS(max_pfn), SZ_1G);
>> +
>> +	/* Limit memory scanning to the first 2 TB of RAM */
> 
> I think this is better phrased as "limit memory scanning to 2TB",
> 

Ok.

>> +	if ((rmpopt_pa_end - rmpopt_pa_start) > SZ_2T)
>> +		rmpopt_pa_end = rmpopt_pa_start + SZ_2T;
> 
> and then this could be
> 
>     rmpopt_pa_end = min(rmpopt_pa_end, rmpopt_pa_start + SZ_2T);
>

Yes.
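Will fold both of these in, i.e. something like:

```c
	/* Limit memory scanning to 2TB */
	rmpopt_pa_end = min(rmpopt_pa_end, rmpopt_pa_start + SZ_2T);
```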

Thanks,
Ashish

>> +
>> +	/*
>> +	 * Once all per-CPU RMPOPT tables have been configured, enable RMPOPT
>> +	 * optimizations on all physical memory.
>> +	 */
>> +	queue_delayed_work(rmpopt_wq, &rmpopt_delayed_work, 0);
>>  }
>>  EXPORT_SYMBOL_FOR_MODULES(snp_setup_rmpopt, "ccp");
>>
> 
> Reviewed-by: Ackerley Tng <ackerleytng@google.com>

Thread overview: 18+ messages
2026-04-13 19:42 [PATCH v4 0/7] Add RMPOPT support Ashish Kalra
2026-04-13 19:42 ` [PATCH v4 1/7] x86/cpufeatures: Add X86_FEATURE_AMD_RMPOPT feature flag Ashish Kalra
2026-05-01 16:37   ` Ackerley Tng
2026-04-13 19:42 ` [PATCH v4 2/7] x86/msr: add wrmsrq_on_cpus helper Ashish Kalra
2026-05-01 18:33   ` Ackerley Tng
2026-04-13 19:43 ` [PATCH v4 3/7] x86/sev: Initialize RMPOPT configuration MSRs Ashish Kalra
2026-05-01 18:12   ` Ackerley Tng
2026-05-05 20:04     ` Kalra, Ashish
2026-04-13 19:43 ` [PATCH v4 4/7] x86/sev: Add support to perform RMP optimizations asynchronously Ashish Kalra
2026-05-01 18:57   ` Ackerley Tng
2026-05-05 20:30     ` Kalra, Ashish [this message]
2026-04-13 19:43 ` [PATCH v4 5/7] x86/sev: Add interface to re-enable RMP optimizations Ashish Kalra
2026-05-01 19:04   ` Ackerley Tng
2026-04-13 19:44 ` [PATCH v4 6/7] KVM: SEV: Perform RMP optimizations on SNP guest shutdown Ashish Kalra
2026-05-01 19:12   ` Ackerley Tng
2026-05-05 20:34     ` Kalra, Ashish
2026-04-13 19:44 ` [PATCH v4 7/7] x86/sev: Add debugfs support for RMPOPT Ashish Kalra
2026-04-29 23:07 ` [PATCH v4 0/7] Add RMPOPT support Kalra, Ashish
