From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 359F93845A7 for ; Tue, 2 Jun 2026 20:46:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780433221; cv=none; b=rjYekY3m95HXXOP4hln0S3LZWfMXLiE1XE5nz2IH3n/Obh0ntT+YJQZXYGAGSZe5hFrEQywxmMrLS7nJSONZiQQc/CB7iW9KHCLY8AZaHkbZ+PqQW0NOlvOVTGg+ryQhmqTvoW/Z30VMPnj6SJEuCIvYjwA8UWW4QtY1XlfRsGY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1780433221; c=relaxed/simple; bh=lb2XIbZlTjNGkUPWPpWm95ReJ5WNTIwuCYxz3VxOe7I=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=dUvZDU+DJ6IIIEfvVwyl7BYujFasm6MTPqwHsZaqIDR5fwKZ1bkTxVp6f423IxedS3JIPubbNAFxjZg6VOsIV/ldlqWCUZ1oI87LduY7mOMR+mm7zH/wk730aRZG5XpFVFCqrdn/JZSMV2Q8TdeXWqhfnPQTu6yyVXAxwZrC+Xo= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QURh7Tw2; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QURh7Tw2" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 582D41F00893; Tue, 2 Jun 2026 20:46:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1780433219; bh=MNdrsdm4Xhuoe6Hxf4H2i4bTonAA9JUAmIy3e+Mf3jI=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date; b=QURh7Tw2Mnk4jFYxo56kblxp1j2e6SS+WFDHXRwMaLz3lgl2wDvibUOkc+8p6jXZ7 UFexkkdDcP4RwwoPCPoKoZ/Y6dIag6QvXxGeS/Sh6PvBp5YjSlzrDqfN4KDXfhLa9Z 7Nf/mwEa+BprrW1zk4Qjil7H9YxisKWlW1YqGT8O7fxIHZklbO0saXc3OxfGJAT6iL j5qI0FOpCh7ZTF+1OCI3l7peoCvDrgpYFvQshwXz0SX70TSOYCikz6x8AKsS4d+4Vj RR+wa9ajhdeyvFz6JyMT9an/X6IY8OuOEvke1ddrNJx9NqVXafQc8oEhOIGJZabbKW iAAZXiVdVNY9g== From: sashiko-bot@kernel.org Subject: Re: [PATCH v6 3/6] x86/sev: Add support to perform RMP optimizations asynchronously Reply-To: sashiko-reviews@lists.linux.dev To: "Ashish Kalra" Cc: kvm@vger.kernel.org In-Reply-To: <29adedba9e093ec0c9aaafa006f4e822ac872d8a.1780427587.git.ashish.kalra@amd.com> References: <29adedba9e093ec0c9aaafa006f4e822ac872d8a.1780427587.git.ashish.kalra@amd.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Tue, 02 Jun 2026 20:46:58 +0000 Message-Id: <20260602204659.582D41F00893@smtp.kernel.org> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Thank you for your contribution! Sashiko AI review found 2 potential issue(= s) to consider: - [High] Missing `cond_resched()` in the leader RMPOPT loops inside `rmpopt= _work_handler` can trigger soft lockups. - [High] The inline assembly for `__rmpopt()` fails to mark the RAX registe= r as clobbered or output, which can corrupt the caller's loop variable. -- commit dea16dc76b50a8acabc16915c0d451449f0d1849 Author: Ashish Kalra x86/sev: Add support to perform RMP optimizations asynchronously This patch introduces support for the RMPOPT instruction to minimize RMP ch= eck overhead by skipping 1GB regions not containing SEV-SNP guest memory. It ad= ds a dedicated workqueue to process up to 2TB of RAM asynchronously. > diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c > index 089c9a14edc74..d7e40a5fe5ca6 100644 > --- a/arch/x86/virt/svm/sev.c > +++ b/arch/x86/virt/svm/sev.c [ ... ] > +static inline bool __rmpopt(u64 pa_start, u64 op_type) > +{ > + bool optimized; > + > + asm volatile(".byte 0xf2, 0x0f, 0x01, 0xfc" > + : "=3D@ccc" (optimized) > + : "a" (pa_start), "c" (op_type) > + : "memory", "cc"); [Severity: High] Does this inline assembly need to mark the RAX register as an output or clobber? Like other SEV-SNP instructions like PVALIDATE or RMPUPDATE, RMPOPT returns a status code in RAX. Since the inline assembly block does not mark RAX as modified by using "+a" (pa_start), the compiler is unaware that the register's value has changed. When rmpopt is inlined into the scanning loop, the compiler might map the loop variable pa to RAX. If the hardware overwrites RAX with a status code, could this corrupt the memory scan logic by resetting the physical address being scanned? > + > + return optimized; > +} [ ... ] > + if (cpumask_test_cpu(this_cpu, follower_mask)) { > + /* > + * Current CPU is a primary thread in rmpopt_cpumask. > + * Run leader locally and remove from follower mask. > + */ > + cpumask_clear_cpu(this_cpu, follower_mask); > + > + for (pa =3D rmpopt_pa_start; pa < rmpopt_pa_end; pa +=3D SZ_1G) > + rmpopt(pa); [Severity: High] Could this loop run for a long time without yielding and trigger soft locku= ps? Since rmpopt_pa_end can be up to 2TB, this loop might execute up to 2048 times. The follower loop further down includes a cond_resched(), but this leader loop and the sibling thread leader loop below omit it entirely. > + } else if (cpumask_intersects(topology_sibling_cpumask(this_cpu), > + follower_mask)) { > + /* > + * Current CPU is a sibling thread whose primary is in > + * rmpopt_cpumask. RMPOPT_BASE MSR is per-core, so it > + * is safe to run the leader locally. Remove the sibling's > + * primary from the follower mask as this core is already > + * covered by the leader. > + */ > + cpumask_andnot(follower_mask, follower_mask, > + topology_sibling_cpumask(this_cpu)); > + > + for (pa =3D rmpopt_pa_start; pa < rmpopt_pa_end; pa +=3D SZ_1G) > + rmpopt(pa); > + } else { > + /* > + * Current CPU does not have RMPOPT_BASE MSR programmed. > + * Pick an explicit leader from the cpumask to avoid #UD. > + */ > + int leader_cpu =3D cpumask_first(follower_mask); > + > + if (WARN_ON_ONCE(leader_cpu >=3D nr_cpu_ids)) { > + migrate_enable(); > + goto out; > + } > + > + cpumask_clear_cpu(leader_cpu, follower_mask); > + > + cpus_read_lock(); > + for (pa =3D rmpopt_pa_start; pa < rmpopt_pa_end; pa +=3D SZ_1G) > + smp_call_function_single(leader_cpu, rmpopt_smp, > + (void *)pa, true); [Severity: High] Does this loop also need to yield? It sends an IPI and busy-waits with wait=3Dtrue for the expensive scan up to 2048 times without yielding. --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/cover.1780427587.gi= t.ashish.kalra@amd.com?part=3D3