From: Sean Christopherson <seanjc@google.com>
To: Zheyun Shen <szy0127@sjtu.edu.cn>
Cc: Tom Lendacky <thomas.lendacky@amd.com>,
pbonzini@redhat.com, tglx@linutronix.de,
Kevin Loughlin <kevinloughlin@google.com>,
mingo@redhat.com, bp@alien8.de, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v7 3/3] KVM: SVM: Flush cache only on CPUs running SEV guest
Date: Wed, 26 Feb 2025 23:58:44 +0000 [thread overview]
Message-ID: <Z7-qtIjS0bdc5J0r@google.com> (raw)
In-Reply-To: <D32EF18F-7C4D-466B-9058-1EBD4C378EFC@sjtu.edu.cn>
On Wed, Feb 26, 2025, Zheyun Shen wrote:
> I'm very sorry that the formatting of my previous email was messed up due to
> an issue with the email client. I am sending a new email with the same
> content.
Something is still not quite right, as your mails aren't hitting lore.kernel.org,
i.e. are getting dropped by the lists.
> > Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Jan 28, 2025, Zheyun Shen wrote:
> > Changed the time of recording the CPUs from pre_sev_run() to vcpu_load().
> >
> > Why? If there's a good reason, then that absolutely, positively belongs in the
> > changelog and in the code as a comment. If there's no good reason, then...
> >
> The reason I moved the timing of CPU recording from pre_sev_run() to
> vcpu_load() is that vcpu_load() is always present in the call path of
> kvm_arch_vcpu_ioctl_run(). Moreover, whenever a vCPU migration occurs, the
> control flow reaches vcpu_load() again, which keeps the CPU recording
> correct. On the other hand, recording the information in pre_sev_run()
> would record the CPU number every time before entering the guest. Without
> vCPU migration, only the first record is effective; the subsequent records
> are redundant and waste time, making each VM exit slightly longer (though
> the added time may be very short).
So long as KVM performs a lockless test to see if the CPU is already set in the
mask, the cost should be minimal. This:
if (!cpumask_test_cpu(cpu, to_kvm_sev_info(kvm)->have_run_cpus))
cpumask_set_cpu(cpu, to_kvm_sev_info(kvm)->have_run_cpus);
Generates a bt+jb to guard the locked bts, i.e. *should* only add 1-2 uops to the
entry flow, once the CPU's predictors have warmed up enough to not prefetch the
RMW and bounce the cache line.
0x0000000000016603 <+67>: bt %r8,0xa428(%rdx)
0x000000000001660b <+75>: jb 0x16616 <pre_sev_run+86>
0x000000000001660d <+77>: lock bts %r8,0xa428(%rdx)
If it turns out that marking a CPU as having run that ASID becomes a hot spot,
then I'm definitely open to moving things around. But for a first go at this,
and especially without evidence that it's problematic, I want to go with the
approach that's most obviously correct.
> > Unless I hear otherwise, my plan is to move this back to pre_sev_run().
> >
>
> Another issue in the v3 version is that I incorrectly cleared the recorded
> mask after each cache flushing. The mask bits should be cleared and changed
> at the time of vCPU migration rather than after a cache flushing.
The bits can't be cleared at vCPU migration; they can only be cleared when a
flush is issued. A vCPU that did VMRUN with an ASID in the past could still
have dirty data cached for that ASID.
KVM could send an IPI to the previous CPU to do a cache flush on migration,
similar to what KVM already does on VMX to VMCLEAR the VMCS. But the math and
usage for WB(NO)INVD are wildly different. For VMCLEAR, the IPI approach is the
lazy approach; KVM *must* VMCLEAR the VMCS if the CPU changes, so using an IPI
to defer VMCLEAR allows KVM to skip doing VMCLEAR whenever a vCPU is scheduled out.
For WB(NO)INVD, the IPI approach would be an eager approach. Deferring WB(NO)INVD
until it's necessary allows KVM to skip the WB(NO)INVD when a vCPU is migrated.
In short, no optimization is obviously better than another at this point. E.g.
if a VM is not hard pinned 1:1, but vCPUs are affined to certain cores, then
flushing on migration is a terrible idea because the total set of CPUs that need
to flush caches is more or less static.
Anyways, as above, I definitely want to go with the simplest implementation
(make the bitmap "sticky"), and then iterate, optimize, and add complexity if and
only if it's truly justified.
2025-01-28 1:53 [PATCH v7 0/3] KVM: SVM: Flush cache only on CPUs running SEV guest Zheyun Shen
2025-01-28 1:53 ` [PATCH v7 1/3] KVM: x86: Add a wbinvd helper Zheyun Shen
2025-02-06 22:03 ` Tom Lendacky
2025-02-26 0:59 ` Sean Christopherson
2025-01-28 1:53 ` [PATCH v7 2/3] KVM: SVM: Remove wbinvd in sev_vm_destroy() Zheyun Shen
2025-02-06 22:04 ` Tom Lendacky
2025-01-28 1:53 ` [PATCH v7 3/3] KVM: SVM: Flush cache only on CPUs running SEV guest Zheyun Shen
2025-02-06 22:05 ` Tom Lendacky
2025-02-26 1:20 ` Sean Christopherson
2025-02-26 3:26 ` Zheyun Shen
2025-02-26 23:58 ` Sean Christopherson [this message]
2025-02-26 1:37 ` [PATCH v7 0/3] " Sean Christopherson