From: Sean Christopherson <seanjc@google.com>
To: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Zheyun Shen <szy0127@sjtu.edu.cn>,
pbonzini@redhat.com, tglx@linutronix.de, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest
Date: Wed, 6 Mar 2024 08:45:17 -0800 [thread overview]
Message-ID: <ZeidnSrRcRkUe7gh@google.com> (raw)
In-Reply-To: <76744e4b-361d-43ae-9a52-6a410ed57303@amd.com>
On Wed, Mar 06, 2024, Tom Lendacky wrote:
> On 3/5/24 18:19, Sean Christopherson wrote:
> > On Tue, Mar 05, 2024, Tom Lendacky wrote:
> > > On 3/4/24 11:55, Sean Christopherson wrote:
> > > > +Tom
> > > >
> > > > "KVM: SVM:" for the shortlog scope.
> > > >
> > > > On Fri, Mar 01, 2024, Zheyun Shen wrote:
> > > > > On AMD CPUs without ensuring cache consistency, each memory page reclamation in
> > > > > an SEV guest triggers a call to wbinvd_on_all_cpus, thereby affecting the
> > > > > performance of other programs on the host.
> > > > >
> > > > > Typically, an AMD server may have 128 cores or more, while the SEV guest might only
> > > > > utilize 8 of these cores. Meanwhile, host can use qemu-affinity to bind these 8 vCPUs
> > > > > to specific physical CPUs.
> > > > >
> > > > > Therefore, keeping a record of the physical core numbers each time a vCPU runs
> > > > > can help avoid flushing the cache for all CPUs every time.
> > > >
> > > > This needs an unequivocal statement from AMD that flushing caches only on CPUs
> > > > that do VMRUN is sufficient. That sounds like it should be obviously correct,
> > > > as I don't see how else a cache line can be dirtied for the encrypted PA, but
> > > > this entire non-coherent caches mess makes me more than a bit paranoid.
> > >
> > > As long as the wbinvd_on_all_cpus() related to the ASID flushing isn't
> > > changed, this should be ok. And the code currently flushes the source pages
> > > when doing LAUNCH_UPDATE commands and adding encrypted regions, so should be
> > > good there.
> >
> > Nice, thanks!
> >
> > > Would it make sense to make this configurable, with the current behavior the
> > > default, until testing looks good for a while?
> >
> > I don't hate the idea, but I'm inclined to hit the "I'm feeling lucky" button.
> > I would rather we put in effort to all but guarantee we can do a clean revert in
> > the future, at which point a kill switch doesn't add all that much value. E.g.
> > it would allow for a non-disruptive fix, and maybe a slightly faster confirmation
> > of a bug, but that's about it.
> >
> > And since the fallout from this would be host data corruption, _not_ rebooting
> > hosts that may have been corrupted is probably a bad idea, i.e. the whole
> > non-disruptive fix benefit is quite dubious.
> >
> > The other issue is that it'd be extremely difficult to know when we could/should
> > remove the kill switch. It might be months or even years before anyone starts
> > running high volume of SEV/SEV-ES VMs with this optimization.
>
> I can run the next version of the patch through our CI and see if it
> uncovers anything. I just worry about corner cases... but then that's just
> me.
Heh, it's definitely not just you, this scares me more than a bit.
Doh, I realize I misread your suggestion (several times). You're suggesting we
make this opt-in. Hmm, that's definitely more valuable than a kill switch, though
it has the same problem of us having no idea when it's safe to enable by default.
And I'm not sure I like the idea of having a knob that basically says, "we think
this works, but we're too scared to enable it by default, so _you_ should totally
enable it and let us know if we've corrupted your data" :-)
next prev parent reply other threads:[~2024-03-06 16:45 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-01 2:30 [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest Zheyun Shen
2024-03-04 17:55 ` Sean Christopherson
2024-03-05 22:51 ` Tom Lendacky
2024-03-06 0:19 ` Sean Christopherson
2024-03-06 16:26 ` Tom Lendacky
2024-03-06 16:45 ` Sean Christopherson [this message]
-- strict thread matches above, loose matches on Subject: below --
2024-03-05 14:57 Zheyun Shen
2024-03-05 16:49 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZeidnSrRcRkUe7gh@google.com \
--to=seanjc@google.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=szy0127@sjtu.edu.cn \
--cc=tglx@linutronix.de \
--cc=thomas.lendacky@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox