All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Zheyun Shen <szy0127@sjtu.edu.cn>,
	pbonzini@redhat.com, tglx@linutronix.de,  kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest
Date: Wed, 6 Mar 2024 08:45:17 -0800	[thread overview]
Message-ID: <ZeidnSrRcRkUe7gh@google.com> (raw)
In-Reply-To: <76744e4b-361d-43ae-9a52-6a410ed57303@amd.com>

On Wed, Mar 06, 2024, Tom Lendacky wrote:
> On 3/5/24 18:19, Sean Christopherson wrote:
> > On Tue, Mar 05, 2024, Tom Lendacky wrote:
> > > On 3/4/24 11:55, Sean Christopherson wrote:
> > > > +Tom
> > > > 
> > > > "KVM: SVM:" for the shortlog scope.
> > > > 
> > > > On Fri, Mar 01, 2024, Zheyun Shen wrote:
> > > > > On AMD CPUs without ensuring cache consistency, each memory page reclamation in
> > > > > an SEV guest triggers a call to wbinvd_on_all_cpus, thereby affecting the
> > > > > performance of other programs on the host.
> > > > > 
> > > > > Typically, an AMD server may have 128 cores or more, while the SEV guest might only
> > > > > utilize 8 of these cores. Meanwhile, host can use qemu-affinity to bind these 8 vCPUs
> > > > > to specific physical CPUs.
> > > > > 
> > > > > Therefore, keeping a record of the physical core numbers each time a vCPU runs
> > > > > can help avoid flushing the cache for all CPUs every time.
> > > > 
> > > > This needs an unequivocal statement from AMD that flushing caches only on CPUs
> > > > that do VMRUN is sufficient.  That sounds like it should be obviously correct,
> > > > as I don't see how else a cache line can be dirtied for the encrypted PA, but
> > > > this entire non-coherent caches mess makes me more than a bit paranoid.
> > > 
> > > As long as the wbinvd_on_all_cpus() related to the ASID flushing isn't
> > > changed, this should be ok. And the code currently flushes the source pages
> > > when doing LAUNCH_UPDATE commands and adding encrypted regions, so should be
> > > good there.
> > 
> > Nice, thanks!
> > 
> > > Would it make sense to make this configurable, with the current behavior the
> > > default, until testing looks good for a while?
> > 
> > I don't hate the idea, but I'm inclined to hit the "I'm feeling lucky" button.
> > I would rather we put in effort to all but guarantee we can do a clean revert in
> > the future, at which point a kill switch doesn't add all that much value.  E.g.
> > it would allow for a non-disruptive fix, and maybe a slightly faster confirmation
> > of a bug, but that's about it.
> > 
> > And since the fallout from this would be host data corruption, _not_ rebooting
> > hosts that may have been corrupted is probably a bad idea, i.e. the whole
> > non-disruptive fix benefit is quite dubious.
> > 
> > The other issue is that it'd be extremely difficult to know when we could/should
> > remove the kill switch.  It might be months or even years before anyone starts
> > running high volume of SEV/SEV-ES VMs with this optimization.
> 
> I can run the next version of the patch through our CI and see if it
> uncovers anything. I just worry about corner cases... but then that's just
> me.

Heh, it's definitely not just you, this scares me more than a bit.

Doh, I realize I misread your suggestion (several times).  You're suggesting we
make this opt-in.  Hmm, that's definitely more valuable than a kill switch, though
it has the same problem of us having no idea when it's safe to enable by default.
And I'm not sure I like the idea of having a knob that basically says, "we think
this works, but we're too scared to enable it by default, so _you_ should totally
enable it and let us know if we've corrupted your data" :-)

  reply	other threads:[~2024-03-06 16:45 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-01  2:30 [PATCH] KVM:SVM: Flush cache only on CPUs running SEV guest Zheyun Shen
2024-03-04 17:55 ` Sean Christopherson
2024-03-05 22:51   ` Tom Lendacky
2024-03-06  0:19     ` Sean Christopherson
2024-03-06 16:26       ` Tom Lendacky
2024-03-06 16:45         ` Sean Christopherson [this message]
  -- strict thread matches above, loose matches on Subject: below --
2024-03-05 14:57 Zheyun Shen
2024-03-05 16:49 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZeidnSrRcRkUe7gh@google.com \
    --to=seanjc@google.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=szy0127@sjtu.edu.cn \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.