From: Sean Christopherson <seanjc@google.com>
To: Mingwei Zhang <mizhang@google.com>
Cc: Jim Mattson <jmattson@google.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"H. Peter Anvin" <hpa@zytor.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Ben Gardon <bgardon@google.com>
Subject: Re: [PATCH] KVM: x86/mmu: Remove KVM MMU write lock when accessing indirect_shadow_pages
Date: Tue, 6 Jun 2023 16:07:05 -0700
Message-ID: <ZH+8GafaNLYPvTJI@google.com>
In-Reply-To: <CAL715WKtsC=93Nqr7QJZxspWzF04_CLqN3FUxUaqTHWFRUrwBA@mail.gmail.com>

On Tue, Jun 06, 2023, Mingwei Zhang wrote:
> > > >
> > > > I don't understand the need for READ_ONCE() here. That implies that
> > > > there is something tricky going on, and I don't think that's the case.
> > >
> > > READ_ONCE() is just telling the compiler not to remove the read. Since
> > > this is reading a global variable, the compiler might just read a
> > > previous copy if the value has already been read into a local
> > > variable. But that is not the case here...
> > >
> > > Note I see there is another READ_ONCE for
> > > kvm->arch.indirect_shadow_pages, so I am reusing the same thing.
> >
> > I agree with Jim, using READ_ONCE() doesn't make any sense. I suspect it may have
> > been a misguided attempt to force the memory read to be as close to the write_lock()
> > as possible, e.g. to minimize the chance of a false negative.
>
> Sean :) Your suggestion is the opposite of Jim's. He is suggesting
> doing nothing, but your suggestion is doing way more than READ_ONCE().

Not really.  Jim is asserting that the READ_ONCE() is pointless, and I completely
agree.  I am also saying that I think there is a real memory ordering issue here,
and that it was being papered over by the READ_ONCE() in kvm_mmu_pte_write().

> > So I think this?
>
> Hmm. I agree with both points above, but below, the change seems too
> heavyweight. smp_wb() is a mfence(), i.e., serializing all
> loads/stores before the instruction. Doing that for every shadow page
> creation and destruction seems a lot.

No, the smp_*b() variants are just compiler barriers on x86.

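To make "just a compiler barrier" concrete, here is a minimal userspace sketch,
assuming GCC/Clang inline asm.  The barrier() macro mirrors the kernel's
definition; payload, ready, publish() and consume() are hypothetical names used
only for illustration.  On x86 this is what smp_rmb() and smp_wmb() boil down
to.  To my knowledge smp_mb() is a locked instruction rather than a pure
compiler barrier, but it is still not an MFENCE.

```c
/*
 * Userspace stand-in for the kernel's barrier(): an empty asm statement with
 * a "memory" clobber.  It emits no instruction; it only forbids the compiler
 * from caching values in registers or reordering memory accesses across it.
 */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical example data, not taken from KVM. */
static int payload;
static int ready;

static void publish(int value)
{
	payload = value;
	barrier();	/* compiler emits the payload store before the flag store */
	ready = 1;
}

static int consume(void)
{
	if (!ready)
		return -1;
	barrier();	/* compiler does not hoist the payload load above the flag load */
	return payload;
}
```

The sketch only constrains the compiler.  It relies on x86 not reordering
stores with other stores or loads with other loads; a weakly ordered
architecture would need real fence instructions in the same spots.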
> In fact, the case that only matters is '0->1' which may potentially
> confuse kvm_mmu_pte_write() when it reads 'indirect_shadow_count', but
> the majority of the cases are 'X => X + 1' where X != 0. So, those
> cases do not matter. So, if we want to add barriers, we only need it
> for 0->1. Maybe creating a new variable and not blocking
> account_shadow() and unaccount_shadow() is a better idea?
>
> Regardless, the above problem is related to interactions among
> account_shadow(), unaccount_shadow() and kvm_mmu_pte_write(). It has
> nothing to do with the 'reexecute_instruction()', which is what this
> patch is about. So, I think having a READ_ONCE() for
> reexecute_instruction() should be good enough. What do you think?

The reexecute_instruction() case should be fine without any fanciness, it's
nothing more than a heuristic, i.e. neither a false positive nor a false negative
will impact functional correctness, and nothing changes regardless of how many
times the compiler reads the variable outside of mmu_lock.
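
As a rough userspace sketch of that point (not the actual KVM code): the struct
and both helper names below are hypothetical, and READ_ONCE() is a simplified
stand-in for the kernel macro.  For a heuristic check both forms yield the same
answer; the volatile access only constrains how the compiler performs the load.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel's READ_ONCE(): force exactly one load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the relevant field of struct kvm_arch. */
struct kvm_arch_sketch {
	unsigned int indirect_shadow_pages;
};

/* Plain read: the compiler may reload or reuse the value as it sees fit. */
static bool has_indirect_shadow_pages(struct kvm_arch_sketch *arch)
{
	return arch->indirect_shadow_pages != 0;
}

/* READ_ONCE() read: a single, non-elidable load, but no ordering guarantee. */
static bool has_indirect_shadow_pages_once(struct kvm_arch_sketch *arch)
{
	return READ_ONCE(arch->indirect_shadow_pages) != 0;
}
```

Neither variant orders the read against other memory accesses, which is why
READ_ONCE() buys nothing when a stale answer is harmless.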

I was thinking that it would be better to have a single helper to locklessly
access indirect_shadow_pages, but I agree that applying the barriers to
reexecute_instruction() introduces a different kind of confusion.

Want to post a v2 of yours without a READ_ONCE(), and I'll post a separate fix
for the theoretical kvm_mmu_pte_write() race?  And then Paolo can tell me that
there's no race and school me on lockless programming once more ;-)

Thread overview: 17+ messages
2023-06-05 0:43 [PATCH] KVM: x86/mmu: Remove KVM MMU write lock when accessing indirect_shadow_pages Mingwei Zhang
2023-06-05 16:55 ` Jim Mattson
2023-06-05 17:17 ` Ben Gardon
2023-06-05 17:53 ` Mingwei Zhang
2023-06-05 18:27 ` Paolo Bonzini
2023-06-05 17:42 ` Mingwei Zhang
2023-06-05 18:11 ` Jim Mattson
2023-06-05 18:23 ` Mingwei Zhang
2023-06-05 18:25 ` Sean Christopherson
2023-06-06 22:46 ` Mingwei Zhang
2023-06-06 22:48 ` Mingwei Zhang
2023-06-06 23:07 ` Sean Christopherson [this message]
2023-06-07 0:23 ` Mingwei Zhang
2023-06-07 0:28 ` Sean Christopherson
2023-06-15 23:57 ` Mingwei Zhang
2023-06-26 17:38 ` Jim Mattson
2023-06-26 20:42 ` Sean Christopherson