From: Avi Kivity <avi@redhat.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>, kvm@vger.kernel.org
Subject: Re: [PATCH] KVM: Defer remote tlb flushes on invlpg (v2)
Date: Thu, 19 Mar 2009 12:27:22 +0200 [thread overview]
Message-ID: <49C21E0A.9070605@redhat.com> (raw)
In-Reply-To: <20090318225313.GA23082@random.random>
Andrea Arcangeli wrote:
>> static int kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp)
>> {
>> + bool need_flush;
>> +
>> if (sp->role.glevels != vcpu->arch.mmu.root_level) {
>> kvm_mmu_zap_page(vcpu->kvm, sp);
>> return 1;
>> }
>>
>> - if (rmap_write_protect(vcpu->kvm, sp->gfn))
>> - kvm_flush_remote_tlbs(vcpu->kvm);
>> + need_flush = rmap_write_protect(vcpu->kvm, sp->gfn);
>> + kvm_flush_remote_tlbs_cond(vcpu->kvm, need_flush);
>>
>
> Why exactly do we need to flush remote and local tlbs here (in the cr3
> overwrite path which is a local flush) if no change happened to sptes
> related to the current process? How is it relevant that a previous
> invlpg has run and we did only a local flush at the time invlpg run?
>
An invlpg on a remote cpu may have not done a local_tlb_flush() because
the spte was not present. So we have a poisoned tlb remotely which needs
to be flushed when protecting pages.
However a better fix for this is to make the local tlb flush on invlpg
unconditional.
(historical note - the kvm mmu used to depend on the shadow page tables
and guest page tables being consistent, so we were pretty paranoid about
guest tlb management bugs (or exploits) killing the host. but with the
gfn tables maintained in shadow pages, that's no longer the case)
> Same here.
>
> Same here.
>
>
Same explanations apply.
>> @@ -475,8 +475,10 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
>> break;
>> }
>>
>> - if (need_flush)
>> - kvm_flush_remote_tlbs(vcpu->kvm);
>> + if (need_flush) {
>> + vcpu->kvm->remote_tlbs_dirty = true;
>> + kvm_x86_ops->tlb_flush(vcpu);
>> + }
>> spin_unlock(&vcpu->kvm->mmu_lock);
>>
>> if (pte_gpa == -1)
>>
>
> I think it'd be more consistent to set_bit(KVM_REQ_TLB_FLUSH,
> &vcpu->requests) instead of invoking tlb_flush from a different place.
>
Agree, it can also fold an IPI as the remote cpu will notice the request
bit is set.
>> @@ -758,10 +758,22 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)
>>
>> void kvm_flush_remote_tlbs(struct kvm *kvm)
>> {
>> + kvm->remote_tlbs_dirty = false;
>> + wmb();
>> if (make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
>> ++kvm->stat.remote_tlb_flush;
>> }
>>
>> +void kvm_flush_remote_tlbs_cond(struct kvm *kvm, bool cond)
>> +{
>> + if (!cond) {
>> + rmb();
>> + cond = kvm->remote_tlbs_dirty;
>> + }
>> + if (cond)
>> + kvm_flush_remote_tlbs(kvm);
>> +}
>> +
>> void kvm_reload_remote_mmus(struct kvm *kvm)
>> {
>> make_all_cpus_request(kvm, KVM_REQ_MMU_RELOAD);
>> @@ -840,8 +852,9 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
>> need_tlb_flush = kvm_unmap_hva(kvm, address);
>> spin_unlock(&kvm->mmu_lock);
>>
>> + rmb();
>> /* we've to flush the tlb before the pages can be freed */
>> - if (need_tlb_flush)
>> + if (need_tlb_flush || kvm->remote_tlbs_dirty)
>> kvm_flush_remote_tlbs(kvm);
>>
>> }
>> @@ -865,8 +878,9 @@ static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
>> need_tlb_flush |= kvm_unmap_hva(kvm, start);
>> spin_unlock(&kvm->mmu_lock);
>>
>> + rmb();
>> /* we've to flush the tlb before the pages can be freed */
>> - if (need_tlb_flush)
>> + if (need_tlb_flush || kvm->remote_tlbs_dirty)
>> kvm_flush_remote_tlbs(kvm);
>> }
>>
>
> I think the only memory barrier required is a smp_mb() between setting
> remote_tlbs_dirty true and make_all_cpus_request.
>
> All other memory barriers seems unnecessary. It can't be a invlpg
> relative to the page the mmu notifier is working on that is setting
> remote_tlbs_dirty after mmu_lock is released in the mmu notifier
> method, so as long as the remote_tlbs_dirty bit isn't lost we're
> ok. We basically have the remote_tlbs_dirty randomly going up from a
> mmu_lock protected section and the mmu notifier only must make sure
> that after clearing it, it always flushes (so only race is if we flush
> before clearing the flag).
>
> However the setting must be done with set_bit and the clearing with
> test_and_clear_bit, otherwise when one cpus sets it inside mmu_lock
> protected section and other cpu clears it without any lock held from
> the mmu notifier, the result is undefined. AFIK without lock prefix on
> the mov instruction what can happen is like:
>
> cpu0 cpu1
> -------- ---------
> remote_tlbs_dirty = 0
> remote_tlbs_dirty = 1
> remote_tlbs_dirty == 0
>
> I don't think it only applies to bitops, surely it would more
> obviously apply to bitops but I think it also applies to plain mov.
>
I think we no longer need remote_tlbs_dirty. I'll send a new version.
--
error compiling committee.c: too many arguments to function
next prev parent reply other threads:[~2009-03-19 10:27 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-03-16 11:07 [PATCH] KVM: Defer remote tlb flushes on invlpg (v2) Avi Kivity
2009-03-17 19:47 ` Marcelo Tosatti
2009-03-18 0:41 ` Andi Kleen
2009-03-18 6:23 ` Avi Kivity
2009-03-18 22:53 ` Andrea Arcangeli
2009-03-19 10:27 ` Avi Kivity [this message]
2009-03-19 10:29 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=49C21E0A.9070605@redhat.com \
--to=avi@redhat.com \
--cc=aarcange@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=mtosatti@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox