* missing kvm smp tlb flush in invlpg
@ 2009-03-12 17:18 Andrea Arcangeli
2009-03-15 10:35 ` Avi Kivity
2009-03-15 19:23 ` Marcelo Tosatti
0 siblings, 2 replies; 11+ messages in thread
From: Andrea Arcangeli @ 2009-03-12 17:18 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti
From: Andrea Arcangeli <aarcange@redhat.com>
While looking at invlpg out of sync code with Izik I think I noticed a
missing smp tlb flush here. Without this the other cpu can still write
to a freed host physical page. tlb smp flush must happen if
rmap_remove is called always before mmu_lock is released because the
VM will take the mmu_lock before it can finally add the page to the
freelist after swapout. mmu notifier makes it safe to flush the tlb
after freeing the page (otherwise it would never be safe) so we can do
a single flush for multiple sptes invalidated.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
---
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index a0c11ea..855eb71 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -445,6 +445,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
gpa_t pte_gpa = -1;
int level;
u64 *sptep;
+ int need_flush = 0;
spin_lock(&vcpu->kvm->mmu_lock);
@@ -464,6 +465,7 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
rmap_remove(vcpu->kvm, sptep);
if (is_large_pte(*sptep))
--vcpu->kvm->stat.lpages;
+ need_flush = 1;
}
set_shadow_pte(sptep, shadow_trap_nonpresent_pte);
break;
@@ -473,6 +475,8 @@ static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
break;
}
+ if (need_flush)
+ kvm_flush_remote_tlbs(vcpu->kvm);
spin_unlock(&vcpu->kvm->mmu_lock);
if (pte_gpa == -1)
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-12 17:18 missing kvm smp tlb flush in invlpg Andrea Arcangeli
@ 2009-03-15 10:35 ` Avi Kivity
2009-03-15 16:16 ` Andrea Arcangeli
2009-03-15 19:23 ` Marcelo Tosatti
1 sibling, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2009-03-15 10:35 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm, Marcelo Tosatti
Andrea Arcangeli wrote:
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> While looking at invlpg out of sync code with Izik I think I noticed a
> missing smp tlb flush here. Without this the other cpu can still write
> to a freed host physical page. tlb smp flush must happen if
> rmap_remove is called always before mmu_lock is released because the
> VM will take the mmu_lock before it can finally add the page to the
> freelist after swapout. mmu notifier makes it safe to flush the tlb
> after freeing the page (otherwise it would never be safe) so we can do
> a single flush for multiple sptes invalidated.
>
Izik pointed out that for invlpg, the guest is responsible for smp tlb
flushes, and mmu notifiers will protect against pageout.
We still have a couple of holes, though, with the current code:
- tlb loaded with an entry
- guest invlpg
- invlpg code drops the spte and rmap entry
- pageout
- mmu notifiers don't find an rmap entry, so tlb is not flushed
The second hole is much simpler, we need a local invlpg at least. This
doesn't show up on Intel since a vmexit will flush the entire tlb (and
most AMDs have NPT by now).
I think we can fix this without taking the hit of the IPI by
- running a local invlpg()
- making need_flush a vm flag instead of a local
- clearing need_flush whenever remote tlbs are flushed
- flushing remote tlbs on an mmu_notifier call when need_flush is set
Since mmu notifier calls are rare, this would collapse many remote tlb
flushes into one.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 10:35 ` Avi Kivity
@ 2009-03-15 16:16 ` Andrea Arcangeli
2009-03-15 16:19 ` Avi Kivity
0 siblings, 1 reply; 11+ messages in thread
From: Andrea Arcangeli @ 2009-03-15 16:16 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti
On Sun, Mar 15, 2009 at 12:35:48PM +0200, Avi Kivity wrote:
> Izik pointed out that for invlpg, the guest is responsible for smp tlb
> flushes, and mmu notifiers will protect against pageout.
How will mmu notifier protect against pageout if the spte is already
invalid and removed from the rmapp chain? mmu notifier will search the
rmapp chain and it'll find nothing, it'll do nothing, so then the page
will be freed under the other cpus without no ipi flushing their VT
tlbs.
All that mmu notifier does is to protect against pageout until the
mmu_lock is released. So that you can send a single ipi to the other
physical cpus after a flood of rmap_remove, without having to do the
array of pages like arch/x86/include/asm/tlb.h.
This because if the VM was in the process of swapping out that page
while we were inside the mmu_lock protected critical section, the mmu
notifier will force the swap path to take the vcpu->kvm->mmu_lock
first for each kvm instance registered with the mmu notifier. But
after taking that lock, the mmu notifier will do nothing if
rmap_remove already run before the mmu_lock was released (like in this
case). The mmu_lock is just to stop temporarily the swap, so that it
waits the ipi is delivered to all cpus before proceeding freeing the
page. It's up to the kvm code that takes the lock to flush the tlb of
any other running guest, before it is allowed to release the mmu_lock
as far as I can tell.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 16:16 ` Andrea Arcangeli
@ 2009-03-15 16:19 ` Avi Kivity
2009-03-15 16:30 ` Andrea Arcangeli
0 siblings, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2009-03-15 16:19 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm, Marcelo Tosatti
Andrea Arcangeli wrote:
> On Sun, Mar 15, 2009 at 12:35:48PM +0200, Avi Kivity wrote:
>
>> Izik pointed out that for invlpg, the guest is responsible for smp tlb
>> flushes, and mmu notifiers will protect against pageout.
>>
>
> How will mmu notifier protect against pageout if the spte is already
> invalid and removed from the rmapp chain? mmu notifier will search the
> rmapp chain and it'll find nothing, it'll do nothing, so then the page
> will be freed under the other cpus without no ipi flushing their VT
> tlbs.
>
I mentioned this:
> I think we can fix this without taking the hit of the IPI by
> - running a local invlpg()
> - making need_flush a vm flag instead of a local
> - clearing need_flush whenever remote tlbs are flushed
> - flushing remote tlbs on an mmu_notifier call when need_flush is set
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 16:19 ` Avi Kivity
@ 2009-03-15 16:30 ` Andrea Arcangeli
2009-03-15 16:35 ` Avi Kivity
0 siblings, 1 reply; 11+ messages in thread
From: Andrea Arcangeli @ 2009-03-15 16:30 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti
On Sun, Mar 15, 2009 at 06:19:58PM +0200, Avi Kivity wrote:
> I mentioned this:
>
>> I think we can fix this without taking the hit of the IPI by
>> - running a local invlpg()
>> - making need_flush a vm flag instead of a local
>> - clearing need_flush whenever remote tlbs are flushed
>> - flushing remote tlbs on an mmu_notifier call when need_flush is set
Ah so this was a proposed fix for this bug, I thought you were talking
about different bugs, and you didn't acknowledge this as a bug sorry!
About the need_flush that could become a per-vcpu bit too cleared at
every exit so perhaps we'll never have to flush, but it'd need to stay
in the vcpu structure to avoid cacheline bouncing.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 16:30 ` Andrea Arcangeli
@ 2009-03-15 16:35 ` Avi Kivity
2009-03-15 17:05 ` Andrea Arcangeli
0 siblings, 1 reply; 11+ messages in thread
From: Avi Kivity @ 2009-03-15 16:35 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm, Marcelo Tosatti
Andrea Arcangeli wrote:
> Ah so this was a proposed fix for this bug, I thought you were talking
> about different bugs, and you didn't acknowledge this as a bug sorry!
>
>
If ignoring bugs could make them go away...
> About the need_flush that could become a per-vcpu bit too cleared at
> every exit so perhaps we'll never have to flush, but it'd need to stay
> in the vcpu structure to avoid cacheline bouncing.
>
But then we need to set it for all vcpus on every invlpg. I'm assuming
invlpg is much more frequent than mmu notifiers, so it's better to keep
it global.
We've already taken a shared cacheline when we acquired mmu_lock.
btw, it's probably better to apply your patch, then adapt it to the
non-IPIing version; your patch is more suitable for -stable.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 16:35 ` Avi Kivity
@ 2009-03-15 17:05 ` Andrea Arcangeli
2009-03-16 10:16 ` Avi Kivity
0 siblings, 1 reply; 11+ messages in thread
From: Andrea Arcangeli @ 2009-03-15 17:05 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm, Marcelo Tosatti
On Sun, Mar 15, 2009 at 06:35:02PM +0200, Avi Kivity wrote:
> Andrea Arcangeli wrote:
>> Ah so this was a proposed fix for this bug, I thought you were talking
>> about different bugs, and you didn't acknowledge this as a bug sorry!
>>
>>
>
> If ignoring bugs could make them go away...
;)
>> About the need_flush that could become a per-vcpu bit too cleared at
>> every exit so perhaps we'll never have to flush, but it'd need to stay
>> in the vcpu structure to avoid cacheline bouncing.
>>
>
> But then we need to set it for all vcpus on every invlpg. I'm assuming
> invlpg is much more frequent than mmu notifiers, so it's better to keep it
> global.
>
> We've already taken a shared cacheline when we acquired mmu_lock.
Ok.
> btw, it's probably better to apply your patch, then adapt it to the
> non-IPIing version; your patch is more suitable for -stable.
It's up to you, I guess it'll make life easier with the compat code ;).
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-12 17:18 missing kvm smp tlb flush in invlpg Andrea Arcangeli
2009-03-15 10:35 ` Avi Kivity
@ 2009-03-15 19:23 ` Marcelo Tosatti
2009-03-15 20:11 ` Izik Eidus
1 sibling, 1 reply; 11+ messages in thread
From: Marcelo Tosatti @ 2009-03-15 19:23 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm
On Thu, Mar 12, 2009 at 06:18:43PM +0100, Andrea Arcangeli wrote:
> From: Andrea Arcangeli <aarcange@redhat.com>
>
> While looking at invlpg out of sync code with Izik I think I noticed a
> missing smp tlb flush here. Without this the other cpu can still write
> to a freed host physical page. tlb smp flush must happen if
> rmap_remove is called always before mmu_lock is released because the
> VM will take the mmu_lock before it can finally add the page to the
> freelist after swapout. mmu notifier makes it safe to flush the tlb
> after freeing the page (otherwise it would never be safe) so we can do
> a single flush for multiple sptes invalidated.
I think this fix is more expensive than it needs to be, but better than
being unsafe for now.
Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 19:23 ` Marcelo Tosatti
@ 2009-03-15 20:11 ` Izik Eidus
2009-03-16 18:22 ` Marcelo Tosatti
0 siblings, 1 reply; 11+ messages in thread
From: Izik Eidus @ 2009-03-15 20:11 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: Andrea Arcangeli, kvm
Marcelo Tosatti wrote:
> On Thu, Mar 12, 2009 at 06:18:43PM +0100, Andrea Arcangeli wrote:
>
>> From: Andrea Arcangeli <aarcange@redhat.com>
>>
>> While looking at invlpg out of sync code with Izik I think I noticed a
>> missing smp tlb flush here. Without this the other cpu can still write
>> to a freed host physical page. tlb smp flush must happen if
>> rmap_remove is called always before mmu_lock is released because the
>> VM will take the mmu_lock before it can finally add the page to the
>> freelist after swapout. mmu notifier makes it safe to flush the tlb
>> after freeing the page (otherwise it would never be safe) so we can do
>> a single flush for multiple sptes invalidated.
>>
>
> I think this fix is more expensive than it needs to be, but better than
> being unsafe for now.
>
> Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
>
>
What about inside mmu_set_spte():
} else if (pfn != spte_to_pfn(*shadow_pte)) {
pgprintk("hfn old %lx new %lx\n",
spte_to_pfn(*shadow_pte), pfn);
rmap_remove(vcpu->kvm, shadow_pte);
} else
Doesnt this required tlb flush for all the cpus as well?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 17:05 ` Andrea Arcangeli
@ 2009-03-16 10:16 ` Avi Kivity
0 siblings, 0 replies; 11+ messages in thread
From: Avi Kivity @ 2009-03-16 10:16 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: kvm, Marcelo Tosatti
Andrea Arcangeli wrote:
> It's up to you, I guess it'll make life easier with the compat code ;).
I applied it, looking now at reducing the cost.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: missing kvm smp tlb flush in invlpg
2009-03-15 20:11 ` Izik Eidus
@ 2009-03-16 18:22 ` Marcelo Tosatti
0 siblings, 0 replies; 11+ messages in thread
From: Marcelo Tosatti @ 2009-03-16 18:22 UTC (permalink / raw)
To: Izik Eidus; +Cc: Andrea Arcangeli, kvm
On Sun, Mar 15, 2009 at 10:11:29PM +0200, Izik Eidus wrote:
> Marcelo Tosatti wrote:
>> On Thu, Mar 12, 2009 at 06:18:43PM +0100, Andrea Arcangeli wrote:
>>
>>> From: Andrea Arcangeli <aarcange@redhat.com>
>>>
>>> While looking at invlpg out of sync code with Izik I think I noticed a
>>> missing smp tlb flush here. Without this the other cpu can still write
>>> to a freed host physical page. tlb smp flush must happen if
>>> rmap_remove is called always before mmu_lock is released because the
>>> VM will take the mmu_lock before it can finally add the page to the
>>> freelist after swapout. mmu notifier makes it safe to flush the tlb
>>> after freeing the page (otherwise it would never be safe) so we can do
>>> a single flush for multiple sptes invalidated.
>>>
>>
>> I think this fix is more expensive than it needs to be, but better than
>> being unsafe for now.
>>
>> Acked-by: Marcelo Tosatti <mtosatti@redhat.com>
>>
>>
> What about inside mmu_set_spte():
> } else if (pfn != spte_to_pfn(*shadow_pte)) {
> pgprintk("hfn old %lx new %lx\n",
> spte_to_pfn(*shadow_pte), pfn);
> rmap_remove(vcpu->kvm, shadow_pte);
> } else
>
> Doesnt this required tlb flush for all the cpus as well?
Probably. This particular condition can only happen without mmu
notifiers, and when doing mmap(MADV_DONTNEED), though.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2009-03-16 18:23 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-12 17:18 missing kvm smp tlb flush in invlpg Andrea Arcangeli
2009-03-15 10:35 ` Avi Kivity
2009-03-15 16:16 ` Andrea Arcangeli
2009-03-15 16:19 ` Avi Kivity
2009-03-15 16:30 ` Andrea Arcangeli
2009-03-15 16:35 ` Avi Kivity
2009-03-15 17:05 ` Andrea Arcangeli
2009-03-16 10:16 ` Avi Kivity
2009-03-15 19:23 ` Marcelo Tosatti
2009-03-15 20:11 ` Izik Eidus
2009-03-16 18:22 ` Marcelo Tosatti
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox