* [PATCH 0/2] KVM PDPTR reload fixes review
@ 2009-05-25 8:47 Avi Kivity
2009-05-25 8:47 ` [PATCH 1/2] KVM: Make paravirt tlb flush also reload PAE PDPTEs Avi Kivity
2009-05-25 8:47 ` [PATCH 2/2] KVM: Make CR4 reloads also reload CR3 if paging is enabled Avi Kivity
0 siblings, 2 replies; 7+ messages in thread
From: Avi Kivity @ 2009-05-25 8:47 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti
The upcoming F11 release switches the default 32-bit kernel to PAE, and
also uses the fancy priority-inheritance (PI) futexes. This combination
exposed bugs in KVM's handling of PDPTR reloads.
There's nothing complicated in these patches, but I'd like to submit them
for 2.6.30-rc7 as they fix an important guest. So please review them carefully.
Avi Kivity (2):
KVM: Make paravirt tlb flush also reload PAE PDPTEs
KVM: Make CR4 reloads also reload CR3 if paging is enabled
arch/x86/kvm/mmu.c | 3 +--
arch/x86/kvm/x86.c | 6 +++++-
2 files changed, 6 insertions(+), 3 deletions(-)
* [PATCH 1/2] KVM: Make paravirt tlb flush also reload PAE PDPTEs
2009-05-25 8:47 [PATCH 0/2] KVM PDPTR reload fixes review Avi Kivity
@ 2009-05-25 8:47 ` Avi Kivity
2009-05-25 11:59 ` Marcelo Tosatti
2009-05-25 8:47 ` [PATCH 2/2] KVM: Make CR4 reloads also reload CR3 if paging is enabled Avi Kivity
1 sibling, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2009-05-25 8:47 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti
The paravirt tlb flush may be used not only to flush TLBs, but also
to reload the four page-directory-pointer-table entries, as it is used
as a replacement for reloading CR3. Change the code to do the entire
CR3 reloading dance instead of simply flushing the TLB.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/mmu.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6880a4c..7030b5f 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2927,8 +2927,7 @@ static int kvm_pv_mmu_write(struct kvm_vcpu *vcpu,
 
 static int kvm_pv_mmu_flush_tlb(struct kvm_vcpu *vcpu)
 {
-	kvm_x86_ops->tlb_flush(vcpu);
-	set_bit(KVM_REQ_MMU_SYNC, &vcpu->requests);
+	kvm_set_cr3(vcpu, vcpu->arch.cr3);
 	return 1;
 }
--
1.6.0.6
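
A minimal sketch of why a TLB flush alone is not a substitute for a CR3
reload under PAE, assuming the processor keeps a private copy of the four
PDPTEs that is refreshed only when CR3 is reloaded. This is a toy model, not
KVM code: struct toy_vcpu and the toy_* helpers are hypothetical names
introduced here for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_PDPTES 4

/* Toy vCPU: guest memory holds the live PDPT, the "CPU" keeps a cached copy. */
struct toy_vcpu {
	uint64_t guest_pdpt[NR_PDPTES];    /* what the guest wrote in memory */
	uint64_t cached_pdptes[NR_PDPTES]; /* what translation actually uses */
	unsigned int tlb_flushes;
};

/* Flush-only behaviour: the cached PDPTEs are left untouched. */
static void toy_flush_tlb_only(struct toy_vcpu *v)
{
	v->tlb_flushes++;
}

/* CR3-reload behaviour: re-read the four PDPTEs, then flush the TLB. */
static void toy_reload_cr3(struct toy_vcpu *v)
{
	for (int i = 0; i < NR_PDPTES; i++)
		v->cached_pdptes[i] = v->guest_pdpt[i];
	v->tlb_flushes++;
}

/* True if translation would still use an entry the guest has since changed. */
static bool toy_pdptes_stale(const struct toy_vcpu *v)
{
	for (int i = 0; i < NR_PDPTES; i++)
		if (v->cached_pdptes[i] != v->guest_pdpt[i])
			return true;
	return false;
}
```

In this model, a guest that clears guest_pdpt[0] and then calls
toy_flush_tlb_only() still translates through the old entry; only
toy_reload_cr3() picks up the change, which is the behaviour the patch
gives the paravirt flush.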
* [PATCH 2/2] KVM: Make CR4 reloads also reload CR3 if paging is enabled
2009-05-25 8:47 [PATCH 0/2] KVM PDPTR reload fixes review Avi Kivity
2009-05-25 8:47 ` [PATCH 1/2] KVM: Make paravirt tlb flush also reload PAE PDPTEs Avi Kivity
@ 2009-05-25 8:47 ` Avi Kivity
2009-05-25 11:58 ` Marcelo Tosatti
1 sibling, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2009-05-25 8:47 UTC (permalink / raw)
To: kvm; +Cc: Marcelo Tosatti
The processor is documented to reload the PDPTRs while in PAE mode if any
of the CR4 bits PSE, PGE, or PAE change. Linux relies on this
behaviour when zapping the low mappings of PAE kernels during boot.
The code already handled changes to CR4.PAE; augment it to also notice changes
to PSE and PGE.
This triggered while booting an F11 PAE kernel; the futex initialization code
runs before any CR3 reloads and writes to a NULL pointer; the futex subsystem
ended up uninitialized, killing PI futexes and pulseaudio which uses them.
Signed-off-by: Avi Kivity <avi@redhat.com>
---
arch/x86/kvm/x86.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8b5d1e5..6d44dd5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -341,6 +341,9 @@ EXPORT_SYMBOL_GPL(kvm_lmsw);
 
 void kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 {
+	unsigned long old_cr4 = vcpu->arch.cr4;
+	unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE;
+
 	if (cr4 & CR4_RESERVED_BITS) {
 		printk(KERN_DEBUG "set_cr4: #GP, reserved bits\n");
 		kvm_inject_gp(vcpu, 0);
@@ -354,7 +357,8 @@ void kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
 			kvm_inject_gp(vcpu, 0);
 			return;
 		}
-	} else if (is_paging(vcpu) && !is_pae(vcpu) && (cr4 & X86_CR4_PAE)
+	} else if (is_paging(vcpu) && (cr4 & X86_CR4_PAE)
+		   && ((cr4 ^ old_cr4) & pdptr_bits)
 		   && !load_pdptrs(vcpu, vcpu->arch.cr3)) {
 		printk(KERN_DEBUG "set_cr4: #GP, pdptrs reserved bits\n");
 		kvm_inject_gp(vcpu, 0);
--
1.6.0.6
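
The change in the reload condition can be sketched as two standalone
predicates. This is an illustration under stated assumptions, not the KVM
code itself: the vCPU is assumed not to be in long mode, the function names
are hypothetical, and the CR4 bit values are defined locally (PSE is bit 4,
PAE is bit 5, PGE is bit 7) rather than taken from kernel headers.

```c
#include <stdbool.h>

/* Architectural CR4 bit positions, defined locally for this sketch. */
#define X86_CR4_PSE (1UL << 4)
#define X86_CR4_PAE (1UL << 5)
#define X86_CR4_PGE (1UL << 7)

/* Before the patch: reload PDPTRs only when PAE goes from 0 to 1. */
static bool cr4_write_reloads_pdptrs_old(unsigned long old_cr4,
					 unsigned long new_cr4, bool paging)
{
	return paging && !(old_cr4 & X86_CR4_PAE) && (new_cr4 & X86_CR4_PAE);
}

/* After the patch: reload when PAE is set and any of PGE/PSE/PAE changed. */
static bool cr4_write_reloads_pdptrs_new(unsigned long old_cr4,
					 unsigned long new_cr4, bool paging)
{
	unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE;

	return paging && (new_cr4 & X86_CR4_PAE)
	       && ((new_cr4 ^ old_cr4) & pdptr_bits);
}
```

The case that matters for the reported failure is a PAE guest toggling
CR4.PGE: with old_cr4 = PAE and new_cr4 = PAE | PGE, the old predicate is
false and the new one is true.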
* Re: [PATCH 2/2] KVM: Make CR4 reloads also reload CR3 if paging is enabled
2009-05-25 8:47 ` [PATCH 2/2] KVM: Make CR4 reloads also reload CR3 if paging is enabled Avi Kivity
@ 2009-05-25 11:58 ` Marcelo Tosatti
2009-05-25 15:08 ` Avi Kivity
0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2009-05-25 11:58 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
On Mon, May 25, 2009 at 11:47:24AM +0300, Avi Kivity wrote:
> The processor is documented to reload the PDPTRs while in PAE mode if any
> of the CR4 bits PSE, PGE, or PAE change. Linux relies on this
> behaviour when zapping the low mappings of PAE kernels during boot.
>
> The code already handled changes to CR4.PAE; augment it to also notice changes
> to PSE and PGE.
>
> This triggered while booting an F11 PAE kernel; the futex initialization code
> runs before any CR3 reloads and writes to a NULL pointer; the futex subsystem
> ended up uninitialized, killing PI futexes and pulseaudio which uses them.
One comment regarding set_cr0. Section 8.1 of the TLB doc says:
* The processor does not maintain a PDP cache as described in
Section 4. The processor always caches information from the four
page-directory-pointer-table entries. These entries are not cached at
the time of address translation. Instead, they are always cached as part
of the execution of the following instructions:
* A MOV to CR0 that modifies CR0.PG and that occurs with IA32_EFER.LMA = 0
and CR4.PAE = 1.
However kvm_set_cr0 only caches the PDPTRs if CR0.PG changed from 0->1.
Can't see a problem there though.
Also, the checks in kvm_arch_vcpu_ioctl_set_sregs should probably be unified
with kvm_set_crX, to avoid future mistakes.
Otherwise, ACK.
>
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
> arch/x86/kvm/x86.c | 6 +++++-
> 1 files changed, 5 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8b5d1e5..6d44dd5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -341,6 +341,9 @@ EXPORT_SYMBOL_GPL(kvm_lmsw);
>
> void kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> {
> + unsigned long old_cr4 = vcpu->arch.cr4;
> + unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE | X86_CR4_PAE;
> +
> if (cr4 & CR4_RESERVED_BITS) {
> printk(KERN_DEBUG "set_cr4: #GP, reserved bits\n");
> kvm_inject_gp(vcpu, 0);
> @@ -354,7 +357,8 @@ void kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
> kvm_inject_gp(vcpu, 0);
> return;
> }
> - } else if (is_paging(vcpu) && !is_pae(vcpu) && (cr4 & X86_CR4_PAE)
> + } else if (is_paging(vcpu) && (cr4 & X86_CR4_PAE)
> + && ((cr4 ^ old_cr4) & pdptr_bits)
> && !load_pdptrs(vcpu, vcpu->arch.cr3)) {
> printk(KERN_DEBUG "set_cr4: #GP, pdptrs reserved bits\n");
> kvm_inject_gp(vcpu, 0);
> --
> 1.6.0.6
* Re: [PATCH 1/2] KVM: Make paravirt tlb flush also reload PAE PDPTEs
2009-05-25 8:47 ` [PATCH 1/2] KVM: Make paravirt tlb flush also reload PAE PDPTEs Avi Kivity
@ 2009-05-25 11:59 ` Marcelo Tosatti
2009-05-25 15:09 ` Avi Kivity
0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2009-05-25 11:59 UTC (permalink / raw)
To: Avi Kivity; +Cc: kvm
On Mon, May 25, 2009 at 11:47:23AM +0300, Avi Kivity wrote:
> The paravirt tlb flush may be used not only to flush TLBs, but also
> to reload the four page-directory-pointer-table entries, as it is used
> as a replacement for reloading CR3. Change the code to do the entire
> CR3 reloading dance instead of simply flushing the TLB.
Ugh, my bad.
ACK.
>
> Signed-off-by: Avi Kivity <avi@redhat.com>
> ---
> arch/x86/kvm/mmu.c | 3 +--
> 1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 6880a4c..7030b5f 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2927,8 +2927,7 @@ static int kvm_pv_mmu_write(struct kvm_vcpu *vcpu,
>
> static int kvm_pv_mmu_flush_tlb(struct kvm_vcpu *vcpu)
> {
> - kvm_x86_ops->tlb_flush(vcpu);
> - set_bit(KVM_REQ_MMU_SYNC, &vcpu->requests);
> + kvm_set_cr3(vcpu, vcpu->arch.cr3);
> return 1;
> }
>
> --
> 1.6.0.6
* Re: [PATCH 2/2] KVM: Make CR4 reloads also reload CR3 if paging is enabled
2009-05-25 11:58 ` Marcelo Tosatti
@ 2009-05-25 15:08 ` Avi Kivity
0 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2009-05-25 15:08 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm
Marcelo Tosatti wrote:
> On Mon, May 25, 2009 at 11:47:24AM +0300, Avi Kivity wrote:
>
>> The processor is documented to reload the PDPTRs while in PAE mode if any
>> of the CR4 bits PSE, PGE, or PAE change. Linux relies on this
>> behaviour when zapping the low mappings of PAE kernels during boot.
>>
>> The code already handled changes to CR4.PAE; augment it to also notice changes
>> to PSE and PGE.
>>
>> This triggered while booting an F11 PAE kernel; the futex initialization code
>> runs before any CR3 reloads and writes to a NULL pointer; the futex subsystem
>> ended up uninitialized, killing PI futexes and pulseaudio which uses them.
>>
>
> One comment regarding set_cr0. Section 8.1 of the TLB doc says:
>
> * The processor does not maintain a PDP cache as described in
> Section 4. The processor always caches information from the four
> page-directory-pointer-table entries. These entries are not cached at
> the time of address translation. Instead, they are always cached as part
> of the execution of the following instructions:
>
> * A MOV to CR0 that modifies CR0.PG and that occurs with IA32_EFER.LMA = 0
> and CR4.PAE = 1.
>
> However kvm_set_cr0 only caches the PDPTRs if CR0.PG changed from 0->1.
> Can't see a problem there though.
>
Yes, if cr0.pg == 0, then the pdptrs are meaningless.
> Also, the checks in kvm_arch_vcpu_ioctl_set_sregs should probably be unified
> with kvm_set_crX, to avoid future mistakes.
>
Yes please. But it needs to be done very carefully, since the order of
the checks matters here.
> Otherwise, ACK.
>
Thanks for the review.
--
error compiling committee.c: too many arguments to function
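
The CR0 point discussed above can be sketched as two predicates: the rule as
quoted from the TLB document (any change to CR0.PG with IA32_EFER.LMA = 0 and
CR4.PAE = 1) and the narrower 0->1 check attributed to kvm_set_cr0 in this
thread. The function names are hypothetical and the CR0.PG value (bit 31) is
defined locally; this models the discussion only, not the actual KVM source.

```c
#include <stdbool.h>

/* CR0.PG is bit 31; defined locally for this sketch. */
#define X86_CR0_PG (1UL << 31)

/* Rule as quoted from the TLB document: any change to CR0.PG counts. */
static bool cr0_rule_as_quoted(unsigned long old_cr0, unsigned long new_cr0,
			       bool cr4_pae, bool efer_lma)
{
	return ((old_cr0 ^ new_cr0) & X86_CR0_PG) && cr4_pae && !efer_lma;
}

/* Check described for kvm_set_cr0 in the thread: only a 0->1 transition. */
static bool cr0_rule_pg_enable_only(unsigned long old_cr0,
				    unsigned long new_cr0,
				    bool cr4_pae, bool efer_lma)
{
	return !(old_cr0 & X86_CR0_PG) && (new_cr0 & X86_CR0_PG)
	       && cr4_pae && !efer_lma;
}
```

The two predicates differ only when CR0.PG goes from 1 to 0. In that case
paging is off afterwards, so the PDPTRs are not consulted, which matches the
reply that they are meaningless when cr0.pg == 0.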
* Re: [PATCH 1/2] KVM: Make paravirt tlb flush also reload PAE PDPTEs
2009-05-25 11:59 ` Marcelo Tosatti
@ 2009-05-25 15:09 ` Avi Kivity
0 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2009-05-25 15:09 UTC (permalink / raw)
To: Marcelo Tosatti; +Cc: kvm
Marcelo Tosatti wrote:
> On Mon, May 25, 2009 at 11:47:23AM +0300, Avi Kivity wrote:
>
>> The paravirt tlb flush may be used not only to flush TLBs, but also
>> to reload the four page-directory-pointer-table entries, as it is used
>> as a replacement for reloading CR3. Change the code to do the entire
>> CR3 reloading dance instead of simply flushing the TLB.
>>
>
> Ugh, my bad.
>
Not really -- the pvop is misleadingly named.
--
error compiling committee.c: too many arguments to function