public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
* KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs
@ 2009-04-15 22:10 Marcelo Tosatti
  2009-04-16  8:56 ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2009-04-15 22:10 UTC (permalink / raw)
  To: kvm; +Cc: Avi Kivity


Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
checking for the new cr3 value:

"Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
when userspace next tries to call KVM_RUN on the affected VCPU and kvm
attempts to activate the new non-existent page table root.

This happens since kvm only validates that cr3 points to a valid guest
physical memory page when code *inside* the guest sets cr3. However, kvm
currently trusts the userspace caller (e.g. QEMU) on the host machine to
always supply a valid page table root, rather than properly validating
it along with the rest of the reloaded guest state."

http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599

Follow Avi's suggestion to use kvm_set_cr3, and do the same for
assigment of cr4. Note kvm_set_cr4 unconditionally resets the mmu
context, as long as cr4 is valid.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>


diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 148cde2..89fb3c7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 	kvm_x86_ops->set_gdt(vcpu, &dt);
 
 	vcpu->arch.cr2 = sregs->cr2;
-	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
-	vcpu->arch.cr3 = sregs->cr3;
 
+	kvm_set_cr3(vcpu, sregs->cr3);
 	kvm_set_cr8(vcpu, sregs->cr8);
 
 	mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer;
 	kvm_x86_ops->set_efer(vcpu, sregs->efer);
 	kvm_set_apic_base(vcpu, sregs->apic_base);
 
-	kvm_x86_ops->decache_cr4_guest_bits(vcpu);
-
 	mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0;
 	kvm_x86_ops->set_cr0(vcpu, sregs->cr0);
 	vcpu->arch.cr0 = sregs->cr0;
 
-	mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4;
-	kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
-	if (!is_long_mode(vcpu) && is_pae(vcpu))
-		load_pdptrs(vcpu, vcpu->arch.cr3);
+	kvm_set_cr4(vcpu, sregs->cr4);
 
 	if (mmu_reset_needed)
 		kvm_mmu_reset_context(vcpu);

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs
  2009-04-15 22:10 KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs Marcelo Tosatti
@ 2009-04-16  8:56 ` Avi Kivity
  2009-04-16  9:10   ` Marcelo Tosatti
  0 siblings, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2009-04-16  8:56 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm

Marcelo Tosatti wrote:
> Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
> checking for the new cr3 value:
>
> "Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
> the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
> when userspace next tries to call KVM_RUN on the affected VCPU and kvm
> attempts to activate the new non-existent page table root.
>
> This happens since kvm only validates that cr3 points to a valid guest
> physical memory page when code *inside* the guest sets cr3. However, kvm
> currently trusts the userspace caller (e.g. QEMU) on the host machine to
> always supply a valid page table root, rather than properly validating
> it along with the rest of the reloaded guest state."
>
> http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599
>
> Follow Avi's suggestion to use kvm_set_cr3, and do the same for
> assigment of cr4. Note kvm_set_cr4 unconditionally resets the mmu
> context, as long as cr4 is valid.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 148cde2..89fb3c7 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>  	kvm_x86_ops->set_gdt(vcpu, &dt);
>  
>  	vcpu->arch.cr2 = sregs->cr2;
> -	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
> -	vcpu->arch.cr3 = sregs->cr3;
>  
> +	kvm_set_cr3(vcpu, sregs->cr3);
>  	kvm_set_cr8(vcpu, sregs->cr8);
>  
>  	mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer;
>  	kvm_x86_ops->set_efer(vcpu, sregs->efer);
>  	kvm_set_apic_base(vcpu, sregs->apic_base);
>  
> -	kvm_x86_ops->decache_cr4_guest_bits(vcpu);
> -
>  	mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0;
>  	kvm_x86_ops->set_cr0(vcpu, sregs->cr0);
>  	vcpu->arch.cr0 = sregs->cr0;
>  
> -	mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4;
> -	kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
> -	if (!is_long_mode(vcpu) && is_pae(vcpu))
> -		load_pdptrs(vcpu, vcpu->arch.cr3);
> +	kvm_set_cr4(vcpu, sregs->cr4);
>  
>  	if (mmu_reset_needed)
>  		kvm_mmu_reset_context(vcpu);
>   

Consider the following:

current state:
  cr3 = 0
  cr4.pae = 0

new state:
  cr3 = 0x800
  cr4.pae = 1

When you call kvm_set_cr3(), it will inject a #GP into the guest because 
we are setting bit 11 when cr4.pae=0, which is illegal.  However the new 
cr4.pae=1, so the new state was in fact legal!

There are a few ways out, one is to first go back to real mode and set 
eveything up carefully in the right order (including EFER.LMA and 
EFER.LME, and CS.L).  The other is to refactor kvm_set_* so that we have 
internal setters which won't trigger these faults (but do need to check 
at the end that the state is legal).

This first method is probably better since that's what the guest does 
when booting anyway.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs
  2009-04-16  8:56 ` Avi Kivity
@ 2009-04-16  9:10   ` Marcelo Tosatti
  2009-04-16  9:23     ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2009-04-16  9:10 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

On Thu, Apr 16, 2009 at 11:56:15AM +0300, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
>> checking for the new cr3 value:
>>
>> "Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
>> the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
>> when userspace next tries to call KVM_RUN on the affected VCPU and kvm
>> attempts to activate the new non-existent page table root.
>>
>> This happens since kvm only validates that cr3 points to a valid guest
>> physical memory page when code *inside* the guest sets cr3. However, kvm
>> currently trusts the userspace caller (e.g. QEMU) on the host machine to
>> always supply a valid page table root, rather than properly validating
>> it along with the rest of the reloaded guest state."
>>
>> http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599
>>
>> Follow Avi's suggestion to use kvm_set_cr3, and do the same for
>> assigment of cr4. Note kvm_set_cr4 unconditionally resets the mmu
>> context, as long as cr4 is valid.
>>
>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>>
>>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 148cde2..89fb3c7 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>>  	kvm_x86_ops->set_gdt(vcpu, &dt);
>>   	vcpu->arch.cr2 = sregs->cr2;
>> -	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
>> -	vcpu->arch.cr3 = sregs->cr3;
>>  +	kvm_set_cr3(vcpu, sregs->cr3);
>>  	kvm_set_cr8(vcpu, sregs->cr8);
>>   	mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer;
>>  	kvm_x86_ops->set_efer(vcpu, sregs->efer);
>>  	kvm_set_apic_base(vcpu, sregs->apic_base);
>>  -	kvm_x86_ops->decache_cr4_guest_bits(vcpu);
>> -
>>  	mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0;
>>  	kvm_x86_ops->set_cr0(vcpu, sregs->cr0);
>>  	vcpu->arch.cr0 = sregs->cr0;
>>  -	mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4;
>> -	kvm_x86_ops->set_cr4(vcpu, sregs->cr4);
>> -	if (!is_long_mode(vcpu) && is_pae(vcpu))
>> -		load_pdptrs(vcpu, vcpu->arch.cr3);
>> +	kvm_set_cr4(vcpu, sregs->cr4);
>>   	if (mmu_reset_needed)
>>  		kvm_mmu_reset_context(vcpu);
>>   
>
> Consider the following:
>
> current state:
>  cr3 = 0
>  cr4.pae = 0
>
> new state:
>  cr3 = 0x800
>  cr4.pae = 1
>
> When you call kvm_set_cr3(), it will inject a #GP into the guest because  
> we are setting bit 11 when cr4.pae=0, which is illegal.  However the new  
> cr4.pae=1, so the new state was in fact legal!
>
> There are a few ways out, one is to first go back to real mode and set  
> eveything up carefully in the right order (including EFER.LMA and  
> EFER.LME, and CS.L).  The other is to refactor kvm_set_* so that we have  
> internal setters which won't trigger these faults (but do need to check  
> at the end that the state is legal).
>
> This first method is probably better since that's what the guest does  
> when booting anyway.

Humpf. And something like this? Or GP# instead of triple fault?

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 148cde2..3e63bac 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3986,7 +3986,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 
 	vcpu->arch.cr2 = sregs->cr2;
 	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
-	vcpu->arch.cr3 = sregs->cr3;
+	if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT))
+		vcpu->arch.cr3 = sregs->cr3;
+	else
+		set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
 
 	kvm_set_cr8(vcpu, sregs->cr8);
 

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs
  2009-04-16  9:10   ` Marcelo Tosatti
@ 2009-04-16  9:23     ` Avi Kivity
  2009-04-16 11:30       ` KVM: x86: check for cr3 validity " Marcelo Tosatti
  0 siblings, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2009-04-16  9:23 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm

Marcelo Tosatti wrote:
> Humpf. And something like this? Or GP# instead of triple fault?
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 148cde2..3e63bac 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3986,7 +3986,10 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
>  
>  	vcpu->arch.cr2 = sregs->cr2;
>  	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
> -	vcpu->arch.cr3 = sregs->cr3;
> +	if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT))
> +		vcpu->arch.cr3 = sregs->cr3;
> +	else
> +		set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
>  
>  	kvm_set_cr8(vcpu, sregs->cr8); 
>   

Well, that plugs the hole.  Triple fault is better than #GP IMO.

We're still missing checks on reserved bits, etc., but that can come later.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* KVM: x86: check for cr3 validity in ioctl_set_sregs
  2009-04-16  9:23     ` Avi Kivity
@ 2009-04-16 11:30       ` Marcelo Tosatti
  2009-04-16 12:42         ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Marcelo Tosatti @ 2009-04-16 11:30 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm


Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
checking for the new cr3 value:
      
"Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
when userspace next tries to call KVM_RUN on the affected VCPU and kvm
attempts to activate the new non-existent page table root.

This happens since kvm only validates that cr3 points to a valid guest
physical memory page when code *inside* the guest sets cr3. However, kvm
currently trusts the userspace caller (e.g. QEMU) on the host machine to
always supply a valid page table root, rather than properly validating
it along with the rest of the reloaded guest state."

http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599

Check for a valid cr3 address in kvm_arch_vcpu_ioctl_set_sregs, triple
fault in case of failure.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Index: kvm/arch/x86/kvm/x86.c
===================================================================
--- kvm.orig/arch/x86/kvm/x86.c
+++ kvm/arch/x86/kvm/x86.c
@@ -3986,7 +3986,13 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct
 
 	vcpu->arch.cr2 = sregs->cr2;
 	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
-	vcpu->arch.cr3 = sregs->cr3;
+
+	down_read(&vcpu->kvm->slots_lock);
+	if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT))
+		vcpu->arch.cr3 = sregs->cr3;
+	else
+		set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
+	up_read(&vcpu->kvm->slots_lock);
 
 	kvm_set_cr8(vcpu, sregs->cr8);
 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: KVM: x86: check for cr3 validity in ioctl_set_sregs
  2009-04-16 11:30       ` KVM: x86: check for cr3 validity " Marcelo Tosatti
@ 2009-04-16 12:42         ` Avi Kivity
  2009-04-16 12:49           ` Avi Kivity
  0 siblings, 1 reply; 7+ messages in thread
From: Avi Kivity @ 2009-04-16 12:42 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm

Marcelo Tosatti wrote:
> Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
> checking for the new cr3 value:
>       
> "Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to
> the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
> when userspace next tries to call KVM_RUN on the affected VCPU and kvm
> attempts to activate the new non-existent page table root.
>
> This happens since kvm only validates that cr3 points to a valid guest
> physical memory page when code *inside* the guest sets cr3. However, kvm
> currently trusts the userspace caller (e.g. QEMU) on the host machine to
> always supply a valid page table root, rather than properly validating
> it along with the rest of the reloaded guest state."
>
> http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599
>
> Check for a valid cr3 address in kvm_arch_vcpu_ioctl_set_sregs, triple
> fault in case of failure.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> Index: kvm/arch/x86/kvm/x86.c
> ===================================================================
> --- kvm.orig/arch/x86/kvm/x86.c
> +++ kvm/arch/x86/kvm/x86.c
> @@ -3986,7 +3986,13 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct
>  
>  	vcpu->arch.cr2 = sregs->cr2;
>  	mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
> -	vcpu->arch.cr3 = sregs->cr3;
> +
> +	down_read(&vcpu->kvm->slots_lock);
> +	if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT))
> +		vcpu->arch.cr3 = sregs->cr3;
> +	else
> +		set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
> +	up_read(&vcpu->kvm->slots_lock);
>  
>  	kvm_set_cr8(vcpu, sregs->cr8);
>  
>   

Isn't this self-defeating?  If you drop slots_lock, cr3 may be invalid 
again by the time you set cvpu->arch.cr3?


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: KVM: x86: check for cr3 validity in ioctl_set_sregs
  2009-04-16 12:42         ` Avi Kivity
@ 2009-04-16 12:49           ` Avi Kivity
  0 siblings, 0 replies; 7+ messages in thread
From: Avi Kivity @ 2009-04-16 12:49 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm

Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity
>> checking for the new cr3 value:
>>       "Userspace callers of KVM_SET_SREGS can pass a bogus value of 
>> cr3 to
>> the kernel. This will trigger a NULL pointer access in gfn_to_rmap()
>> when userspace next tries to call KVM_RUN on the affected VCPU and kvm
>> attempts to activate the new non-existent page table root.
>>
>> This happens since kvm only validates that cr3 points to a valid guest
>> physical memory page when code *inside* the guest sets cr3. However, kvm
>> currently trusts the userspace caller (e.g. QEMU) on the host machine to
>> always supply a valid page table root, rather than properly validating
>> it along with the rest of the reloaded guest state."
>>
>> http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599 
>>
>>
>> Check for a valid cr3 address in kvm_arch_vcpu_ioctl_set_sregs, triple
>> fault in case of failure.
>>
>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>>
>> Index: kvm/arch/x86/kvm/x86.c
>> ===================================================================
>> --- kvm.orig/arch/x86/kvm/x86.c
>> +++ kvm/arch/x86/kvm/x86.c
>> @@ -3986,7 +3986,13 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct
>>  
>>      vcpu->arch.cr2 = sregs->cr2;
>>      mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3;
>> -    vcpu->arch.cr3 = sregs->cr3;
>> +
>> +    down_read(&vcpu->kvm->slots_lock);
>> +    if (gfn_to_memslot(vcpu->kvm, sregs->cr3 >> PAGE_SHIFT))
>> +        vcpu->arch.cr3 = sregs->cr3;
>> +    else
>> +        set_bit(KVM_REQ_TRIPLE_FAULT, &vcpu->requests);
>> +    up_read(&vcpu->kvm->slots_lock);
>>  
>>      kvm_set_cr8(vcpu, sregs->cr8);
>>  
>>   
>
> Isn't this self-defeating?  If you drop slots_lock, cr3 may be invalid 
> again by the time you set cvpu->arch.cr3?
>

Uh, sorry, of course not.  I misread down as up.  Bad day for me.  Will 
apply the patch.

Still, don't we have a problem if userspace drops the memory slot where 
cr3 points to?

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-04-16 12:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-04-15 22:10 KVM: x86: use kvm_set_cr3/cr4 in ioctl_set_sregs Marcelo Tosatti
2009-04-16  8:56 ` Avi Kivity
2009-04-16  9:10   ` Marcelo Tosatti
2009-04-16  9:23     ` Avi Kivity
2009-04-16 11:30       ` KVM: x86: check for cr3 validity " Marcelo Tosatti
2009-04-16 12:42         ` Avi Kivity
2009-04-16 12:49           ` Avi Kivity

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox