* [PATCH v2 0/3] Fix task switches into/out of VM86
@ 2012-01-27 19:23 Kevin Wolf
2012-01-27 19:23 ` [PATCH v2 1/3] KVM: x86 emulator: Fix task switch privilege checks Kevin Wolf
` (3 more replies)
0 siblings, 4 replies; 36+ messages in thread
From: Kevin Wolf @ 2012-01-27 19:23 UTC (permalink / raw)
To: kvm; +Cc: kwolf, gleb, joerg.roedel, yoshikawa.takuya, avi, mtosatti
I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of
you test this with SVM? I did some testing on my buggy processor and it looks
as good as it gets, but it would be better if you could confirm.
Kevin Wolf (3):
KVM: x86 emulator: Fix task switch privilege checks
KVM: x86 emulator: VM86 segments must have DPL 3
KVM: x86 emulator: Allow PM/VM86 switch during task switch
arch/x86/include/asm/kvm_emulate.h | 3 +-
arch/x86/include/asm/kvm_host.h | 4 +-
arch/x86/kvm/emulate.c | 79 ++++++++++++++++++++++++++++++++---
arch/x86/kvm/svm.c | 5 ++-
arch/x86/kvm/vmx.c | 8 ++-
arch/x86/kvm/x86.c | 12 ++++-
6 files changed, 94 insertions(+), 17 deletions(-)
--
1.7.6.5
^ permalink raw reply [flat|nested] 36+ messages in thread* [PATCH v2 1/3] KVM: x86 emulator: Fix task switch privilege checks 2012-01-27 19:23 [PATCH v2 0/3] Fix task switches into/out of VM86 Kevin Wolf @ 2012-01-27 19:23 ` Kevin Wolf 2012-01-30 10:39 ` Avi Kivity 2012-01-27 19:23 ` [PATCH v2 2/3] KVM: x86 emulator: VM86 segments must have DPL 3 Kevin Wolf ` (2 subsequent siblings) 3 siblings, 1 reply; 36+ messages in thread From: Kevin Wolf @ 2012-01-27 19:23 UTC (permalink / raw) To: kvm; +Cc: kwolf, gleb, joerg.roedel, yoshikawa.takuya, avi, mtosatti Currently, all task switches check privileges against the DPL of the TSS. This is only correct for jmp/call to a TSS. If a task gate is used, the DPL of this take gate is used for the check instead. Exceptions, external interrupts and iret shouldn't perform any check. Signed-off-by: Kevin Wolf <kwolf@redhat.com> --- arch/x86/include/asm/kvm_emulate.h | 2 +- arch/x86/include/asm/kvm_host.h | 4 +- arch/x86/kvm/emulate.c | 51 +++++++++++++++++++++++++++++++----- arch/x86/kvm/svm.c | 5 +++- arch/x86/kvm/vmx.c | 8 +++-- arch/x86/kvm/x86.c | 6 ++-- 6 files changed, 59 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index ab4092e..c8a9cf3 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -372,7 +372,7 @@ bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt); #define EMULATION_INTERCEPTED 2 int x86_emulate_insn(struct x86_emulate_ctxt *ctxt); int emulator_task_switch(struct x86_emulate_ctxt *ctxt, - u16 tss_selector, int reason, + u16 tss_selector, int idt_index, int reason, bool has_error_code, u32 error_code); int emulate_int_real(struct x86_emulate_ctxt *ctxt, int irq); #endif /* _ASM_X86_KVM_X86_EMULATE_H */ diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 52d6640..0533fc4 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -741,8 +741,8 @@ int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu); void kvm_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, int seg); -int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason, - bool has_error_code, u32 error_code); +int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index, + int reason, bool has_error_code, u32 error_code); int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3); diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 05a562b..1b98a2c 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -1151,6 +1151,22 @@ static int pio_in_emulated(struct x86_emulate_ctxt *ctxt, return 1; } +static int read_interrupt_descriptor(struct x86_emulate_ctxt *ctxt, + u16 index, struct kvm_desc_struct *desc) +{ + struct kvm_desc_ptr dt; + ulong addr; + + ctxt->ops->get_idt(ctxt, &dt); + + if (dt.size < index * 8 + 7) + return emulate_gp(ctxt, index << 3 | 0x2); + + addr = dt.address + index * 8; + return ctxt->ops->read_std(ctxt, addr, desc, sizeof *desc, + &ctxt->exception); +} + static void get_descriptor_table_ptr(struct x86_emulate_ctxt *ctxt, u16 selector, struct desc_ptr *dt) { @@ -2350,7 +2366,7 @@ static int task_switch_32(struct x86_emulate_ctxt *ctxt, } static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, - u16 tss_selector, int reason, + u16 tss_selector, int idt_index, int reason, bool 
has_error_code, u32 error_code) { struct x86_emulate_ops *ops = ctxt->ops; @@ -2360,6 +2376,7 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, ulong old_tss_base = ops->get_cached_segment_base(ctxt, VCPU_SREG_TR); u32 desc_limit; + int dpl; /* FIXME: old_tss_base == ~0 ? */ @@ -2372,12 +2389,32 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, /* FIXME: check that next_tss_desc is tss */ - if (reason != TASK_SWITCH_IRET) { - if ((tss_selector & 3) > next_tss_desc.dpl || - ops->cpl(ctxt) > next_tss_desc.dpl) - return emulate_gp(ctxt, 0); + /* + * Check privileges. The three cases are task switch caused by... + * + * 1. Software interrupt: Check against DPL of the task gate + * 2. Exception/IRQ/iret: No check is performed + * 3. jmp/call: Check agains DPL of the TSS + */ + dpl = -1; + if (reason == TASK_SWITCH_GATE) { + if (idt_index != -1) { + struct kvm_desc_struct task_gate_desc; + + ret = read_interrupt_descriptor(ctxt, idt_index, + &task_gate_desc); + if (ret != X86EMUL_CONTINUE) + return ret; + + dpl = task_gate_desc.dpl; + } + } else if (reason != TASK_SWITCH_IRET) { + dpl = next_tss_desc.dpl; } + if (dpl != -1 && ((tss_selector & 3) > dpl || ops->cpl(ctxt) > dpl)) + return emulate_gp(ctxt, 0); + desc_limit = desc_limit_scaled(&next_tss_desc); if (!next_tss_desc.p || ((desc_limit < 0x67 && (next_tss_desc.type & 8)) || @@ -2430,7 +2467,7 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, } int emulator_task_switch(struct x86_emulate_ctxt *ctxt, - u16 tss_selector, int reason, + u16 tss_selector, int idt_index, int reason, bool has_error_code, u32 error_code) { int rc; @@ -2438,7 +2475,7 @@ int emulator_task_switch(struct x86_emulate_ctxt *ctxt, ctxt->_eip = ctxt->eip; ctxt->dst.type = OP_NONE; - rc = emulator_do_task_switch(ctxt, tss_selector, reason, + rc = emulator_do_task_switch(ctxt, tss_selector, idt_index, reason, has_error_code, error_code); if (rc == X86EMUL_CONTINUE) diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c index 5fa553b..6a977c1 100644 --- a/arch/x86/kvm/svm.c +++ b/arch/x86/kvm/svm.c @@ -2730,7 +2730,10 @@ static int task_switch_interception(struct vcpu_svm *svm) (int_vec == OF_VECTOR || int_vec == BP_VECTOR))) skip_emulated_instruction(&svm->vcpu); - if (kvm_task_switch(&svm->vcpu, tss_selector, reason, + if (int_type != SVM_EXITINTINFO_TYPE_SOFT) + int_vec = -1; + + if (kvm_task_switch(&svm->vcpu, tss_selector, int_vec, reason, has_error_code, error_code) == EMULATE_FAIL) { svm->vcpu.run->exit_reason = KVM_EXIT_INTERNAL_ERROR; svm->vcpu.run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION; diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 906a7e8..a335170 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -4672,9 +4672,10 @@ static int handle_task_switch(struct kvm_vcpu *vcpu) bool has_error_code = false; u32 error_code = 0; u16 tss_selector; - int reason, type, idt_v; + int reason, type, idt_v, idt_index; idt_v = (vmx->idt_vectoring_info & VECTORING_INFO_VALID_MASK); + idt_index = (vmx->idt_vectoring_info & VECTORING_INFO_VECTOR_MASK); type = (vmx->idt_vectoring_info & VECTORING_INFO_TYPE_MASK); exit_qualification = vmcs_readl(EXIT_QUALIFICATION); @@ -4712,8 +4713,9 @@ static int handle_task_switch(struct kvm_vcpu *vcpu) type != INTR_TYPE_NMI_INTR)) skip_emulated_instruction(vcpu); - if (kvm_task_switch(vcpu, tss_selector, reason, - has_error_code, error_code) == EMULATE_FAIL) { + if (kvm_task_switch(vcpu, tss_selector, + type == INTR_TYPE_SOFT_INTR ? 
idt_index : -1, reason, + has_error_code, error_code) == EMULATE_FAIL) { vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR; vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION; vcpu->run->internal.ndata = 0; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 1171def..dc3e945 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -5541,15 +5541,15 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, return 0; } -int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason, - bool has_error_code, u32 error_code) +int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index, + int reason, bool has_error_code, u32 error_code) { struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt; int ret; init_emulate_ctxt(vcpu); - ret = emulator_task_switch(ctxt, tss_selector, reason, + ret = emulator_task_switch(ctxt, tss_selector, idt_index, reason, has_error_code, error_code); if (ret) -- 1.7.6.5 ^ permalink raw reply related [flat|nested] 36+ messages in thread
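For readers following patch 1/3: the new check boils down to three cases (task gate DPL for a software interrupt, no check for exceptions/external interrupts/iret, TSS DPL for jmp/call). A minimal standalone sketch of that decision in plain C, with illustrative names rather than KVM's types:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative reasons; TASK_SWITCH_GATE/IRET in KVM play the same role. */
enum switch_reason { REASON_GATE, REASON_IRET, REASON_JMP_CALL };

/*
 * Returns true if the task switch passes the privilege check:
 *  - software interrupt through a task gate: RPL and CPL are checked
 *    against the gate's DPL (has_gate == true),
 *  - exception/external interrupt/iret: no check at all,
 *  - jmp/call to a TSS: RPL and CPL are checked against the TSS DPL.
 */
static bool task_switch_allowed(enum switch_reason reason, bool has_gate,
                                int gate_dpl, int tss_dpl,
                                int selector_rpl, int cpl)
{
    int dpl = -1;                       /* -1 means "no check" */

    if (reason == REASON_GATE) {
        if (has_gate)
            dpl = gate_dpl;             /* software INT n */
    } else if (reason != REASON_IRET) {
        dpl = tss_dpl;                  /* jmp/call */
    }

    return dpl == -1 || (selector_rpl <= dpl && cpl <= dpl);
}

int main(void)
{
    /* INT n from CPL 3 through a DPL-3 task gate: allowed (prints 1). */
    printf("%d\n", task_switch_allowed(REASON_GATE, true, 3, 0, 0, 3));
    /* call to a DPL-0 TSS from CPL 3: would be #GP (prints 0). */
    printf("%d\n", task_switch_allowed(REASON_JMP_CALL, false, 0, 0, 0, 3));
    return 0;
}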
* Re: [PATCH v2 1/3] KVM: x86 emulator: Fix task switch privilege checks 2012-01-27 19:23 ` [PATCH v2 1/3] KVM: x86 emulator: Fix task switch privilege checks Kevin Wolf @ 2012-01-30 10:39 ` Avi Kivity 2012-01-30 11:12 ` Kevin Wolf 0 siblings, 1 reply; 36+ messages in thread From: Avi Kivity @ 2012-01-30 10:39 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti On 01/27/2012 09:23 PM, Kevin Wolf wrote: > Currently, all task switches check privileges against the DPL of the > TSS. This is only correct for jmp/call to a TSS. If a task gate is used, > the DPL of this take gate is used for the check instead. Exceptions, > external interrupts and iret shouldn't perform any check. > > > @@ -2372,12 +2389,32 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, > > /* FIXME: check that next_tss_desc is tss */ > > - if (reason != TASK_SWITCH_IRET) { > - if ((tss_selector & 3) > next_tss_desc.dpl || > - ops->cpl(ctxt) > next_tss_desc.dpl) > - return emulate_gp(ctxt, 0); > + /* > + * Check privileges. The three cases are task switch caused by... > + * > + * 1. Software interrupt: Check against DPL of the task gate > + * 2. Exception/IRQ/iret: No check is performed > + * 3. jmp/call: Check agains DPL of the TSS > + */ > + dpl = -1; > + if (reason == TASK_SWITCH_GATE) { > + if (idt_index != -1) { > + struct kvm_desc_struct task_gate_desc; > + > + ret = read_interrupt_descriptor(ctxt, idt_index, > + &task_gate_desc); > + if (ret != X86EMUL_CONTINUE) > + return ret; > + > + dpl = task_gate_desc.dpl; > + } > + } else if (reason != TASK_SWITCH_IRET) { > + dpl = next_tss_desc.dpl; > } > > + if (dpl != -1 && ((tss_selector & 3) > dpl || ops->cpl(ctxt) > dpl)) > + return emulate_gp(ctxt, 0); Should be #GP(selector), no? The TASK-GATE: branch of the CALL instruction documentation lists many other conditions which we don't check, so I'll accept this, otherwise we'll never make any progress. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 1/3] KVM: x86 emulator: Fix task switch privilege checks 2012-01-30 10:39 ` Avi Kivity @ 2012-01-30 11:12 ` Kevin Wolf 0 siblings, 0 replies; 36+ messages in thread From: Kevin Wolf @ 2012-01-30 11:12 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti Am 30.01.2012 11:39, schrieb Avi Kivity: > On 01/27/2012 09:23 PM, Kevin Wolf wrote: >> Currently, all task switches check privileges against the DPL of the >> TSS. This is only correct for jmp/call to a TSS. If a task gate is used, >> the DPL of this take gate is used for the check instead. Exceptions, >> external interrupts and iret shouldn't perform any check. >> >> >> @@ -2372,12 +2389,32 @@ static int emulator_do_task_switch(struct x86_emulate_ctxt *ctxt, >> >> /* FIXME: check that next_tss_desc is tss */ >> >> - if (reason != TASK_SWITCH_IRET) { >> - if ((tss_selector & 3) > next_tss_desc.dpl || >> - ops->cpl(ctxt) > next_tss_desc.dpl) >> - return emulate_gp(ctxt, 0); >> + /* >> + * Check privileges. The three cases are task switch caused by... >> + * >> + * 1. Software interrupt: Check against DPL of the task gate >> + * 2. Exception/IRQ/iret: No check is performed >> + * 3. jmp/call: Check agains DPL of the TSS >> + */ >> + dpl = -1; >> + if (reason == TASK_SWITCH_GATE) { >> + if (idt_index != -1) { >> + struct kvm_desc_struct task_gate_desc; >> + >> + ret = read_interrupt_descriptor(ctxt, idt_index, >> + &task_gate_desc); >> + if (ret != X86EMUL_CONTINUE) >> + return ret; >> + >> + dpl = task_gate_desc.dpl; >> + } >> + } else if (reason != TASK_SWITCH_IRET) { >> + dpl = next_tss_desc.dpl; >> } >> >> + if (dpl != -1 && ((tss_selector & 3) > dpl || ops->cpl(ctxt) > dpl)) >> + return emulate_gp(ctxt, 0); > > Should be #GP(selector), no? This line is just moved code, but without looking it up I would expect the selector, yes. Would be an independent bug. I can add a patch to the series if you like. > The TASK-GATE: branch of the CALL instruction documentation lists many > other conditions which we don't check, so I'll accept this, otherwise > we'll never make any progress. Hm, I didn't really look at call, just iret and exceptions. Seems there are many test cases left to write. Kevin ^ permalink raw reply [flat|nested] 36+ messages in thread
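For reference on the #GP(selector) question above, a small sketch of how a selector-style error code is laid out (standalone C; the bit definitions follow the SDM's EXT/IDT/TI layout, the helper names are made up for illustration):

#include <stdint.h>
#include <stdio.h>

#define ERR_EXT 0x1   /* fault triggered by an external event */
#define ERR_IDT 0x2   /* index part refers to the IDT */
#define ERR_TI  0x4   /* index part refers to the LDT (only if IDT bit clear) */

/* #GP(selector): keep the selector's index and TI bits, drop the RPL. */
static uint16_t gp_error_for_selector(uint16_t selector)
{
    return selector & ~0x3;
}

/* #GP referencing an IDT vector, as read_interrupt_descriptor() above does. */
static uint16_t gp_error_for_idt_vector(uint8_t vector)
{
    return (uint16_t)(vector << 3) | ERR_IDT;
}

int main(void)
{
    printf("0x%x\n", gp_error_for_selector(0x2b));    /* 0x28 */
    printf("0x%x\n", gp_error_for_idt_vector(0x20));  /* 0x102 */
    return 0;
}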
* [PATCH v2 2/3] KVM: x86 emulator: VM86 segments must have DPL 3
2012-01-27 19:23 [PATCH v2 0/3] Fix task switches into/out of VM86 Kevin Wolf
2012-01-27 19:23 ` [PATCH v2 1/3] KVM: x86 emulator: Fix task switch privilege checks Kevin Wolf
@ 2012-01-27 19:23 ` Kevin Wolf
2012-01-27 19:23 ` [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch Kevin Wolf
2012-01-27 19:52 ` [PATCH v2 0/3] Fix task switches into/out of VM86 Gleb Natapov
3 siblings, 0 replies; 36+ messages in thread
From: Kevin Wolf @ 2012-01-27 19:23 UTC (permalink / raw)
To: kvm; +Cc: kwolf, gleb, joerg.roedel, yoshikawa.takuya, avi, mtosatti
Setting the segment DPL to 0 for at least the VM86 code segment makes
the VM entry fail on VMX.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 arch/x86/kvm/emulate.c | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 1b98a2c..833969e 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1243,6 +1243,8 @@ static int load_segment_descriptor(struct x86_emulate_ctxt *ctxt,
 		seg_desc.type = 3;
 		seg_desc.p = 1;
 		seg_desc.s = 1;
+		if (ctxt->mode == X86EMUL_MODE_VM86)
+			seg_desc.dpl = 3;
 		goto load;
 	}
--
1.7.6.5
^ permalink raw reply related [flat|nested] 36+ messages in thread
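For context on why the entry fails: with RFLAGS.VM set, VM entry checks the guest segment registers against fixed virtual-8086 values (base = selector << 4, limit 0xFFFF, access rights F3h, i.e. P=1, S=1, type 3, DPL 3), as far as I read the SDM. A rough standalone model of that check, using an illustrative struct rather than KVM's kvm_segment:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative cached segment state, not KVM's struct kvm_segment. */
struct seg {
    uint16_t selector;
    uint32_t base;
    uint32_t limit;
    uint8_t  type;      /* low four access-rights bits */
    uint8_t  s, dpl, p;
};

/*
 * Rough model of the VM-entry consistency check for one segment while
 * RFLAGS.VM = 1: base must be selector << 4, limit 0xFFFF, and the
 * access rights must amount to F3h (P=1, S=1, DPL=3, type=3). A stale
 * DPL of 0 left behind by the emulator fails this and the entry is
 * rejected, which is the failure the patch addresses.
 */
static bool vm86_segment_ok(const struct seg *s)
{
    return s->base == (uint32_t)s->selector << 4 &&
           s->limit == 0xFFFF &&
           s->p == 1 && s->s == 1 && s->dpl == 3 && s->type == 3;
}

int main(void)
{
    struct seg cs = { .selector = 0x1000, .base = 0x10000, .limit = 0xFFFF,
                      .type = 3, .s = 1, .dpl = 0 /* stale */, .p = 1 };

    printf("%d\n", vm86_segment_ok(&cs));   /* 0: DPL 0 fails the check */
    cs.dpl = 3;
    printf("%d\n", vm86_segment_ok(&cs));   /* 1 */
    return 0;
}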
* [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-27 19:23 [PATCH v2 0/3] Fix task switches into/out of VM86 Kevin Wolf 2012-01-27 19:23 ` [PATCH v2 1/3] KVM: x86 emulator: Fix task switch privilege checks Kevin Wolf 2012-01-27 19:23 ` [PATCH v2 2/3] KVM: x86 emulator: VM86 segments must have DPL 3 Kevin Wolf @ 2012-01-27 19:23 ` Kevin Wolf 2012-01-30 10:24 ` Avi Kivity 2012-01-27 19:52 ` [PATCH v2 0/3] Fix task switches into/out of VM86 Gleb Natapov 3 siblings, 1 reply; 36+ messages in thread From: Kevin Wolf @ 2012-01-27 19:23 UTC (permalink / raw) To: kvm; +Cc: kwolf, gleb, joerg.roedel, yoshikawa.takuya, avi, mtosatti Task switches can switch between Protected Mode and VM86. The current mode must be updated during the task switch emulation so that the new segment selectors are interpreted correctly and privilege checks succeed. VMX code calculates the CPL from the code segment selector and rflags, so it needs rflags to be updated in the vcpu struct. SVM stores the DPL of the code segment instead, so we must be sure to give the right one when updating the selector. Signed-off-by: Kevin Wolf <kwolf@redhat.com> --- arch/x86/include/asm/kvm_emulate.h | 1 + arch/x86/kvm/emulate.c | 26 ++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 6 ++++++ 3 files changed, 33 insertions(+), 0 deletions(-) diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index c8a9cf3..4a21c7d 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -176,6 +176,7 @@ struct x86_emulate_ops { void (*set_idt)(struct x86_emulate_ctxt *ctxt, struct desc_ptr *dt); ulong (*get_cr)(struct x86_emulate_ctxt *ctxt, int cr); int (*set_cr)(struct x86_emulate_ctxt *ctxt, int cr, ulong val); + void (*set_rflags)(struct x86_emulate_ctxt *ctxt, ulong val); int (*cpl)(struct x86_emulate_ctxt *ctxt); int (*get_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong *dest); int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value); diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index 833969e..143ce8e 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -552,6 +552,14 @@ static void set_segment_selector(struct x86_emulate_ctxt *ctxt, u16 selector, struct desc_struct desc; ctxt->ops->get_segment(ctxt, &dummy, &desc, &base3, seg); + + if (ctxt->mode == X86EMUL_MODE_REAL) + desc.dpl = 0; + else if (ctxt->mode == X86EMUL_MODE_VM86) + desc.dpl = 3; + else + desc.dpl = selector & 0x3; + ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); } @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, return emulate_gp(ctxt, 0); ctxt->_eip = tss->eip; ctxt->eflags = tss->eflags | 2; + + /* + * If we're switching between Protected Mode and VM86, we need to make + * sure to update the mode before loading the segment descriptors so + * that the selectors are interpreted correctly. + * + * Need to get it to the vcpu struct immediately because it influences + * the CPL which is checked at least when loading the segment + * descriptors and when pushing an error code to the new kernel stack. 
+ */ + if (ctxt->eflags & X86_EFLAGS_VM) + ctxt->mode = X86EMUL_MODE_VM86; + else + ctxt->mode = X86EMUL_MODE_PROT32; + + ctxt->ops->set_rflags(ctxt, ctxt->eflags); + + /* General purpose registers */ ctxt->regs[VCPU_REGS_RAX] = tss->eax; ctxt->regs[VCPU_REGS_RCX] = tss->ecx; ctxt->regs[VCPU_REGS_RDX] = tss->edx; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index dc3e945..502b5c3 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4040,6 +4040,11 @@ static int emulator_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val) return res; } +static void emulator_set_rflags(struct x86_emulate_ctxt *ctxt, ulong val) +{ + kvm_set_rflags(emul_to_vcpu(ctxt), val); +} + static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) { return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { .set_idt = emulator_set_idt, .get_cr = emulator_get_cr, .set_cr = emulator_set_cr, + .set_rflags = emulator_set_rflags, .cpl = emulator_get_cpl, .get_dr = emulator_get_dr, .set_dr = emulator_set_dr, -- 1.7.6.5 ^ permalink raw reply related [flat|nested] 36+ messages in thread
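To make the commit message of patch 3/3 concrete, here is a minimal model of the CPL derivation the VMX side relies on (real mode reads as CPL 0, VM86 as CPL 3, otherwise the CS selector's RPL). This illustrates the idea only; it is not KVM's actual implementation:

#include <stdint.h>
#include <stdio.h>

#define X86_CR0_PE    (1u << 0)
#define X86_EFLAGS_VM (1u << 17)

/* Model: derive the CPL from CR0.PE, RFLAGS.VM and the CS selector. */
static int derived_cpl(uint32_t cr0, uint32_t rflags, uint16_t cs_selector)
{
    if (!(cr0 & X86_CR0_PE))
        return 0;                   /* real mode */
    if (rflags & X86_EFLAGS_VM)
        return 3;                   /* virtual-8086 mode */
    return cs_selector & 3;         /* protected mode: CS RPL */
}

int main(void)
{
    /*
     * Switching into VM86: as soon as the new RFLAGS (with VM set) is
     * visible to the vcpu, the derived CPL reads as 3 -- which is why
     * the patch pushes eflags via ->set_rflags() before the segment
     * descriptors of the new task are loaded.
     */
    printf("%d\n", derived_cpl(X86_CR0_PE, X86_EFLAGS_VM | 0x2, 0x0000)); /* 3 */
    printf("%d\n", derived_cpl(X86_CR0_PE, 0x2, 0x0008));                 /* 0 */
    return 0;
}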
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-27 19:23 ` [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch Kevin Wolf @ 2012-01-30 10:24 ` Avi Kivity 2012-01-30 10:56 ` Gleb Natapov 2012-01-30 11:05 ` Kevin Wolf 0 siblings, 2 replies; 36+ messages in thread From: Avi Kivity @ 2012-01-30 10:24 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti On 01/27/2012 09:23 PM, Kevin Wolf wrote: > Task switches can switch between Protected Mode and VM86. The current > mode must be updated during the task switch emulation so that the new > segment selectors are interpreted correctly and privilege checks > succeed. > > VMX code calculates the CPL from the code segment selector and rflags, > so it needs rflags to be updated in the vcpu struct. SVM stores the DPL > of the code segment instead, so we must be sure to give the right one > when updating the selector. svm has vmcb_save_area::cpl; it's independent of CS.RPL. > > Signed-off-by: Kevin Wolf <kwolf@redhat.com> > --- > arch/x86/include/asm/kvm_emulate.h | 1 + > arch/x86/kvm/emulate.c | 26 ++++++++++++++++++++++++++ > arch/x86/kvm/x86.c | 6 ++++++ > 3 files changed, 33 insertions(+), 0 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h > index c8a9cf3..4a21c7d 100644 > --- a/arch/x86/include/asm/kvm_emulate.h > +++ b/arch/x86/include/asm/kvm_emulate.h > @@ -176,6 +176,7 @@ struct x86_emulate_ops { > void (*set_idt)(struct x86_emulate_ctxt *ctxt, struct desc_ptr *dt); > ulong (*get_cr)(struct x86_emulate_ctxt *ctxt, int cr); > int (*set_cr)(struct x86_emulate_ctxt *ctxt, int cr, ulong val); > + void (*set_rflags)(struct x86_emulate_ctxt *ctxt, ulong val); > int (*cpl)(struct x86_emulate_ctxt *ctxt); > int (*get_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong *dest); > int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value); > diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c > index 833969e..143ce8e 100644 > --- a/arch/x86/kvm/emulate.c > +++ b/arch/x86/kvm/emulate.c > @@ -552,6 +552,14 @@ static void set_segment_selector(struct x86_emulate_ctxt *ctxt, u16 selector, > struct desc_struct desc; > > ctxt->ops->get_segment(ctxt, &dummy, &desc, &base3, seg); > + > + if (ctxt->mode == X86EMUL_MODE_REAL) > + desc.dpl = 0; Can't happen. > + else if (ctxt->mode == X86EMUL_MODE_VM86) > + desc.dpl = 3; > + else > + desc.dpl = selector & 0x3; I don't think this is right. The SDM explicitly says that just the selectors are loaded, yet you're modifying the DPL here too. Unless there's an abort, we'll be loading the descriptors later anyway (with their DPL). If there's an abort, I think we should continue with the mismatched DPL. > + > ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); > } > > @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, > return emulate_gp(ctxt, 0); > ctxt->_eip = tss->eip; > ctxt->eflags = tss->eflags | 2; > + > + /* > + * If we're switching between Protected Mode and VM86, we need to make > + * sure to update the mode before loading the segment descriptors so > + * that the selectors are interpreted correctly. > + * > + * Need to get it to the vcpu struct immediately because it influences > + * the CPL which is checked at least when loading the segment > + * descriptors and when pushing an error code to the new kernel stack. 
> + */ > + if (ctxt->eflags & X86_EFLAGS_VM) > + ctxt->mode = X86EMUL_MODE_VM86; > + else > + ctxt->mode = X86EMUL_MODE_PROT32; > + Shouldn't this be done after the set_segment_selector() block? My interpretation of the SDM is that if a fault happens while loading descriptors the fault happens with old segment cache, that is, it needs to be interpreted according to the old mode. > + ctxt->ops->set_rflags(ctxt, ctxt->eflags); > + > + /* General purpose registers */ > ctxt->regs[VCPU_REGS_RAX] = tss->eax; > ctxt->regs[VCPU_REGS_RCX] = tss->ecx; > ctxt->regs[VCPU_REGS_RDX] = tss->edx; > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index dc3e945..502b5c3 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -4040,6 +4040,11 @@ static int emulator_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val) > return res; > } > > +static void emulator_set_rflags(struct x86_emulate_ctxt *ctxt, ulong val) > +{ > + kvm_set_rflags(emul_to_vcpu(ctxt), val); > +} > + > static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) > { > return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); > @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { > .set_idt = emulator_set_idt, > .get_cr = emulator_get_cr, > .set_cr = emulator_set_cr, > + .set_rflags = emulator_set_rflags, > .cpl = emulator_get_cpl, > .get_dr = emulator_get_dr, > .set_dr = emulator_set_dr, It would be good to switch the entire emulator to use ->set_rflags() instead of sometimes letting the caller do it. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 10:24 ` Avi Kivity @ 2012-01-30 10:56 ` Gleb Natapov 2012-01-30 12:02 ` Avi Kivity 2012-01-30 11:05 ` Kevin Wolf 1 sibling, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 10:56 UTC (permalink / raw) To: Avi Kivity; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 12:24:11PM +0200, Avi Kivity wrote: > > + > > ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); > > } > > > > @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, > > return emulate_gp(ctxt, 0); > > ctxt->_eip = tss->eip; > > ctxt->eflags = tss->eflags | 2; > > + > > + /* > > + * If we're switching between Protected Mode and VM86, we need to make > > + * sure to update the mode before loading the segment descriptors so > > + * that the selectors are interpreted correctly. > > + * > > + * Need to get it to the vcpu struct immediately because it influences > > + * the CPL which is checked at least when loading the segment > > + * descriptors and when pushing an error code to the new kernel stack. > > + */ > > + if (ctxt->eflags & X86_EFLAGS_VM) > > + ctxt->mode = X86EMUL_MODE_VM86; > > + else > > + ctxt->mode = X86EMUL_MODE_PROT32; > > + > > Shouldn't this be done after the set_segment_selector() block? My > interpretation of the SDM is that if a fault happens while loading > descriptors the fault happens with old segment cache, that is, it needs > to be interpreted according to the old mode. > No, spec says: Any errors associated with this loading and qualification occur in the context of the new task. -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 10:56 ` Gleb Natapov @ 2012-01-30 12:02 ` Avi Kivity 2012-01-30 12:04 ` Gleb Natapov 0 siblings, 1 reply; 36+ messages in thread From: Avi Kivity @ 2012-01-30 12:02 UTC (permalink / raw) To: Gleb Natapov; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 12:56 PM, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 12:24:11PM +0200, Avi Kivity wrote: > > > + > > > ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); > > > } > > > > > > @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, > > > return emulate_gp(ctxt, 0); > > > ctxt->_eip = tss->eip; > > > ctxt->eflags = tss->eflags | 2; > > > + > > > + /* > > > + * If we're switching between Protected Mode and VM86, we need to make > > > + * sure to update the mode before loading the segment descriptors so > > > + * that the selectors are interpreted correctly. > > > + * > > > + * Need to get it to the vcpu struct immediately because it influences > > > + * the CPL which is checked at least when loading the segment > > > + * descriptors and when pushing an error code to the new kernel stack. > > > + */ > > > + if (ctxt->eflags & X86_EFLAGS_VM) > > > + ctxt->mode = X86EMUL_MODE_VM86; > > > + else > > > + ctxt->mode = X86EMUL_MODE_PROT32; > > > + > > > > Shouldn't this be done after the set_segment_selector() block? My > > interpretation of the SDM is that if a fault happens while loading > > descriptors the fault happens with old segment cache, that is, it needs > > to be interpreted according to the old mode. > > > No, spec says: > Any errors associated with this loading and qualification occur in the > context of the new task. There aren't any possible errors until that point. But we don't want set_segment_selector() to use the new mode, it just changes the segment selectors, nothing else. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 12:02 ` Avi Kivity @ 2012-01-30 12:04 ` Gleb Natapov 2012-01-30 13:24 ` Avi Kivity 0 siblings, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 12:04 UTC (permalink / raw) To: Avi Kivity; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 02:02:14PM +0200, Avi Kivity wrote: > On 01/30/2012 12:56 PM, Gleb Natapov wrote: > > On Mon, Jan 30, 2012 at 12:24:11PM +0200, Avi Kivity wrote: > > > > + > > > > ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); > > > > } > > > > > > > > @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, > > > > return emulate_gp(ctxt, 0); > > > > ctxt->_eip = tss->eip; > > > > ctxt->eflags = tss->eflags | 2; > > > > + > > > > + /* > > > > + * If we're switching between Protected Mode and VM86, we need to make > > > > + * sure to update the mode before loading the segment descriptors so > > > > + * that the selectors are interpreted correctly. > > > > + * > > > > + * Need to get it to the vcpu struct immediately because it influences > > > > + * the CPL which is checked at least when loading the segment > > > > + * descriptors and when pushing an error code to the new kernel stack. > > > > + */ > > > > + if (ctxt->eflags & X86_EFLAGS_VM) > > > > + ctxt->mode = X86EMUL_MODE_VM86; > > > > + else > > > > + ctxt->mode = X86EMUL_MODE_PROT32; > > > > + > > > > > > Shouldn't this be done after the set_segment_selector() block? My > > > interpretation of the SDM is that if a fault happens while loading > > > descriptors the fault happens with old segment cache, that is, it needs > > > to be interpreted according to the old mode. > > > > > No, spec says: > > Any errors associated with this loading and qualification occur in the > > context of the new task. > > There aren't any possible errors until that point. But we don't want > set_segment_selector() to use the new mode, it just changes the segment > selectors, nothing else. > I agree with that set_segment_selector() part, but load_segment_descriptor() happens on a new task. -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 12:04 ` Gleb Natapov @ 2012-01-30 13:24 ` Avi Kivity 0 siblings, 0 replies; 36+ messages in thread From: Avi Kivity @ 2012-01-30 13:24 UTC (permalink / raw) To: Gleb Natapov; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 02:04 PM, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 02:02:14PM +0200, Avi Kivity wrote: > > On 01/30/2012 12:56 PM, Gleb Natapov wrote: > > > On Mon, Jan 30, 2012 at 12:24:11PM +0200, Avi Kivity wrote: > > > > > + > > > > > ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); > > > > > } > > > > > > > > > > @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, > > > > > return emulate_gp(ctxt, 0); > > > > > ctxt->_eip = tss->eip; > > > > > ctxt->eflags = tss->eflags | 2; > > > > > + > > > > > + /* > > > > > + * If we're switching between Protected Mode and VM86, we need to make > > > > > + * sure to update the mode before loading the segment descriptors so > > > > > + * that the selectors are interpreted correctly. > > > > > + * > > > > > + * Need to get it to the vcpu struct immediately because it influences > > > > > + * the CPL which is checked at least when loading the segment > > > > > + * descriptors and when pushing an error code to the new kernel stack. > > > > > + */ > > > > > + if (ctxt->eflags & X86_EFLAGS_VM) > > > > > + ctxt->mode = X86EMUL_MODE_VM86; > > > > > + else > > > > > + ctxt->mode = X86EMUL_MODE_PROT32; > > > > > + > > > > > > > > Shouldn't this be done after the set_segment_selector() block? My > > > > interpretation of the SDM is that if a fault happens while loading > > > > descriptors the fault happens with old segment cache, that is, it needs > > > > to be interpreted according to the old mode. > > > > > > > No, spec says: > > > Any errors associated with this loading and qualification occur in the > > > context of the new task. > > > > There aren't any possible errors until that point. But we don't want > > set_segment_selector() to use the new mode, it just changes the segment > > selectors, nothing else. > > > I agree with that set_segment_selector() part, but > load_segment_descriptor() happens on a new task. > Yes. What I think is that the mode change should happen on the boundary, after the set_segment_selector() block and before the load_segment_descriptor() block. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 10:24 ` Avi Kivity 2012-01-30 10:56 ` Gleb Natapov @ 2012-01-30 11:05 ` Kevin Wolf 2012-01-30 11:09 ` Gleb Natapov 2012-01-30 13:23 ` Avi Kivity 1 sibling, 2 replies; 36+ messages in thread From: Kevin Wolf @ 2012-01-30 11:05 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti Am 30.01.2012 11:24, schrieb Avi Kivity: > On 01/27/2012 09:23 PM, Kevin Wolf wrote: >> Task switches can switch between Protected Mode and VM86. The current >> mode must be updated during the task switch emulation so that the new >> segment selectors are interpreted correctly and privilege checks >> succeed. >> >> VMX code calculates the CPL from the code segment selector and rflags, >> so it needs rflags to be updated in the vcpu struct. SVM stores the DPL >> of the code segment instead, so we must be sure to give the right one >> when updating the selector. > > svm has vmcb_save_area::cpl; it's independent of CS.RPL. Right, but cpl in the VMCB is updated when you reload a segment (and I think only then), so it's closely related. >> Signed-off-by: Kevin Wolf <kwolf@redhat.com> >> --- >> arch/x86/include/asm/kvm_emulate.h | 1 + >> arch/x86/kvm/emulate.c | 26 ++++++++++++++++++++++++++ >> arch/x86/kvm/x86.c | 6 ++++++ >> 3 files changed, 33 insertions(+), 0 deletions(-) >> >> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h >> index c8a9cf3..4a21c7d 100644 >> --- a/arch/x86/include/asm/kvm_emulate.h >> +++ b/arch/x86/include/asm/kvm_emulate.h >> @@ -176,6 +176,7 @@ struct x86_emulate_ops { >> void (*set_idt)(struct x86_emulate_ctxt *ctxt, struct desc_ptr *dt); >> ulong (*get_cr)(struct x86_emulate_ctxt *ctxt, int cr); >> int (*set_cr)(struct x86_emulate_ctxt *ctxt, int cr, ulong val); >> + void (*set_rflags)(struct x86_emulate_ctxt *ctxt, ulong val); >> int (*cpl)(struct x86_emulate_ctxt *ctxt); >> int (*get_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong *dest); >> int (*set_dr)(struct x86_emulate_ctxt *ctxt, int dr, ulong value); >> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c >> index 833969e..143ce8e 100644 >> --- a/arch/x86/kvm/emulate.c >> +++ b/arch/x86/kvm/emulate.c >> @@ -552,6 +552,14 @@ static void set_segment_selector(struct x86_emulate_ctxt *ctxt, u16 selector, >> struct desc_struct desc; >> >> ctxt->ops->get_segment(ctxt, &dummy, &desc, &base3, seg); >> + >> + if (ctxt->mode == X86EMUL_MODE_REAL) >> + desc.dpl = 0; > > Can't happen. set_segment_selector is only called during task switches right now, so yes, this is impossible. Want me to drop the check? Or BUG()? >> + else if (ctxt->mode == X86EMUL_MODE_VM86) >> + desc.dpl = 3; >> + else >> + desc.dpl = selector & 0x3; > > I don't think this is right. The SDM explicitly says that just the > selectors are loaded, yet you're modifying the DPL here too. Unless > there's an abort, we'll be loading the descriptors later anyway (with > their DPL). If there's an abort, I think we should continue with the > mismatched DPL. 
> >> + >> ctxt->ops->set_segment(ctxt, selector, &desc, base3, seg); >> } >> >> @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, >> return emulate_gp(ctxt, 0); >> ctxt->_eip = tss->eip; >> ctxt->eflags = tss->eflags | 2; >> + >> + /* >> + * If we're switching between Protected Mode and VM86, we need to make >> + * sure to update the mode before loading the segment descriptors so >> + * that the selectors are interpreted correctly. >> + * >> + * Need to get it to the vcpu struct immediately because it influences >> + * the CPL which is checked at least when loading the segment >> + * descriptors and when pushing an error code to the new kernel stack. >> + */ >> + if (ctxt->eflags & X86_EFLAGS_VM) >> + ctxt->mode = X86EMUL_MODE_VM86; >> + else >> + ctxt->mode = X86EMUL_MODE_PROT32; >> + > > Shouldn't this be done after the set_segment_selector() block? My > interpretation of the SDM is that if a fault happens while loading > descriptors the fault happens with old segment cache, that is, it needs > to be interpreted according to the old mode. This is closely related to my argument with Gleb whether CPL changes when rflags is reloaded or if it only happens when cs is reloaded. I argued that it makes more sense to couple it with cs, so I guess I have to agree with you mostly. The SDM says that any faults during loading the segment descriptors happen in the context of the new task, and the task switch is completed before an exception is generated. So it shouldn't make any difference to the guest what the exact point is at which we change CPL, it's an KVM internal decision. So what about this order: 1. Reload general purpose registers and eflags without updating mode or writing back rflags to the vcpu struct. 2. Load segment selectors without changing the DPL yet. 3. Load segment descriptors, and disable the privilege checks in load_segment_descriptor using a new flag. For SVM, this updates the CPL when cs is reloaded. 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. Should be cleaned up in a follow-up patch, so that VMX uses set_segment to update the CPL, like SVM does. Would this match your interpretation? >> + ctxt->ops->set_rflags(ctxt, ctxt->eflags); >> + >> + /* General purpose registers */ >> ctxt->regs[VCPU_REGS_RAX] = tss->eax; >> ctxt->regs[VCPU_REGS_RCX] = tss->ecx; >> ctxt->regs[VCPU_REGS_RDX] = tss->edx; >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index dc3e945..502b5c3 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -4040,6 +4040,11 @@ static int emulator_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val) >> return res; >> } >> >> +static void emulator_set_rflags(struct x86_emulate_ctxt *ctxt, ulong val) >> +{ >> + kvm_set_rflags(emul_to_vcpu(ctxt), val); >> +} >> + >> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) >> { >> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); >> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { >> .set_idt = emulator_set_idt, >> .get_cr = emulator_get_cr, >> .set_cr = emulator_set_cr, >> + .set_rflags = emulator_set_rflags, >> .cpl = emulator_get_cpl, >> .get_dr = emulator_get_dr, >> .set_dr = emulator_set_dr, > > It would be good to switch the entire emulator to use ->set_rflags() > instead of sometimes letting the caller do it. If we change the CPL only with a cs reload, set_rflags can be dropped completely in the end. Kevin ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 11:05 ` Kevin Wolf @ 2012-01-30 11:09 ` Gleb Natapov 2012-01-30 13:23 ` Avi Kivity 1 sibling, 0 replies; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 11:09 UTC (permalink / raw) To: Kevin Wolf; +Cc: Avi Kivity, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 12:05:37PM +0100, Kevin Wolf wrote: > >> + > >> + /* > >> + * If we're switching between Protected Mode and VM86, we need to make > >> + * sure to update the mode before loading the segment descriptors so > >> + * that the selectors are interpreted correctly. > >> + * > >> + * Need to get it to the vcpu struct immediately because it influences > >> + * the CPL which is checked at least when loading the segment > >> + * descriptors and when pushing an error code to the new kernel stack. > >> + */ > >> + if (ctxt->eflags & X86_EFLAGS_VM) > >> + ctxt->mode = X86EMUL_MODE_VM86; > >> + else > >> + ctxt->mode = X86EMUL_MODE_PROT32; > >> + > > > > Shouldn't this be done after the set_segment_selector() block? My > > interpretation of the SDM is that if a fault happens while loading > > descriptors the fault happens with old segment cache, that is, it needs > > to be interpreted according to the old mode. > > This is closely related to my argument with Gleb whether CPL changes > when rflags is reloaded or if it only happens when cs is reloaded. I > argued that it makes more sense to couple it with cs, so I guess I have > to agree with you mostly. > > The SDM says that any faults during loading the segment descriptors > happen in the context of the new task, and the task switch is completed > before an exception is generated. So it shouldn't make any difference to > the guest what the exact point is at which we change CPL, it's an KVM > internal decision. > Actually no. What if fault happens during loading of CS descriptor? Spec clearly says it should be performed in a new task context. Meaning in vm86 mode. So cpl switch should happen already. > So what about this order: > > 1. Reload general purpose registers and eflags without updating mode > or writing back rflags to the vcpu struct. > > 2. Load segment selectors without changing the DPL yet. > > 3. Load segment descriptors, and disable the privilege checks in > load_segment_descriptor using a new flag. For SVM, this updates > the CPL when cs is reloaded. > > 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. Should be > cleaned up in a follow-up patch, so that VMX uses set_segment > to update the CPL, like SVM does. > > Would this match your interpretation? 
> > >> + ctxt->ops->set_rflags(ctxt, ctxt->eflags); > >> + > >> + /* General purpose registers */ > >> ctxt->regs[VCPU_REGS_RAX] = tss->eax; > >> ctxt->regs[VCPU_REGS_RCX] = tss->ecx; > >> ctxt->regs[VCPU_REGS_RDX] = tss->edx; > >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > >> index dc3e945..502b5c3 100644 > >> --- a/arch/x86/kvm/x86.c > >> +++ b/arch/x86/kvm/x86.c > >> @@ -4040,6 +4040,11 @@ static int emulator_set_cr(struct x86_emulate_ctxt *ctxt, int cr, ulong val) > >> return res; > >> } > >> > >> +static void emulator_set_rflags(struct x86_emulate_ctxt *ctxt, ulong val) > >> +{ > >> + kvm_set_rflags(emul_to_vcpu(ctxt), val); > >> +} > >> + > >> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) > >> { > >> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); > >> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { > >> .set_idt = emulator_set_idt, > >> .get_cr = emulator_get_cr, > >> .set_cr = emulator_set_cr, > >> + .set_rflags = emulator_set_rflags, > >> .cpl = emulator_get_cpl, > >> .get_dr = emulator_get_dr, > >> .set_dr = emulator_set_dr, > > > > It would be good to switch the entire emulator to use ->set_rflags() > > instead of sometimes letting the caller do it. > > If we change the CPL only with a cs reload, set_rflags can be dropped > completely in the end. > > Kevin -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 11:05 ` Kevin Wolf 2012-01-30 11:09 ` Gleb Natapov @ 2012-01-30 13:23 ` Avi Kivity 2012-01-30 14:01 ` Kevin Wolf 1 sibling, 1 reply; 36+ messages in thread From: Avi Kivity @ 2012-01-30 13:23 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 01:05 PM, Kevin Wolf wrote: > Am 30.01.2012 11:24, schrieb Avi Kivity: > > On 01/27/2012 09:23 PM, Kevin Wolf wrote: > >> Task switches can switch between Protected Mode and VM86. The current > >> mode must be updated during the task switch emulation so that the new > >> segment selectors are interpreted correctly and privilege checks > >> succeed. > >> > >> VMX code calculates the CPL from the code segment selector and rflags, > >> so it needs rflags to be updated in the vcpu struct. SVM stores the DPL > >> of the code segment instead, so we must be sure to give the right one > >> when updating the selector. > > > > svm has vmcb_save_area::cpl; it's independent of CS.RPL. > > Right, but cpl in the VMCB is updated when you reload a segment (and I > think only then), Gah, it's broken. It should be qualified by more - real mode should keep cpl at zero, vm86 at 3. And setting rflags.vm86 should update cpl as well. For example, live migration while in real mode with cs&3!=0 or in vm86 mode with cs&3==0 would set the wrong cpl. > so it's closely related. > > >> ctxt->ops->get_segment(ctxt, &dummy, &desc, &base3, seg); > >> + > >> + if (ctxt->mode == X86EMUL_MODE_REAL) > >> + desc.dpl = 0; > > > > Can't happen. > > set_segment_selector is only called during task switches right now, so > yes, this is impossible. Want me to drop the check? Or BUG()? BUG()s are dangerous in rarely used code since they can be exploited as a DoS. So maybe a WARN_ON_ONCE(). > >> @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, > >> return emulate_gp(ctxt, 0); > >> ctxt->_eip = tss->eip; > >> ctxt->eflags = tss->eflags | 2; > >> + > >> + /* > >> + * If we're switching between Protected Mode and VM86, we need to make > >> + * sure to update the mode before loading the segment descriptors so > >> + * that the selectors are interpreted correctly. > >> + * > >> + * Need to get it to the vcpu struct immediately because it influences > >> + * the CPL which is checked at least when loading the segment > >> + * descriptors and when pushing an error code to the new kernel stack. > >> + */ > >> + if (ctxt->eflags & X86_EFLAGS_VM) > >> + ctxt->mode = X86EMUL_MODE_VM86; > >> + else > >> + ctxt->mode = X86EMUL_MODE_PROT32; > >> + > > > > Shouldn't this be done after the set_segment_selector() block? My > > interpretation of the SDM is that if a fault happens while loading > > descriptors the fault happens with old segment cache, that is, it needs > > to be interpreted according to the old mode. > > This is closely related to my argument with Gleb whether CPL changes > when rflags is reloaded or if it only happens when cs is reloaded. I > argued that it makes more sense to couple it with cs, so I guess I have > to agree with you mostly. > > The SDM says that any faults during loading the segment descriptors > happen in the context of the new task, and the task switch is completed > before an exception is generated. So it shouldn't make any difference to > the guest what the exact point is at which we change CPL, it's an KVM > internal decision. > > So what about this order: > > 1. 
Reload general purpose registers and eflags without updating mode > or writing back rflags to the vcpu struct. > > 2. Load segment selectors without changing the DPL yet. (or changing anything in the segment cache) > > 3. Load segment descriptors, and disable the privilege checks in > load_segment_descriptor using a new flag. I don't like doing things that aren't directly traceable to the SDM. Perhaps we should pass the cpl as a parameter to load_segment_descriptor(). Or we should ->set_cpl() just before. > For SVM, this updates > the CPL when cs is reloaded. > > 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. Should be > cleaned up in a follow-up patch, so that VMX uses set_segment > to update the CPL, like SVM does. > > Would this match your interpretation? Not exactly. I claim that the cpl is a separate state from cs.cpl/ss.rpl/cr0.pe/eflags.vm - it cannot be derived just from those. On the transition from real mode to protected mode, you can have (say) cs=0x13 and cpl=0. Outside of mode switches, it's good to have set_segment() update the cpl automatically. But inside mode switches it can screw up further checks or aborts. > >> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) > >> { > >> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); > >> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { > >> .set_idt = emulator_set_idt, > >> .get_cr = emulator_get_cr, > >> .set_cr = emulator_set_cr, > >> + .set_rflags = emulator_set_rflags, > >> .cpl = emulator_get_cpl, > >> .get_dr = emulator_get_dr, > >> .set_dr = emulator_set_dr, > > > > It would be good to switch the entire emulator to use ->set_rflags() > > instead of sometimes letting the caller do it. > > If we change the CPL only with a cs reload, set_rflags can be dropped > completely in the end. That's attractive, yes. Suppose load_segment_descriptor(..., VCPU_SREG_LDTR) fails. What should the new cpl be? Does it matter? -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
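Purely as an illustration of the ->set_cpl() idea floated above (hypothetical wiring; no such op exists in the emulator as of this series):

/*
 * Hypothetical only: a ->set_cpl() callback does not exist in
 * struct x86_emulate_ops at this point; this is what the wiring could
 * look like if the task-switch code committed the new privilege level
 * explicitly instead of deriving it from rflags or a segment write.
 */
struct x86_emulate_ctxt;

struct x86_emulate_ops_sketch {
    int  (*cpl)(struct x86_emulate_ctxt *ctxt);
    void (*set_cpl)(struct x86_emulate_ctxt *ctxt, int cpl);  /* proposed */
};

/*
 * Called from the task-switch path once the new eflags and CS selector
 * are known, but before the descriptors are loaded, so that the later
 * privilege checks already see the new CPL.
 */
static inline void commit_new_cpl(struct x86_emulate_ctxt *ctxt,
                                  const struct x86_emulate_ops_sketch *ops,
                                  unsigned long new_eflags,
                                  unsigned short new_cs_selector)
{
    int cpl = (new_eflags & (1ul << 17)) ? 3      /* EFLAGS.VM -> VM86 */
                                         : (new_cs_selector & 3);

    ops->set_cpl(ctxt, cpl);
}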
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 13:23 ` Avi Kivity @ 2012-01-30 14:01 ` Kevin Wolf 2012-01-30 14:32 ` Avi Kivity 0 siblings, 1 reply; 36+ messages in thread From: Kevin Wolf @ 2012-01-30 14:01 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti Am 30.01.2012 14:23, schrieb Avi Kivity: > On 01/30/2012 01:05 PM, Kevin Wolf wrote: >> Am 30.01.2012 11:24, schrieb Avi Kivity: >>> On 01/27/2012 09:23 PM, Kevin Wolf wrote: >>>> Task switches can switch between Protected Mode and VM86. The current >>>> mode must be updated during the task switch emulation so that the new >>>> segment selectors are interpreted correctly and privilege checks >>>> succeed. >>>> >>>> VMX code calculates the CPL from the code segment selector and rflags, >>>> so it needs rflags to be updated in the vcpu struct. SVM stores the DPL >>>> of the code segment instead, so we must be sure to give the right one >>>> when updating the selector. >>> >>> svm has vmcb_save_area::cpl; it's independent of CS.RPL. >> >> Right, but cpl in the VMCB is updated when you reload a segment (and I >> think only then), > > Gah, it's broken. It should be qualified by more - real mode should > keep cpl at zero, vm86 at 3. And setting rflags.vm86 should update cpl > as well. > > For example, live migration while in real mode with cs&3!=0 or in vm86 > mode with cs&3==0 would set the wrong cpl. Ah, right, I didn't think of this case. Not sure if I should fix it in this series. If we do the right fixes to the task switch, it won't be strictly needed for the bug I'm fixing. >> so it's closely related. >> >>>> ctxt->ops->get_segment(ctxt, &dummy, &desc, &base3, seg); >>>> + >>>> + if (ctxt->mode == X86EMUL_MODE_REAL) >>>> + desc.dpl = 0; >>> >>> Can't happen. >> >> set_segment_selector is only called during task switches right now, so >> yes, this is impossible. Want me to drop the check? Or BUG()? > > BUG()s are dangerous in rarely used code since they can be exploited as > a DoS. So maybe a WARN_ON_ONCE(). Ok. > >>>> @@ -2273,6 +2281,24 @@ static int load_state_from_tss32(struct x86_emulate_ctxt *ctxt, >>>> return emulate_gp(ctxt, 0); >>>> ctxt->_eip = tss->eip; >>>> ctxt->eflags = tss->eflags | 2; >>>> + >>>> + /* >>>> + * If we're switching between Protected Mode and VM86, we need to make >>>> + * sure to update the mode before loading the segment descriptors so >>>> + * that the selectors are interpreted correctly. >>>> + * >>>> + * Need to get it to the vcpu struct immediately because it influences >>>> + * the CPL which is checked at least when loading the segment >>>> + * descriptors and when pushing an error code to the new kernel stack. >>>> + */ >>>> + if (ctxt->eflags & X86_EFLAGS_VM) >>>> + ctxt->mode = X86EMUL_MODE_VM86; >>>> + else >>>> + ctxt->mode = X86EMUL_MODE_PROT32; >>>> + >>> >>> Shouldn't this be done after the set_segment_selector() block? My >>> interpretation of the SDM is that if a fault happens while loading >>> descriptors the fault happens with old segment cache, that is, it needs >>> to be interpreted according to the old mode. >> >> This is closely related to my argument with Gleb whether CPL changes >> when rflags is reloaded or if it only happens when cs is reloaded. I >> argued that it makes more sense to couple it with cs, so I guess I have >> to agree with you mostly. 
>> >> The SDM says that any faults during loading the segment descriptors >> happen in the context of the new task, and the task switch is completed >> before an exception is generated. So it shouldn't make any difference to >> the guest what the exact point is at which we change CPL, it's an KVM >> internal decision. >> >> So what about this order: >> >> 1. Reload general purpose registers and eflags without updating mode >> or writing back rflags to the vcpu struct. >> >> 2. Load segment selectors without changing the DPL yet. > > (or changing anything in the segment cache) > >> >> 3. Load segment descriptors, and disable the privilege checks in >> load_segment_descriptor using a new flag. > > I don't like doing things that aren't directly traceable to the SDM. > Perhaps we should pass the cpl as a parameter to > load_segment_descriptor(). Or we should ->set_cpl() just before. I like the idea of having a ->set_cpl(). For SVM it should be trivial to implement. For VMX, I think instead of clearing VCPU_EXREG_CPL, vcpu_vmx->cpl should directly be updated in set_rflags and set_segment (and in the new set_cpl, obviously). Would that be enough or would we have to avoid clearing it in all other places as well? Where would it be initialised if it's not enough? >> For SVM, this updates >> the CPL when cs is reloaded. > > > >> >> 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. Should be >> cleaned up in a follow-up patch, so that VMX uses set_segment >> to update the CPL, like SVM does. >> >> Would this match your interpretation? > > Not exactly. I claim that the cpl is a separate state from > cs.cpl/ss.rpl/cr0.pe/eflags.vm - it cannot be derived just from those. > On the transition from real mode to protected mode, you can have (say) > cs=0x13 and cpl=0. But even with cs=0x13 cs.dpl would still be 0 (in the segment cache) before cs is reloaded and you switch to a real protected mode segment, right? > Outside of mode switches, it's good to have set_segment() update the cpl > automatically. But inside mode switches it can screw up further checks > or aborts. Can you give a specific example of how it screws things up? >>>> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) >>>> { >>>> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); >>>> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { >>>> .set_idt = emulator_set_idt, >>>> .get_cr = emulator_get_cr, >>>> .set_cr = emulator_set_cr, >>>> + .set_rflags = emulator_set_rflags, >>>> .cpl = emulator_get_cpl, >>>> .get_dr = emulator_get_dr, >>>> .set_dr = emulator_set_dr, >>> >>> It would be good to switch the entire emulator to use ->set_rflags() >>> instead of sometimes letting the caller do it. >> >> If we change the CPL only with a cs reload, set_rflags can be dropped >> completely in the end. > > That's attractive, yes. > > Suppose load_segment_descriptor(..., VCPU_SREG_LDTR) fails. What should > the new cpl be? Does it matter? I think the one matching the new cs/eflags. The SDM says that the task switch is completed without further segment checks and then an exception is thrown. Kevin ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 14:01 ` Kevin Wolf @ 2012-01-30 14:32 ` Avi Kivity 2012-01-30 15:26 ` Kevin Wolf 0 siblings, 1 reply; 36+ messages in thread From: Avi Kivity @ 2012-01-30 14:32 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 04:01 PM, Kevin Wolf wrote: > Am 30.01.2012 14:23, schrieb Avi Kivity: > > On 01/30/2012 01:05 PM, Kevin Wolf wrote: > >> Am 30.01.2012 11:24, schrieb Avi Kivity: > >>> On 01/27/2012 09:23 PM, Kevin Wolf wrote: > >>>> Task switches can switch between Protected Mode and VM86. The current > >>>> mode must be updated during the task switch emulation so that the new > >>>> segment selectors are interpreted correctly and privilege checks > >>>> succeed. > >>>> > >>>> VMX code calculates the CPL from the code segment selector and rflags, > >>>> so it needs rflags to be updated in the vcpu struct. SVM stores the DPL > >>>> of the code segment instead, so we must be sure to give the right one > >>>> when updating the selector. > >>> > >>> svm has vmcb_save_area::cpl; it's independent of CS.RPL. > >> > >> Right, but cpl in the VMCB is updated when you reload a segment (and I > >> think only then), > > > > Gah, it's broken. It should be qualified by more - real mode should > > keep cpl at zero, vm86 at 3. And setting rflags.vm86 should update cpl > > as well. > > > > For example, live migration while in real mode with cs&3!=0 or in vm86 > > mode with cs&3==0 would set the wrong cpl. > > Ah, right, I didn't think of this case. > > Not sure if I should fix it in this series. No need. But we need to have a clear picture of where we're going so we move in the right direction. And that direction is cpl decoupled from cs.rpl/ss.rpl/cr0.pe/eflags.vm (but changes in response to them, except cr0.pe). Note that we don't expose cpl as separate state for save/restore, so the kvm ABI follows vmx instead of svm (unfortunately). > If we do the right fixes to > the task switch, it won't be strictly needed for the bug I'm fixing. Right. > >> > >> 3. Load segment descriptors, and disable the privilege checks in > >> load_segment_descriptor using a new flag. > > > > I don't like doing things that aren't directly traceable to the SDM. > > Perhaps we should pass the cpl as a parameter to > > load_segment_descriptor(). Or we should ->set_cpl() just before. > > I like the idea of having a ->set_cpl(). For SVM it should be trivial to > implement. > > For VMX, I think instead of clearing VCPU_EXREG_CPL, vcpu_vmx->cpl > should directly be updated in set_rflags and set_segment (and in the new > set_cpl, obviously). vcpu_vmx::cpl is currently just a cache, when valid it reflects the state of cr0.pe/eflags.vm/cs.rpl. It's not separate state as the VMCS has nowhere to hold it. vmx cannot virtualize the sequence 'mov $3, %ss; mov $1, %eax; mov %eax, %cr0; nop' - we have emulate_invalid_guest_state for that, but it's not on by default. What you propose is converting vcpu_vmx::cpl from a cache to separate state, but that state can change after a vmexit. It's not entirely clear when it's valid and when it isn't. > Would that be enough or would we have to avoid clearing it in all other > places as well? Where would it be initialised if it's not enough? Maybe vmx_vcpu_reset(). > > >> For SVM, this updates > >> the CPL when cs is reloaded. > > > > > > > >> > >> 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. 
Should be > >> cleaned up in a follow-up patch, so that VMX uses set_segment > >> to update the CPL, like SVM does. > >> > >> Would this match your interpretation? > > > > Not exactly. I claim that the cpl is a separate state from > > cs.cpl/ss.rpl/cr0.pe/eflags.vm - it cannot be derived just from those. > > On the transition from real mode to protected mode, you can have (say) > > cs=0x13 and cpl=0. > > But even with cs=0x13 cs.dpl would still be 0 (in the segment cache) > before cs is reloaded and you switch to a real protected mode segment, > right? Right. But cpl is defined as cs.rpl, not cs.dpl, and I think there are cases where cs.rpl != cs.dpl. > > Outside of mode switches, it's good to have set_segment() update the cpl > > automatically. But inside mode switches it can screw up further checks > > or aborts. > > Can you give a specific example of how it screws things up? KVM_SET_REGS/KVM_SET_SREGS happen non-atomically, so you might have KVM_SET_SREGS raising the CPL to 3 only to lower it afterwards. Things in the middle happen with the wrong CPL. Not sure if anything actually checks it there. The other case is what we're looking at, task switch. To actually update cpl, set_segment() needs to look at cr0.pe and eflags, but these might not have been committed yet. It's all solvable but the solution involves knowing exactly what kvm and the emulator depend on, which makes it very fragile. I think giving the emulator less complicated primitives will make it more readable. > > >>>> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) > >>>> { > >>>> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); > >>>> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { > >>>> .set_idt = emulator_set_idt, > >>>> .get_cr = emulator_get_cr, > >>>> .set_cr = emulator_set_cr, > >>>> + .set_rflags = emulator_set_rflags, > >>>> .cpl = emulator_get_cpl, > >>>> .get_dr = emulator_get_dr, > >>>> .set_dr = emulator_set_dr, > >>> > >>> It would be good to switch the entire emulator to use ->set_rflags() > >>> instead of sometimes letting the caller do it. > >> > >> If we change the CPL only with a cs reload, set_rflags can be dropped > >> completely in the end. > > > > That's attractive, yes. > > > > Suppose load_segment_descriptor(..., VCPU_SREG_LDTR) fails. What should > > the new cpl be? Does it matter? > > I think the one matching the new cs/eflags. The SDM says that the task > switch is completed without further segment checks and then an exception > is thrown. My bet is that it will be 3 if eflags.vm=1 and unchanged otherwise -- the cpl update happens when the segment cache is updated. But that's just a guess. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 14:32 ` Avi Kivity @ 2012-01-30 15:26 ` Kevin Wolf 2012-01-30 15:44 ` Avi Kivity ` (2 more replies) 0 siblings, 3 replies; 36+ messages in thread From: Kevin Wolf @ 2012-01-30 15:26 UTC (permalink / raw) To: Avi Kivity; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti Am 30.01.2012 15:32, schrieb Avi Kivity: > On 01/30/2012 04:01 PM, Kevin Wolf wrote: >> Am 30.01.2012 14:23, schrieb Avi Kivity: >>> On 01/30/2012 01:05 PM, Kevin Wolf wrote: >>>> Am 30.01.2012 11:24, schrieb Avi Kivity: >>>>> On 01/27/2012 09:23 PM, Kevin Wolf wrote: >>>>>> Task switches can switch between Protected Mode and VM86. The current >>>>>> mode must be updated during the task switch emulation so that the new >>>>>> segment selectors are interpreted correctly and privilege checks >>>>>> succeed. >>>>>> >>>>>> VMX code calculates the CPL from the code segment selector and rflags, >>>>>> so it needs rflags to be updated in the vcpu struct. SVM stores the DPL >>>>>> of the code segment instead, so we must be sure to give the right one >>>>>> when updating the selector. >>>>> >>>>> svm has vmcb_save_area::cpl; it's independent of CS.RPL. >>>> >>>> Right, but cpl in the VMCB is updated when you reload a segment (and I >>>> think only then), >>> >>> Gah, it's broken. It should be qualified by more - real mode should >>> keep cpl at zero, vm86 at 3. And setting rflags.vm86 should update cpl >>> as well. >>> >>> For example, live migration while in real mode with cs&3!=0 or in vm86 >>> mode with cs&3==0 would set the wrong cpl. >> >> Ah, right, I didn't think of this case. >> >> Not sure if I should fix it in this series. > > No need. But we need to have a clear picture of where we're going so we > move in the right direction. And that direction is cpl decoupled from > cs.rpl/ss.rpl/cr0.pe/eflags.vm (but changes in response to them, except > cr0.pe). > > Note that we don't expose cpl as separate state for save/restore, so the > kvm ABI follows vmx instead of svm (unfortunately). Which means that restoring a VM that was in the middle of a RM -> PM mode switch will result in corrupted state. Probably not a huge problem in practice, but can we fail the save? >> If we do the right fixes to >> the task switch, it won't be strictly needed for the bug I'm fixing. > > Right. > >>>> >>>> 3. Load segment descriptors, and disable the privilege checks in >>>> load_segment_descriptor using a new flag. >>> >>> I don't like doing things that aren't directly traceable to the SDM. >>> Perhaps we should pass the cpl as a parameter to >>> load_segment_descriptor(). Or we should ->set_cpl() just before. >> >> I like the idea of having a ->set_cpl(). For SVM it should be trivial to >> implement. >> >> For VMX, I think instead of clearing VCPU_EXREG_CPL, vcpu_vmx->cpl >> should directly be updated in set_rflags and set_segment (and in the new >> set_cpl, obviously). > > vcpu_vmx::cpl is currently just a cache, when valid it reflects the > state of cr0.pe/eflags.vm/cs.rpl. It's not separate state as the VMCS > has nowhere to hold it. vmx cannot virtualize the sequence 'mov $3, > %ss; mov $1, %eax; mov %eax, %cr0; nop' - we have > emulate_invalid_guest_state for that, but it's not on by default. > > What you propose is converting vcpu_vmx::cpl from a cache to separate > state, but that state can change after a vmexit. It's not entirely > clear when it's valid and when it isn't. It's the only way I can imagine to get a set_cpl() callback. 
Do you have another one? That it's not entirely clear where it's valid and where it needs to be set is exactly the problem I have with it. Probably even more than you as I'm not very familiar with the KVM code. >> Would that be enough or would we have to avoid clearing it in all other >> places as well? Where would it be initialised if it's not enough? > > Maybe vmx_vcpu_reset(). Do all CPL changes go through set_cr0/segment/rflags/cpl? I guess yes, so initialising on reset and keeping it valid all the time should be possible indeed. >>>> For SVM, this updates >>>> the CPL when cs is reloaded. >>> >>> >>> >>>> >>>> 4. Call ctxt->ops.set_rflag so that VMX updates the CPL. Should be >>>> cleaned up in a follow-up patch, so that VMX uses set_segment >>>> to update the CPL, like SVM does. >>>> >>>> Would this match your interpretation? >>> >>> Not exactly. I claim that the cpl is a separate state from >>> cs.cpl/ss.rpl/cr0.pe/eflags.vm - it cannot be derived just from those. >>> On the transition from real mode to protected mode, you can have (say) >>> cs=0x13 and cpl=0. >> >> But even with cs=0x13 cs.dpl would still be 0 (in the segment cache) >> before cs is reloaded and you switch to a real protected mode segment, >> right? > > Right. But cpl is defined as cs.rpl, not cs.dpl, and I think there are > cases where cs.rpl != cs.dpl. Sounds like conforming code segments? A part of protected mode magic that I've never touched and never intended to... But using the RPL instead makes sense in that context. So SVM isn't only wrong in not considering real mode and VM86, but also in calculating CPL from vmcb->save.cs.attrib rather than from the RPL in the selector? >>> Outside of mode switches, it's good to have set_segment() update the cpl >>> automatically. But inside mode switches it can screw up further checks >>> or aborts. >> >> Can you give a specific example of how it screws things up? > > KVM_SET_REGS/KVM_SET_SREGS happen non-atomically, so you might have > KVM_SET_SREGS raising the CPL to 3 only to lower it afterwards. Things > in the middle happen with the wrong CPL. Not sure if anything actually > checks it there. > > The other case is what we're looking at, task switch. To actually > update cpl, set_segment() needs to look at cr0.pe and eflags, but these > might not have been committed yet. It's all solvable but the solution > involves knowing exactly what kvm and the emulator depend on, which > makes it very fragile. I think giving the emulator less complicated > primitives will make it more readable. I think the main problem here is that you have two sets of registers, one in the vcpu struct and one in the emulator context. >>>>>> static int emulator_get_cpl(struct x86_emulate_ctxt *ctxt) >>>>>> { >>>>>> return kvm_x86_ops->get_cpl(emul_to_vcpu(ctxt)); >>>>>> @@ -4199,6 +4204,7 @@ static struct x86_emulate_ops emulate_ops = { >>>>>> .set_idt = emulator_set_idt, >>>>>> .get_cr = emulator_get_cr, >>>>>> .set_cr = emulator_set_cr, >>>>>> + .set_rflags = emulator_set_rflags, >>>>>> .cpl = emulator_get_cpl, >>>>>> .get_dr = emulator_get_dr, >>>>>> .set_dr = emulator_set_dr, >>>>> >>>>> It would be good to switch the entire emulator to use ->set_rflags() >>>>> instead of sometimes letting the caller do it. >>>> >>>> If we change the CPL only with a cs reload, set_rflags can be dropped >>>> completely in the end. >>> >>> That's attractive, yes. >>> >>> Suppose load_segment_descriptor(..., VCPU_SREG_LDTR) fails. What should >>> the new cpl be? Does it matter? 
>> >> I think the one matching the new cs/eflags. The SDM says that the task >> switch is completed without further segment checks and then an exception >> is thrown. > > My bet is that it will be 3 if eflags.vm=1 and unchanged otherwise -- > the cpl update happens when the segment cache is updated. But that's > just a guess. Does even anyone see the new CPL in error cases? An exception is thrown immediately, so cs is reloaded and we get an even newer CPL. So to take any notice of the CPL, the "complete the task switch" part would have to fail a privilege check. The one thing that comes to mind is that pushing an error code could fail, but I don't think this is considered part of the task switch. Kevin ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 15:26 ` Kevin Wolf @ 2012-01-30 15:44 ` Avi Kivity 2012-01-30 15:55 ` Takuya Yoshikawa 2012-01-31 9:37 ` Gleb Natapov 2 siblings, 0 replies; 36+ messages in thread From: Avi Kivity @ 2012-01-30 15:44 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 05:26 PM, Kevin Wolf wrote: > > > > My bet is that it will be 3 if eflags.vm=1 and unchanged otherwise -- > > the cpl update happens when the segment cache is updated. But that's > > just a guess. > > Does even anyone see the new CPL in error cases? An exception is thrown > immediately, so cs is reloaded and we get an even newer CPL. Depends on what we have on the IDT for the exception handler... > So to take > any notice of the CPL, the "complete the task switch" part would have to > fail a privilege check. The one thing that comes to mind is that pushing > an error code could fail, but I don't think this is considered part of > the task switch. Right. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 15:26 ` Kevin Wolf 2012-01-30 15:44 ` Avi Kivity @ 2012-01-30 15:55 ` Takuya Yoshikawa 2012-01-31 9:37 ` Gleb Natapov 2 siblings, 0 replies; 36+ messages in thread From: Takuya Yoshikawa @ 2012-01-30 15:55 UTC (permalink / raw) To: Kevin Wolf Cc: Avi Kivity, kvm, gleb, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, 30 Jan 2012 16:26:06 +0100 Kevin Wolf <kwolf@redhat.com> wrote: > > The other case is what we're looking at, task switch. To actually > > update cpl, set_segment() needs to look at cr0.pe and eflags, but these > > might not have been committed yet. It's all solvable but the solution > > involves knowing exactly what kvm and the emulator depend on, which > > makes it very fragile. I think giving the emulator less complicated > > primitives will make it more readable. > > I think the main problem here is that you have two sets of registers, > one in the vcpu struct and one in the emulator context. > I think we can, at least partly, eliminate the usage of the latter by moving the register initialization to the decode/emulation stage, as was once discussed on the kvm mailing list, and changing the writeback code to use callbacks. But some refactoring is still needed before that. Takuya ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-30 15:26 ` Kevin Wolf 2012-01-30 15:44 ` Avi Kivity 2012-01-30 15:55 ` Takuya Yoshikawa @ 2012-01-31 9:37 ` Gleb Natapov 2012-01-31 10:26 ` Avi Kivity 2 siblings, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-31 9:37 UTC (permalink / raw) To: Kevin Wolf; +Cc: Avi Kivity, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 04:26:06PM +0100, Kevin Wolf wrote: > >> Would that be enough or would we have to avoid clearing it in all other > >> places as well? Where would it be initialised if it's not enough? > > > > Maybe vmx_vcpu_reset(). > > Do all CPL changes go through set_cr0/segment/rflags/cpl? I guess yes, > so initialising on reset and keeping it valid all the time should be > possible indeed. > CPL can be changed while guest is running. SVM saves it for us in cpl field. VMX does not, so we either will have to update cpl on each exit (cpl = cs & 3) or somehow mark it not up-to-date and recalculate on access. Can VMX exit while cpl != cs & 3 or can this happen only during emulation? If it can we cannot know real cpl after exit. -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
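The "mark it not up-to-date and recalculate on access" variant is roughly the pattern the existing VCPU_EXREG_CPL caching in vmx.c amounts to. A simplified sketch of that pattern (not the exact upstream code):

static int vmx_get_cpl(struct kvm_vcpu *vcpu)
{
        struct vcpu_vmx *vmx = to_vmx(vcpu);

        if (!is_protmode(vcpu))
                return 0;       /* real mode always runs at CPL 0 */
        if (kvm_get_rflags(vcpu) & X86_EFLAGS_VM)
                return 3;       /* vm86 always runs at CPL 3 */
        if (!test_bit(VCPU_EXREG_CPL, (unsigned long *)&vcpu->arch.regs_avail)) {
                /* cache is stale: recompute from the guest CS selector */
                __set_bit(VCPU_EXREG_CPL, (unsigned long *)&vcpu->arch.regs_avail);
                vmx->cpl = vmcs_read16(GUEST_CS_SELECTOR) & 3;
        }
        return vmx->cpl;
}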
* Re: [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch 2012-01-31 9:37 ` Gleb Natapov @ 2012-01-31 10:26 ` Avi Kivity 0 siblings, 0 replies; 36+ messages in thread From: Avi Kivity @ 2012-01-31 10:26 UTC (permalink / raw) To: Gleb Natapov; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/31/2012 11:37 AM, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 04:26:06PM +0100, Kevin Wolf wrote: > > >> Would that be enough or would we have to avoid clearing it in all other > > >> places as well? Where would it be initialised if it's not enough? > > > > > > Maybe vmx_vcpu_reset(). > > > > Do all CPL changes go through set_cr0/segment/rflags/cpl? I guess yes, > > so initialising on reset and keeping it valid all the time should be > > possible indeed. > > > CPL can be changed while guest is running. SVM saves it for us in cpl > field. VMX does not, so we either will have to update cpl on each exit > (cpl = cs & 3) or somehow mark it not up-to-date and recalculate on > access. Can VMX exit while cpl != cs & 3 or can this happen only during > emulation? If it can we cannot know real cpl after exit. > Perhaps it can, with unrestricted guests, but I think we don't allow those conditions (we trap cr0 writes). -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-27 19:23 [PATCH v2 0/3] Fix task switches into/out of VM86 Kevin Wolf ` (2 preceding siblings ...) 2012-01-27 19:23 ` [PATCH v2 3/3] KVM: x86 emulator: Allow PM/VM86 switch during task switch Kevin Wolf @ 2012-01-27 19:52 ` Gleb Natapov 2012-01-30 8:48 ` Kevin Wolf 3 siblings, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-27 19:52 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, joerg.roedel, yoshikawa.takuya, avi, mtosatti On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > you test this with SVM? I did some testing on my buggy processor and it looks > as good as it gets, but it would be better if you could confirm. > You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > Kevin Wolf (3): > KVM: x86 emulator: Fix task switch privilege checks > KVM: x86 emulator: VM86 segments must have DPL 3 > KVM: x86 emulator: Allow PM/VM86 switch during task switch > > arch/x86/include/asm/kvm_emulate.h | 3 +- > arch/x86/include/asm/kvm_host.h | 4 +- > arch/x86/kvm/emulate.c | 79 ++++++++++++++++++++++++++++++++--- > arch/x86/kvm/svm.c | 5 ++- > arch/x86/kvm/vmx.c | 8 ++- > arch/x86/kvm/x86.c | 12 ++++- > 6 files changed, 94 insertions(+), 17 deletions(-) > > -- > 1.7.6.5 -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
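For concreteness, the change Gleb is suggesting here would look roughly like the following sketch; whether it is actually needed is what the rest of the thread debates:

static void svm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
{
        struct vcpu_svm *svm = to_svm(vcpu);

        if (rflags & X86_EFLAGS_VM)
                svm->vmcb->save.cpl = 3;        /* vm86 always executes at CPL 3 */
        svm->vmcb->save.rflags = rflags;
}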
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-27 19:52 ` [PATCH v2 0/3] Fix task switches into/out of VM86 Gleb Natapov @ 2012-01-30 8:48 ` Kevin Wolf 2012-01-30 8:55 ` Gleb Natapov 0 siblings, 1 reply; 36+ messages in thread From: Kevin Wolf @ 2012-01-30 8:48 UTC (permalink / raw) To: Gleb Natapov; +Cc: kvm, joerg.roedel, yoshikawa.takuya, avi, mtosatti Am 27.01.2012 20:52, schrieb Gleb Natapov: > On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: >> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of >> you test this with SVM? I did some testing on my buggy processor and it looks >> as good as it gets, but it would be better if you could confirm. >> > You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? SVM updates the CPL when the segment selector for CS is loaded. From a svm.c POV, segment selectors are updated immediately after set_rflags, so it wouldn't really make a difference to do it twice. Kevin ^ permalink raw reply [flat|nested] 36+ messages in thread
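The behaviour Kevin is relying on is the CPL update at the end of svm_set_segment(). Paraphrased from memory, the relevant tail looks roughly like this; note that it derives the CPL from the descriptor DPL, a detail that comes up again later in the thread:

static void svm_set_segment(struct kvm_vcpu *vcpu,
                            struct kvm_segment *var, int seg)
{
        struct vcpu_svm *svm = to_svm(vcpu);

        /* ... copy selector, base, limit and attrib into the vmcb (elided) ... */

        if (seg == VCPU_SREG_CS)
                svm->vmcb->save.cpl = (svm->vmcb->save.cs.attrib
                                       >> SVM_SELECTOR_DPL_SHIFT) & 3;
}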
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 8:48 ` Kevin Wolf @ 2012-01-30 8:55 ` Gleb Natapov 2012-01-30 10:22 ` Gleb Natapov 2012-01-30 10:35 ` Kevin Wolf 0 siblings, 2 replies; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 8:55 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, joerg.roedel, yoshikawa.takuya, avi, mtosatti On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > Am 27.01.2012 20:52, schrieb Gleb Natapov: > > On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > >> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > >> you test this with SVM? I did some testing on my buggy processor and it looks > >> as good as it gets, but it would be better if you could confirm. > >> > > You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > > SVM updates the CPL when the segment selector for CS is loaded. From a > svm.c POV, segment selectors are updated immediately after set_rflags, > so it wouldn't really make a difference to do it twice. > It is too subtle to rely on that. The fact is that checking cpl after set_rflags provides incorrect value. This better be fixed. BTW does load_state_from_tss16() need the same fix? -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 8:55 ` Gleb Natapov @ 2012-01-30 10:22 ` Gleb Natapov 2012-01-30 10:35 ` Kevin Wolf 1 sibling, 0 replies; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 10:22 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, joerg.roedel, yoshikawa.takuya, avi, mtosatti On Mon, Jan 30, 2012 at 10:55:41AM +0200, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > > Am 27.01.2012 20:52, schrieb Gleb Natapov: > > > On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > > >> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > > >> you test this with SVM? I did some testing on my buggy processor and it looks > > >> as good as it gets, but it would be better if you could confirm. > > >> > > > You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > > > > SVM updates the CPL when the segment selector for CS is loaded. From a > > svm.c POV, segment selectors are updated immediately after set_rflags, > > so it wouldn't really make a difference to do it twice. > > > It is too subtle to rely on that. The fact is that checking cpl after > set_rflags provides incorrect value. This better be fixed. BTW does > load_state_from_tss16() need the same fix? > Probably not since you can't switch to vm86 this way. But may be for other reasons. -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 8:55 ` Gleb Natapov 2012-01-30 10:22 ` Gleb Natapov @ 2012-01-30 10:35 ` Kevin Wolf 2012-01-30 10:45 ` Avi Kivity 2012-01-30 10:47 ` Gleb Natapov 1 sibling, 2 replies; 36+ messages in thread From: Kevin Wolf @ 2012-01-30 10:35 UTC (permalink / raw) To: Gleb Natapov; +Cc: kvm, joerg.roedel, yoshikawa.takuya, avi, mtosatti Am 30.01.2012 09:55, schrieb Gleb Natapov: > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: >> Am 27.01.2012 20:52, schrieb Gleb Natapov: >>> On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: >>>> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of >>>> you test this with SVM? I did some testing on my buggy processor and it looks >>>> as good as it gets, but it would be better if you could confirm. >>>> >>> You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? >> >> SVM updates the CPL when the segment selector for CS is loaded. From a >> svm.c POV, segment selectors are updated immediately after set_rflags, >> so it wouldn't really make a difference to do it twice. >> > It is too subtle to rely on that. The fact is that checking cpl after > set_rflags provides incorrect value. This better be fixed. Depends on what value you consider to be correct between reloading eflags and reloading cs. I think it's logical and more consistent to say that CPL only changes when cs is reloaded, but you could argue that it's effective with the reload of rflags. It doesn't make a difference to guests, so we can decide to choose whatever we like. Depending on what we decide on (Gleb and I disagree on this, so more input would be helpful), either VMX or SVM need a cleanup. I think it can be done independent from and on top of this fix. > BTW does load_state_from_tss16() need the same fix? The manual says "Do not use a 16-bit TSS to implement a virtual-8086 task." Actually, I don't think you could do that, even if you wanted, with a 16-bit flags field. Kevin ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 10:35 ` Kevin Wolf @ 2012-01-30 10:45 ` Avi Kivity 2012-01-30 10:50 ` Gleb Natapov 2012-01-30 10:47 ` Gleb Natapov 1 sibling, 1 reply; 36+ messages in thread From: Avi Kivity @ 2012-01-30 10:45 UTC (permalink / raw) To: Kevin Wolf; +Cc: Gleb Natapov, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 12:35 PM, Kevin Wolf wrote: > Am 30.01.2012 09:55, schrieb Gleb Natapov: > > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > >> Am 27.01.2012 20:52, schrieb Gleb Natapov: > >>> On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > >>>> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > >>>> you test this with SVM? I did some testing on my buggy processor and it looks > >>>> as good as it gets, but it would be better if you could confirm. > >>>> > >>> You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > >> > >> SVM updates the CPL when the segment selector for CS is loaded. From a > >> svm.c POV, segment selectors are updated immediately after set_rflags, > >> so it wouldn't really make a difference to do it twice. > >> > > It is too subtle to rely on that. The fact is that checking cpl after > > set_rflags provides incorrect value. This better be fixed. > > Depends on what value you consider to be correct between reloading > eflags and reloading cs. I think it's logical and more consistent to say > that CPL only changes when cs is reloaded, but you could argue that it's > effective with the reload of rflags. It doesn't make a difference to > guests, so we can decide to choose whatever we like. It's best to make it independent (like svm, and force vmx to emulate this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, protected mode (and I think long mode) uses cs.rpl. Making it depend on the mode causes subtle issues during the mode switch - if you switch from real mode to protected mode while cs & 3 != 0 you end up with the wrong cpl until the jmp instruction is executed. > > Depending on what we decide on (Gleb and I disagree on this, so more > input would be helpful), either VMX or SVM need a cleanup. I think it > can be done independent from and on top of this fix. Right. IMO we should follow svm and make vmx be more flexible. One way to do it is to have a new variable for vmx cpl, and reconcile all the places where cpl is stored (cs.rpl, ss.rpl, cr0.pe, rflags.vm) just before entry. If we can't reconcile it, we have to emulate. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
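Purely as an illustration of the "reconcile just before entry" idea: nothing below exists in the series, and the names are invented, but the check could look something like this.

static bool vmx_cpl_reconcilable(struct kvm_vcpu *vcpu)
{
        int cpl = to_vmx(vcpu)->cpl;    /* hypothetical separately-tracked CPL */

        if (!is_protmode(vcpu))
                return cpl == 0;        /* real mode forces CPL 0 */
        if (kvm_get_rflags(vcpu) & X86_EFLAGS_VM)
                return cpl == 3;        /* vm86 forces CPL 3 */
        /* protected mode: the VMX guest-state checks expect cs.rpl/ss.rpl == CPL */
        return cpl == (vmcs_read16(GUEST_CS_SELECTOR) & 3) &&
               cpl == (vmcs_read16(GUEST_SS_SELECTOR) & 3);
}

/* on the entry path: if this returns false, fall back to full emulation,
 * the way emulate_invalid_guest_state handles other unrepresentable states */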
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 10:45 ` Avi Kivity @ 2012-01-30 10:50 ` Gleb Natapov 2012-01-30 11:59 ` Avi Kivity 0 siblings, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 10:50 UTC (permalink / raw) To: Avi Kivity; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 12:45:15PM +0200, Avi Kivity wrote: > On 01/30/2012 12:35 PM, Kevin Wolf wrote: > > Am 30.01.2012 09:55, schrieb Gleb Natapov: > > > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > > >> Am 27.01.2012 20:52, schrieb Gleb Natapov: > > >>> On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > > >>>> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > > >>>> you test this with SVM? I did some testing on my buggy processor and it looks > > >>>> as good as it gets, but it would be better if you could confirm. > > >>>> > > >>> You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > > >> > > >> SVM updates the CPL when the segment selector for CS is loaded. From a > > >> svm.c POV, segment selectors are updated immediately after set_rflags, > > >> so it wouldn't really make a difference to do it twice. > > >> > > > It is too subtle to rely on that. The fact is that checking cpl after > > > set_rflags provides incorrect value. This better be fixed. > > > > Depends on what value you consider to be correct between reloading > > eflags and reloading cs. I think it's logical and more consistent to say > > that CPL only changes when cs is reloaded, but you could argue that it's > > effective with the reload of rflags. It doesn't make a difference to > > guests, so we can decide to choose whatever we like. > > It's best to make it independent (like svm, and force vmx to emulate > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > protected mode (and I think long mode) uses cs.rpl. This is what vmx does, not svm. svm checks vmcb->cpl that can be outdated during emulation. > Making it depend on > the mode causes subtle issues during the mode switch - if you switch > from real mode to protected mode while cs & 3 != 0 you end up with the > wrong cpl until the jmp instruction is executed. > > > > > Depending on what we decide on (Gleb and I disagree on this, so more > > input would be helpful), either VMX or SVM need a cleanup. I think it > > can be done independent from and on top of this fix. > > Right. IMO we should follow svm and make vmx be more flexible. > > One way to do it is to have a new variable for vmx cpl, and reconcile > all the places where cpl is stored (cs.rpl, ss.rpl, cr0.pe, rflags.vm) > just before entry. If we can't reconcile it, we have to emulate. > > > > -- > error compiling committee.c: too many arguments to function -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 10:50 ` Gleb Natapov @ 2012-01-30 11:59 ` Avi Kivity 2012-01-30 12:16 ` Gleb Natapov 2012-01-30 12:31 ` Gleb Natapov 0 siblings, 2 replies; 36+ messages in thread From: Avi Kivity @ 2012-01-30 11:59 UTC (permalink / raw) To: Gleb Natapov; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 12:50 PM, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 12:45:15PM +0200, Avi Kivity wrote: > > On 01/30/2012 12:35 PM, Kevin Wolf wrote: > > > Am 30.01.2012 09:55, schrieb Gleb Natapov: > > > > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > > > >> Am 27.01.2012 20:52, schrieb Gleb Natapov: > > > >>> On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > > > >>>> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > > > >>>> you test this with SVM? I did some testing on my buggy processor and it looks > > > >>>> as good as it gets, but it would be better if you could confirm. > > > >>>> > > > >>> You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > > > >> > > > >> SVM updates the CPL when the segment selector for CS is loaded. From a > > > >> svm.c POV, segment selectors are updated immediately after set_rflags, > > > >> so it wouldn't really make a difference to do it twice. > > > >> > > > > It is too subtle to rely on that. The fact is that checking cpl after > > > > set_rflags provides incorrect value. This better be fixed. > > > > > > Depends on what value you consider to be correct between reloading > > > eflags and reloading cs. I think it's logical and more consistent to say > > > that CPL only changes when cs is reloaded, but you could argue that it's > > > effective with the reload of rflags. It doesn't make a difference to > > > guests, so we can decide to choose whatever we like. > > > > It's best to make it independent (like svm, and force vmx to emulate > > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > > protected mode (and I think long mode) uses cs.rpl. > This is what vmx does, not svm. That's the architectural definition, except for mode switch sequences. vmx implements it directly which means that mode switch sequences sometimes fail, either in guest software (setting cr0.pe while cs & 3 != 0) or in "microcode" (emulate.c). > svm checks vmcb->cpl that can be > outdated during emulation. This decoupling is actually helpful, since you can defer the cpl change until the end of the switch, and avoid inconsistencies like those checked by cs_ss_rpl_check(). -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 11:59 ` Avi Kivity @ 2012-01-30 12:16 ` Gleb Natapov 2012-01-30 13:27 ` Avi Kivity 2012-01-30 12:31 ` Gleb Natapov 1 sibling, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 12:16 UTC (permalink / raw) To: Avi Kivity; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 01:59:34PM +0200, Avi Kivity wrote: > On 01/30/2012 12:50 PM, Gleb Natapov wrote: > > On Mon, Jan 30, 2012 at 12:45:15PM +0200, Avi Kivity wrote: > > > On 01/30/2012 12:35 PM, Kevin Wolf wrote: > > > > Am 30.01.2012 09:55, schrieb Gleb Natapov: > > > > > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > > > > >> Am 27.01.2012 20:52, schrieb Gleb Natapov: > > > > >>> On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > > > > >>>> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > > > > >>>> you test this with SVM? I did some testing on my buggy processor and it looks > > > > >>>> as good as it gets, but it would be better if you could confirm. > > > > >>>> > > > > >>> You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > > > > >> > > > > >> SVM updates the CPL when the segment selector for CS is loaded. From a > > > > >> svm.c POV, segment selectors are updated immediately after set_rflags, > > > > >> so it wouldn't really make a difference to do it twice. > > > > >> > > > > > It is too subtle to rely on that. The fact is that checking cpl after > > > > > set_rflags provides incorrect value. This better be fixed. > > > > > > > > Depends on what value you consider to be correct between reloading > > > > eflags and reloading cs. I think it's logical and more consistent to say > > > > that CPL only changes when cs is reloaded, but you could argue that it's > > > > effective with the reload of rflags. It doesn't make a difference to > > > > guests, so we can decide to choose whatever we like. > > > > > > It's best to make it independent (like svm, and force vmx to emulate > > > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > > > protected mode (and I think long mode) uses cs.rpl. > > This is what vmx does, not svm. > > That's the architectural definition, except for mode switch sequences. > vmx implements it directly which means that mode switch sequences > sometimes fail, either in guest software (setting cr0.pe while cs & 3 != > 0) or in "microcode" (emulate.c). > > > svm checks vmcb->cpl that can be > > outdated during emulation. > > This decoupling is actually helpful, since you can defer the cpl change > until the end of the switch, and avoid inconsistencies like those > checked by cs_ss_rpl_check(). > I am not saying it is not helpful. The fact that it exists tells us that dpl and cpl are not always the same. But cpl change should not be delayed until the end of the switch! Mode switch happens in the middle of a task switch. Task switch happens in 3 stages according to the spec. If error happens during the first one (steps 1-11) it is handled by an old task, if error happens during second stage (12 this is where mode change happens) then anything can happen (we may kill vcpu till reset if we wish) after that new task is running and all errors are handled by a new task. To model this accurately we need to do task switch in this three stages too and do a full register writeback after stage 2 before stage 3. Or alternatively emulator should never access vcpu state during emulation. 
Entire vcpu state should be in emulation ctx. But this is more complicated and slow. -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
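The staged structure Gleb describes could be expressed along these lines; all helper names are invented for illustration, and the key point is only the explicit writeback between stage 2 and stage 3:

static int do_task_switch_staged(struct x86_emulate_ctxt *ctxt,
                                 u16 tss_selector, int reason)
{
        int rc;

        /* stage 1 (steps 1-11): checks and saving the outgoing task;
         * faults here are delivered in the context of the old task */
        rc = task_switch_stage1(ctxt, tss_selector, reason);
        if (rc != X86EMUL_CONTINUE)
                return rc;

        /* stage 2 (step 12): switch to the new task -- eflags, segment
         * selectors, CR3, LDTR selector, and with them the new CPL */
        rc = task_switch_stage2(ctxt, tss_selector);
        if (rc != X86EMUL_CONTINUE)
                return rc;

        /* commit emulator state to the vcpu here, so that any fault raised
         * while loading descriptors is handled by the new task */
        commit_emulator_state(ctxt);

        /* stage 3: load the segment descriptors for the new task */
        return task_switch_stage3(ctxt);
}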
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 12:16 ` Gleb Natapov @ 2012-01-30 13:27 ` Avi Kivity 0 siblings, 0 replies; 36+ messages in thread From: Avi Kivity @ 2012-01-30 13:27 UTC (permalink / raw) To: Gleb Natapov; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 02:16 PM, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 01:59:34PM +0200, Avi Kivity wrote: > > On 01/30/2012 12:50 PM, Gleb Natapov wrote: > > > On Mon, Jan 30, 2012 at 12:45:15PM +0200, Avi Kivity wrote: > > > > On 01/30/2012 12:35 PM, Kevin Wolf wrote: > > > > > Am 30.01.2012 09:55, schrieb Gleb Natapov: > > > > > > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > > > > > >> Am 27.01.2012 20:52, schrieb Gleb Natapov: > > > > > >>> On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > > > > > >>>> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > > > > > >>>> you test this with SVM? I did some testing on my buggy processor and it looks > > > > > >>>> as good as it gets, but it would be better if you could confirm. > > > > > >>>> > > > > > >>> You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > > > > > >> > > > > > >> SVM updates the CPL when the segment selector for CS is loaded. From a > > > > > >> svm.c POV, segment selectors are updated immediately after set_rflags, > > > > > >> so it wouldn't really make a difference to do it twice. > > > > > >> > > > > > > It is too subtle to rely on that. The fact is that checking cpl after > > > > > > set_rflags provides incorrect value. This better be fixed. > > > > > > > > > > Depends on what value you consider to be correct between reloading > > > > > eflags and reloading cs. I think it's logical and more consistent to say > > > > > that CPL only changes when cs is reloaded, but you could argue that it's > > > > > effective with the reload of rflags. It doesn't make a difference to > > > > > guests, so we can decide to choose whatever we like. > > > > > > > > It's best to make it independent (like svm, and force vmx to emulate > > > > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > > > > protected mode (and I think long mode) uses cs.rpl. > > > This is what vmx does, not svm. > > > > That's the architectural definition, except for mode switch sequences. > > vmx implements it directly which means that mode switch sequences > > sometimes fail, either in guest software (setting cr0.pe while cs & 3 != > > 0) or in "microcode" (emulate.c). > > > > > svm checks vmcb->cpl that can be > > > outdated during emulation. > > > > This decoupling is actually helpful, since you can defer the cpl change > > until the end of the switch, and avoid inconsistencies like those > > checked by cs_ss_rpl_check(). > > > I am not saying it is not helpful. The fact that it exists tells us > that dpl and cpl are not always the same. But cpl change should not be > delayed until the end of the switch! Mode switch happens in the middle of > a task switch. Task switch happens in 3 stages according to the spec. If > error happens during the first one (steps 1-11) it is handled by an old > task, if error happens during second stage (12 this is where mode change > happens) then anything can happen (we may kill vcpu till reset if we wish) > after that new task is running and all errors are handled by a new task. Agree. > To model this accurately we need to do task switch in this three stages > too and do a full register writeback after stage 2 before stage 3. 
Or > alternatively emulator should never access vcpu state during emulation. > Entire vcpu state should be in emulation ctx. But this is more > complicated and slow. Speed is immaterial here, but I agree about the complexity. I guess we should do full writeback after stage 2. -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 11:59 ` Avi Kivity 2012-01-30 12:16 ` Gleb Natapov @ 2012-01-30 12:31 ` Gleb Natapov 2012-01-30 13:28 ` Avi Kivity 1 sibling, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 12:31 UTC (permalink / raw) To: Avi Kivity; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 01:59:34PM +0200, Avi Kivity wrote: > > > It's best to make it independent (like svm, and force vmx to emulate > > > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > > > protected mode (and I think long mode) uses cs.rpl. > > This is what vmx does, not svm. > > That's the architectural definition, except for mode switch sequences. > vmx implements it directly which means that mode switch sequences > sometimes fail, either in guest software (setting cr0.pe while cs & 3 != > 0) or in "microcode" (emulate.c). > The fact that vmx check for cpl by (cs & 3) looks incorrect. It should check dpl. -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 12:31 ` Gleb Natapov @ 2012-01-30 13:28 ` Avi Kivity 2012-01-30 13:29 ` Gleb Natapov 0 siblings, 1 reply; 36+ messages in thread From: Avi Kivity @ 2012-01-30 13:28 UTC (permalink / raw) To: Gleb Natapov; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 02:31 PM, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 01:59:34PM +0200, Avi Kivity wrote: > > > > It's best to make it independent (like svm, and force vmx to emulate > > > > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > > > > protected mode (and I think long mode) uses cs.rpl. > > > This is what vmx does, not svm. > > > > That's the architectural definition, except for mode switch sequences. > > vmx implements it directly which means that mode switch sequences > > sometimes fail, either in guest software (setting cr0.pe while cs & 3 != > > 0) or in "microcode" (emulate.c). > > > The fact that vmx check for cpl by (cs & 3) looks incorrect. It > should check dpl. > Which check? -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 13:28 ` Avi Kivity @ 2012-01-30 13:29 ` Gleb Natapov 2012-01-30 13:39 ` Avi Kivity 0 siblings, 1 reply; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 13:29 UTC (permalink / raw) To: Avi Kivity; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On Mon, Jan 30, 2012 at 03:28:07PM +0200, Avi Kivity wrote: > On 01/30/2012 02:31 PM, Gleb Natapov wrote: > > On Mon, Jan 30, 2012 at 01:59:34PM +0200, Avi Kivity wrote: > > > > > It's best to make it independent (like svm, and force vmx to emulate > > > > > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > > > > > protected mode (and I think long mode) uses cs.rpl. > > > > This is what vmx does, not svm. > > > > > > That's the architectural definition, except for mode switch sequences. > > > vmx implements it directly which means that mode switch sequences > > > sometimes fail, either in guest software (setting cr0.pe while cs & 3 != > > > 0) or in "microcode" (emulate.c). > > > > > The fact that vmx check for cpl by (cs & 3) looks incorrect. It > > should check dpl. > > > > Which check? > In vmx_get_cpl(). -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 13:29 ` Gleb Natapov @ 2012-01-30 13:39 ` Avi Kivity 0 siblings, 0 replies; 36+ messages in thread From: Avi Kivity @ 2012-01-30 13:39 UTC (permalink / raw) To: Gleb Natapov; +Cc: Kevin Wolf, kvm, joerg.roedel, yoshikawa.takuya, mtosatti On 01/30/2012 03:29 PM, Gleb Natapov wrote: > On Mon, Jan 30, 2012 at 03:28:07PM +0200, Avi Kivity wrote: > > On 01/30/2012 02:31 PM, Gleb Natapov wrote: > > > On Mon, Jan 30, 2012 at 01:59:34PM +0200, Avi Kivity wrote: > > > > > > It's best to make it independent (like svm, and force vmx to emulate > > > > > > this behaviour). Real mode forces cpl to 0, vm86 forces cpl to 3, > > > > > > protected mode (and I think long mode) uses cs.rpl. > > > > > This is what vmx does, not svm. > > > > > > > > That's the architectural definition, except for mode switch sequences. > > > > vmx implements it directly which means that mode switch sequences > > > > sometimes fail, either in guest software (setting cr0.pe while cs & 3 != > > > > 0) or in "microcode" (emulate.c). > > > > > > > The fact that vmx check for cpl by (cs & 3) looks incorrect. It > > > should check dpl. > > > > > > > Which check? > > > In vmx_get_cpl(). 3A 3.2: Current privilege level (CPL) field — (Bits 0 and 1 of the CS segment register.) Indicates the privilege level of the currently executing program or procedure. The term current privilege level (CPL) refers to the setting of this field. (probably allows a library with a low DPL to be run as part of a program with high RPL/CPL; or you can use a "conforming code segment" as a sort of SUID bit). -- error compiling committee.c: too many arguments to function ^ permalink raw reply [flat|nested] 36+ messages in thread
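To keep the three privilege values discussed in this thread apart, for a 16-bit selector and the descriptor's access-rights byte:

/* requestor privilege level: bits 0-1 of the selector */
#define SELECTOR_RPL(sel)       ((sel) & 3)
/* descriptor privilege level: bits 5-6 of the access-rights byte */
#define DESCRIPTOR_DPL(ar)      (((ar) >> 5) & 3)
/*
 * CPL is defined as the RPL field of the currently loaded CS register.
 * With a conforming code segment the CPL is left unchanged on the
 * transfer, so the loaded cs.rpl (== CPL) can differ from cs.dpl.
 */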
* Re: [PATCH v2 0/3] Fix task switches into/out of VM86 2012-01-30 10:35 ` Kevin Wolf 2012-01-30 10:45 ` Avi Kivity @ 2012-01-30 10:47 ` Gleb Natapov 1 sibling, 0 replies; 36+ messages in thread From: Gleb Natapov @ 2012-01-30 10:47 UTC (permalink / raw) To: Kevin Wolf; +Cc: kvm, joerg.roedel, yoshikawa.takuya, avi, mtosatti On Mon, Jan 30, 2012 at 11:35:00AM +0100, Kevin Wolf wrote: > Am 30.01.2012 09:55, schrieb Gleb Natapov: > > On Mon, Jan 30, 2012 at 09:48:33AM +0100, Kevin Wolf wrote: > >> Am 27.01.2012 20:52, schrieb Gleb Natapov: > >>> On Fri, Jan 27, 2012 at 08:23:33PM +0100, Kevin Wolf wrote: > >>>> I believe this should work with both VMX and SVM now. Gleb, Jörg, can one of > >>>> you test this with SVM? I did some testing on my buggy processor and it looks > >>>> as good as it gets, but it would be better if you could confirm. > >>>> > >>> You forgot to set cpl to 3 in vmcb in svm_set_rflags() when vm86 is enabled, no? > >> > >> SVM updates the CPL when the segment selector for CS is loaded. From a > >> svm.c POV, segment selectors are updated immediately after set_rflags, > >> so it wouldn't really make a difference to do it twice. > >> > > It is too subtle to rely on that. The fact is that checking cpl after > > set_rflags provides incorrect value. This better be fixed. > > Depends on what value you consider to be correct between reloading > eflags and reloading cs. I think it's logical and more consistent to say > that CPL only changes when cs is reloaded, but you could argue that it's > effective with the reload of rflags. It doesn't make a difference to > guests, so we can decide to choose whatever we like. > > Depending on what we decide on (Gleb and I disagree on this, so more > input would be helpful), either VMX or SVM need a cleanup. I think it > can be done independent from and on top of this fix. > I think you made my point (that cpl in svm should be updated on rflags update) by pointing me to this part of the spec: The processor tests the VM flag under three general conditions: • When loading segment registers, to determine whether to use 8086-style address translation. • When decoding instructions, to determine which instructions are not supported in virtual-8086 mode and which instructions are sensitive to IOPL. • When checking privileged instructions, on page accesses, or when performing other permission checks. (Virtual-8086 mode always executes at CPL 3.) Bullet 3 clearly proves it. Furthermore, a task switch loads eflags and the segment selectors at stage 12. After that the CPU runs in the new task, but since the segment descriptors are not loaded yet, CS.dpl is not updated, while the task is already at CPL 3. > > BTW does load_state_from_tss16() need the same fix? > > The manual says "Do not use a 16-bit TSS to implement a virtual-8086 > task." Actually, I don't think you could do that, even if you wanted, > with a 16-bit flags field. > Yes. Maybe there are other reasons to update flags earlier, like the spec specifies, but I can't think of any. We will fix them when we find them. -- Gleb. ^ permalink raw reply [flat|nested] 36+ messages in thread