* [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec
@ 2014-04-17 23:33 Nadav Amit
2014-04-18 0:35 ` [PATCH 1/5] KVM: x86: Fix wrong/stuck PMU when guest does not use PMI Nadav Amit
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Nadav Amit @ 2014-04-17 23:33 UTC (permalink / raw)
To: gleb, pbonzini; +Cc: tglx, mingo, hpa, x86, kvm, linux-kernel, Nadav Amit
This series of patches fix various scenarios in which KVM behavior does not
follow x86 specifications. Each patch actually deals with a separate bug.
These bugs can cause the guest to get stuck (i.e., make no progress), encounter
spurious injected exceptions, or cause guest code to misbehave. As a result
guest OS can potentially fail.
Thanks for reviewing the patches.
Nadav Amit (5):
KVM: x86: Fix wrong/stuck PMU when guest does not use PMI
KVM: x86: Fix CR3 reserved bits
KVM: x86: IN instruction emulation should ignore REP-prefix
KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
KVM: x86: Processor mode may be determined incorrectly
arch/x86/include/asm/kvm_host.h | 6 +-----
arch/x86/kvm/emulate.c | 11 ++++++-----
arch/x86/kvm/pmu.c | 7 +++++--
arch/x86/kvm/x86.c | 27 ++++++---------------------
4 files changed, 18 insertions(+), 33 deletions(-)
--
1.7.10.4
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 1/5] KVM: x86: Fix wrong/stuck PMU when guest does not use PMI
2014-04-17 23:33 [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec Nadav Amit
@ 2014-04-18 0:35 ` Nadav Amit
2014-04-18 0:35 ` [PATCH 2/5] KVM: x86: Fix CR3 reserved bits Nadav Amit
2014-04-18 0:35 ` [PATCH 3/5] KVM: x86: IN instruction emulation should ignore REP-prefix Nadav Amit
2014-04-18 4:11 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Nadav Amit
2014-04-23 20:47 ` [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec Marcelo Tosatti
2 siblings, 2 replies; 16+ messages in thread
From: Nadav Amit @ 2014-04-18 0:35 UTC (permalink / raw)
To: gleb, pbonzini; +Cc: tglx, mingo, hpa, x86, kvm, linux-kernel, Nadav Amit
If a guest enables a performance counter but does not enable PMI, the
hypervisor currently does not reprogram the performance counter once it
overflows. As a result the host performance counter is kept with the original
sampling period which was configured according to the value of the guest's
counter when the counter was enabled.
Such behaviour can cause very bad consequences. The most distrubing one can
cause the guest not to make any progress at all, and keep exiting due to host
PMI before any guest instructions is exeucted. This situation occurs when the
performance counter holds a very high value when the guest enables the
performance counter. As a result the host's sampling period is configured to be
very short. The host then never reconfigures the sampling period and get stuck
at entry->PMI->exit loop. We encountered such a scenario in our experiments.
The solution is to reprogram the counter even if the guest does not use PMI.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
:100644 100644 5c4f631... cbecaa9... M arch/x86/kvm/pmu.c
arch/x86/kvm/pmu.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 5c4f631..cbecaa9 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -108,7 +108,10 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
{
struct kvm_pmc *pmc = perf_event->overflow_handler_context;
struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
- __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+ if (!test_and_set_bit(pmc->idx, (unsigned long *)&pmu->reprogram_pmi)) {
+ __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
+ kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
+ }
}
static void kvm_perf_overflow_intr(struct perf_event *perf_event,
@@ -117,7 +120,7 @@ static void kvm_perf_overflow_intr(struct perf_event *perf_event,
struct kvm_pmc *pmc = perf_event->overflow_handler_context;
struct kvm_pmu *pmu = &pmc->vcpu->arch.pmu;
if (!test_and_set_bit(pmc->idx, (unsigned long *)&pmu->reprogram_pmi)) {
- kvm_perf_overflow(perf_event, data, regs);
+ __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
/*
* Inject PMI. If vcpu was in a guest mode during NMI PMI
--
1.7.10.4
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 2/5] KVM: x86: Fix CR3 reserved bits
2014-04-18 0:35 ` [PATCH 1/5] KVM: x86: Fix wrong/stuck PMU when guest does not use PMI Nadav Amit
@ 2014-04-18 0:35 ` Nadav Amit
2014-05-10 7:13 ` Jan Kiszka
2014-04-18 0:35 ` [PATCH 3/5] KVM: x86: IN instruction emulation should ignore REP-prefix Nadav Amit
1 sibling, 1 reply; 16+ messages in thread
From: Nadav Amit @ 2014-04-18 0:35 UTC (permalink / raw)
To: gleb, pbonzini; +Cc: tglx, mingo, hpa, x86, kvm, linux-kernel, Nadav Amit
According to Intel specifications, PAE and non-PAE does not have any reserved
bits. In long-mode, regardless to PCIDE, only the high bits (above the
physical address) are reserved.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
:100644 100644 7de069af.. e21aee9... M arch/x86/include/asm/kvm_host.h
:100644 100644 205b17e... 1d60374... M arch/x86/kvm/emulate.c
:100644 100644 8b8fc0b... f4d9839... M arch/x86/kvm/x86.c
arch/x86/include/asm/kvm_host.h | 6 +-----
arch/x86/kvm/emulate.c | 4 ----
arch/x86/kvm/x86.c | 25 +++++--------------------
3 files changed, 6 insertions(+), 29 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7de069af..e21aee9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -50,11 +50,7 @@
| X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
| X86_CR0_NW | X86_CR0_CD | X86_CR0_PG))
-#define CR3_PAE_RESERVED_BITS ((X86_CR3_PWT | X86_CR3_PCD) - 1)
-#define CR3_NONPAE_RESERVED_BITS ((PAGE_SIZE-1) & ~(X86_CR3_PWT | X86_CR3_PCD))
-#define CR3_PCID_ENABLED_RESERVED_BITS 0xFFFFFF0000000000ULL
-#define CR3_L_MODE_RESERVED_BITS (CR3_NONPAE_RESERVED_BITS | \
- 0xFFFFFF0000000000ULL)
+#define CR3_L_MODE_RESERVED_BITS 0xFFFFFF0000000000ULL
#define CR4_RESERVED_BITS \
(~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
| X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 205b17e..1d60374 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3386,10 +3386,6 @@ static int check_cr_write(struct x86_emulate_ctxt *ctxt)
ctxt->ops->get_msr(ctxt, MSR_EFER, &efer);
if (efer & EFER_LMA)
rsvd = CR3_L_MODE_RESERVED_BITS;
- else if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_PAE)
- rsvd = CR3_PAE_RESERVED_BITS;
- else if (ctxt->ops->get_cr(ctxt, 0) & X86_CR0_PG)
- rsvd = CR3_NONPAE_RESERVED_BITS;
if (new_val & rsvd)
return emulate_gp(ctxt, 0);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8b8fc0b..f4d9839 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -701,26 +701,11 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
return 0;
}
- if (is_long_mode(vcpu)) {
- if (kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE)) {
- if (cr3 & CR3_PCID_ENABLED_RESERVED_BITS)
- return 1;
- } else
- if (cr3 & CR3_L_MODE_RESERVED_BITS)
- return 1;
- } else {
- if (is_pae(vcpu)) {
- if (cr3 & CR3_PAE_RESERVED_BITS)
- return 1;
- if (is_paging(vcpu) &&
- !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
- return 1;
- }
- /*
- * We don't check reserved bits in nonpae mode, because
- * this isn't enforced, and VMware depends on this.
- */
- }
+ if (is_long_mode(vcpu) && (cr3 & CR3_L_MODE_RESERVED_BITS))
+ return 1;
+ if (is_pae(vcpu) && is_paging(vcpu) &&
+ !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
+ return 1;
vcpu->arch.cr3 = cr3;
__set_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 3/5] KVM: x86: IN instruction emulation should ignore REP-prefix
2014-04-18 0:35 ` [PATCH 1/5] KVM: x86: Fix wrong/stuck PMU when guest does not use PMI Nadav Amit
2014-04-18 0:35 ` [PATCH 2/5] KVM: x86: Fix CR3 reserved bits Nadav Amit
@ 2014-04-18 0:35 ` Nadav Amit
1 sibling, 0 replies; 16+ messages in thread
From: Nadav Amit @ 2014-04-18 0:35 UTC (permalink / raw)
To: gleb, pbonzini; +Cc: tglx, mingo, hpa, x86, kvm, linux-kernel, Nadav Amit
The IN instruction is not be affected by REP-prefix as INS is. Therefore, the
emulation should ignore the REP prefix as well. The current emulator
implementation tries to perform writeback when IN instruction with REP-prefix
is emulated. This causes it to perform wrong memory write or spurious #GP
exception to be injected to the guest.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
:100644 100644 1d60374... 69e2636... M arch/x86/kvm/emulate.c
arch/x86/kvm/emulate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 1d60374..69e2636 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1324,7 +1324,8 @@ static int pio_in_emulated(struct x86_emulate_ctxt *ctxt,
rc->end = n * size;
}
- if (ctxt->rep_prefix && !(ctxt->eflags & EFLG_DF)) {
+ if (ctxt->rep_prefix && (ctxt->d & String) &&
+ !(ctxt->eflags & EFLG_DF)) {
ctxt->dst.data = rc->data + rc->pos;
ctxt->dst.type = OP_MEM_STR;
ctxt->dst.count = (rc->end - rc->pos) / size;
--
1.7.10.4
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
2014-04-17 23:33 [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec Nadav Amit
2014-04-18 0:35 ` [PATCH 1/5] KVM: x86: Fix wrong/stuck PMU when guest does not use PMI Nadav Amit
@ 2014-04-18 4:11 ` Nadav Amit
2014-04-18 4:11 ` [PATCH 5/5] KVM: x86: Processor mode may be determined incorrectly Nadav Amit
2014-04-20 9:26 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Gleb Natapov
2014-04-23 20:47 ` [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec Marcelo Tosatti
2 siblings, 2 replies; 16+ messages in thread
From: Nadav Amit @ 2014-04-18 4:11 UTC (permalink / raw)
To: gleb, pbonzini; +Cc: tglx, mingo, hpa, x86, kvm, linux-kernel, Nadav Amit
When using address-size override prefix with string instructions in long-mode,
ESI/EDI/ECX are zero extended if they are affected by the instruction
(incremented/decremented). Currently, the KVM emulator does not do so.
In addition, although it is not well-documented, when address override prefix
is used with REP-string instruction, RCX high half is zeroed even if ECX was
zero on the first iteration. Therefore, the emulator should clear the upper
part of RCX in this case, as x86 CPUs do.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
:100644 100644 69e2636... a69ed67... M arch/x86/kvm/emulate.c
arch/x86/kvm/emulate.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 69e2636..a69ed67 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -491,6 +491,8 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, in
else
mask = ad_mask(ctxt);
masked_increment(reg, mask, inc);
+ if (ctxt->ad_bytes == 4)
+ *reg &= 0xffffffff;
}
static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
@@ -4567,6 +4569,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
if (ctxt->rep_prefix && (ctxt->d & String)) {
/* All REP prefixes have the same first termination condition */
if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
+ if (ctxt->ad_bytes == 4)
+ *reg_write(ctxt, VCPU_REGS_RCX) = 0;
ctxt->eip = ctxt->_eip;
goto done;
}
--
1.7.10.4
^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH 5/5] KVM: x86: Processor mode may be determined incorrectly
2014-04-18 4:11 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Nadav Amit
@ 2014-04-18 4:11 ` Nadav Amit
2014-04-20 9:26 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Gleb Natapov
1 sibling, 0 replies; 16+ messages in thread
From: Nadav Amit @ 2014-04-18 4:11 UTC (permalink / raw)
To: gleb, pbonzini; +Cc: tglx, mingo, hpa, x86, kvm, linux-kernel, Nadav Amit
If EFER.LMA is off, cs.l does not determine execution mode.
Currently, the emulation engine assumes differently.
Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
:100644 100644 f4d9839... c99f7eb... M arch/x86/kvm/x86.c
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f4d9839..c99f7eb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4887,7 +4887,7 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
ctxt->eip = kvm_rip_read(vcpu);
ctxt->mode = (!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
(ctxt->eflags & X86_EFLAGS_VM) ? X86EMUL_MODE_VM86 :
- cs_l ? X86EMUL_MODE_PROT64 :
+ (cs_l && is_long_mode(vcpu)) ? X86EMUL_MODE_PROT64 :
cs_db ? X86EMUL_MODE_PROT32 :
X86EMUL_MODE_PROT16;
ctxt->guest_mode = is_guest_mode(vcpu);
--
1.7.10.4
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
2014-04-18 4:11 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Nadav Amit
2014-04-18 4:11 ` [PATCH 5/5] KVM: x86: Processor mode may be determined incorrectly Nadav Amit
@ 2014-04-20 9:26 ` Gleb Natapov
2014-04-22 6:04 ` Nadav Amit
1 sibling, 1 reply; 16+ messages in thread
From: Gleb Natapov @ 2014-04-20 9:26 UTC (permalink / raw)
To: Nadav Amit; +Cc: pbonzini, tglx, mingo, hpa, x86, kvm, linux-kernel
On Fri, Apr 18, 2014 at 07:11:33AM +0300, Nadav Amit wrote:
> When using address-size override prefix with string instructions in long-mode,
> ESI/EDI/ECX are zero extended if they are affected by the instruction
> (incremented/decremented). Currently, the KVM emulator does not do so.
>
> In addition, although it is not well-documented, when address override prefix
> is used with REP-string instruction, RCX high half is zeroed even if ECX was
> zero on the first iteration. Therefore, the emulator should clear the upper
> part of RCX in this case, as x86 CPUs do.
>
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
> :100644 100644 69e2636... a69ed67... M arch/x86/kvm/emulate.c
> arch/x86/kvm/emulate.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 69e2636..a69ed67 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -491,6 +491,8 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, in
> else
> mask = ad_mask(ctxt);
> masked_increment(reg, mask, inc);
> + if (ctxt->ad_bytes == 4)
> + *reg &= 0xffffffff;
*reg=(u32)*reg; and you can do it inside else part.
register_address_increment() is used also by jmp_rel and loop instructions,
is this correct for both of those too? Probably yes.
> }
>
> static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
> @@ -4567,6 +4569,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
> if (ctxt->rep_prefix && (ctxt->d & String)) {
> /* All REP prefixes have the same first termination condition */
> if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
> + if (ctxt->ad_bytes == 4)
> + *reg_write(ctxt, VCPU_REGS_RCX) = 0;
Does zero extension happens even if ECX was zero at the beginning on an instruction or only during
ECX modification. If later it is already covered in register_address_increment, no?
> ctxt->eip = ctxt->_eip;
> goto done;
> }
> --
> 1.7.10.4
>
--
Gleb.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
2014-04-20 9:26 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Gleb Natapov
@ 2014-04-22 6:04 ` Nadav Amit
2014-04-23 19:58 ` Marcelo Tosatti
0 siblings, 1 reply; 16+ messages in thread
From: Nadav Amit @ 2014-04-22 6:04 UTC (permalink / raw)
To: Gleb Natapov, Nadav Amit
Cc: pbonzini, tglx, mingo, hpa, x86, kvm, linux-kernel
Gleb,
On 4/20/14, 12:26 PM, Gleb Natapov wrote:
> On Fri, Apr 18, 2014 at 07:11:33AM +0300, Nadav Amit wrote:
>> When using address-size override prefix with string instructions in long-mode,
>> ESI/EDI/ECX are zero extended if they are affected by the instruction
>> (incremented/decremented). Currently, the KVM emulator does not do so.
>>
>> In addition, although it is not well-documented, when address override prefix
>> is used with REP-string instruction, RCX high half is zeroed even if ECX was
>> zero on the first iteration. Therefore, the emulator should clear the upper
>> part of RCX in this case, as x86 CPUs do.
>>
>> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
>> ---
>> :100644 100644 69e2636... a69ed67... M arch/x86/kvm/emulate.c
>> arch/x86/kvm/emulate.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>> index 69e2636..a69ed67 100644
>> --- a/arch/x86/kvm/emulate.c
>> +++ b/arch/x86/kvm/emulate.c
>> @@ -491,6 +491,8 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, in
>> else
>> mask = ad_mask(ctxt);
>> masked_increment(reg, mask, inc);
>> + if (ctxt->ad_bytes == 4)
>> + *reg &= 0xffffffff;
> *reg=(u32)*reg; and you can do it inside else part.
>
> register_address_increment() is used also by jmp_rel and loop instructions,
> is this correct for both of those too? Probably yes.
>
It appears to be so.
Results of 32-bit operations are implicitly zero extended to 64-bit
values, and this appears to apply to all 32 bit operations, including
implicit ones. Therefore it seems to apply to all these operations.
>> }
>>
>> static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
>> @@ -4567,6 +4569,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
>> if (ctxt->rep_prefix && (ctxt->d & String)) {
>> /* All REP prefixes have the same first termination condition */
>> if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
>> + if (ctxt->ad_bytes == 4)
>> + *reg_write(ctxt, VCPU_REGS_RCX) = 0;
> Does zero extension happens even if ECX was zero at the beginning on an instruction or only during
> ECX modification. If later it is already covered in register_address_increment, no?
The observed behaviour of the Sandy-Bridge I use, is that even if ECX is
zero on the first iteration, the high half of RCX is zeroed. Therefore,
this is a different case, which was not covered in
register_address_increment. I agree it is totally undocumented.
Following your previous comment - I may have missed the case in which
loop instruction is executed with ECX = 0 while RCX != 0 and the address
size is 32 bit. I will test this case soon (yet, it is lower on my
priority list).
Nadav
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
2014-04-22 6:04 ` Nadav Amit
@ 2014-04-23 19:58 ` Marcelo Tosatti
2014-04-23 20:11 ` Marcelo Tosatti
0 siblings, 1 reply; 16+ messages in thread
From: Marcelo Tosatti @ 2014-04-23 19:58 UTC (permalink / raw)
To: Nadav Amit
Cc: Gleb Natapov, Nadav Amit, pbonzini, tglx, mingo, hpa, x86, kvm,
linux-kernel
On Tue, Apr 22, 2014 at 09:04:45AM +0300, Nadav Amit wrote:
> Gleb,
>
> On 4/20/14, 12:26 PM, Gleb Natapov wrote:
> >On Fri, Apr 18, 2014 at 07:11:33AM +0300, Nadav Amit wrote:
> >>When using address-size override prefix with string instructions in long-mode,
> >>ESI/EDI/ECX are zero extended if they are affected by the instruction
> >>(incremented/decremented). Currently, the KVM emulator does not do so.
> >>
> >>In addition, although it is not well-documented, when address override prefix
> >>is used with REP-string instruction, RCX high half is zeroed even if ECX was
> >>zero on the first iteration. Therefore, the emulator should clear the upper
> >>part of RCX in this case, as x86 CPUs do.
> >>
> >>Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> >>---
> >>:100644 100644 69e2636... a69ed67... M arch/x86/kvm/emulate.c
> >> arch/x86/kvm/emulate.c | 4 ++++
> >> 1 file changed, 4 insertions(+)
> >>
> >>diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> >>index 69e2636..a69ed67 100644
> >>--- a/arch/x86/kvm/emulate.c
> >>+++ b/arch/x86/kvm/emulate.c
> >>@@ -491,6 +491,8 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, in
> >> else
> >> mask = ad_mask(ctxt);
> >> masked_increment(reg, mask, inc);
> >>+ if (ctxt->ad_bytes == 4)
> >>+ *reg &= 0xffffffff;
> >*reg=(u32)*reg; and you can do it inside else part.
> >
> >register_address_increment() is used also by jmp_rel and loop instructions,
> >is this correct for both of those too? Probably yes.
> >
> It appears to be so.
> Results of 32-bit operations are implicitly zero extended to 64-bit
> values, and this appears to apply to all 32 bit operations,
> including implicit ones. Therefore it seems to apply to all these
> operations.
>
> >> }
> >>
> >> static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
> >>@@ -4567,6 +4569,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
> >> if (ctxt->rep_prefix && (ctxt->d & String)) {
> >> /* All REP prefixes have the same first termination condition */
> >> if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
> >>+ if (ctxt->ad_bytes == 4)
> >>+ *reg_write(ctxt, VCPU_REGS_RCX) = 0;
> >Does zero extension happens even if ECX was zero at the beginning on an instruction or only during
> >ECX modification. If later it is already covered in register_address_increment, no?
> The observed behaviour of the Sandy-Bridge I use, is that even if
> ECX is zero on the first iteration, the high half of RCX is zeroed.
> Therefore, this is a different case, which was not covered in
> register_address_increment. I agree it is totally undocumented.
> Following your previous comment - I may have missed the case in
> which loop instruction is executed with ECX = 0 while RCX != 0 and
> the address size is 32 bit. I will test this case soon (yet, it is
> lower on my priority list).
In 64-bit mode, the operand size for all near branches (CALL, RET, JCC,
JCXZ, JMP, and LOOP) is forced to 64 bits.
These instructions update the 64-bit RIP without the need for a REX
operand-size prefix.
The following aspects of near branches are controlled by the effective
operand size:
• Truncation of the size of the instruction pointer
...
In 64-bit mode, all of the above actions are forced to 64 bits
regardless of operand size prefixes (operand size
prefixes are silently ignored). However, the displacement field for
relative branches is still limited to 32 bits and the
address size for near branches is not forced in 64-bit mode.
Address sizes affect the size of RCX used for JCXZ and LOOP; they also
impact the address calculation for memory
indirect branches. Such addresses are 64 bits by default; but they can
be overridden to 32 bits by an address size
prefix.
So it seems your patch incorrectly handles "rex call" for example.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
2014-04-23 19:58 ` Marcelo Tosatti
@ 2014-04-23 20:11 ` Marcelo Tosatti
2014-04-23 20:53 ` Nadav Amit
0 siblings, 1 reply; 16+ messages in thread
From: Marcelo Tosatti @ 2014-04-23 20:11 UTC (permalink / raw)
To: Nadav Amit
Cc: Gleb Natapov, Nadav Amit, pbonzini, tglx, mingo, hpa, x86, kvm,
linux-kernel
On Wed, Apr 23, 2014 at 04:58:32PM -0300, Marcelo Tosatti wrote:
> On Tue, Apr 22, 2014 at 09:04:45AM +0300, Nadav Amit wrote:
> > Gleb,
> >
> > On 4/20/14, 12:26 PM, Gleb Natapov wrote:
> > >On Fri, Apr 18, 2014 at 07:11:33AM +0300, Nadav Amit wrote:
> > >>When using address-size override prefix with string instructions in long-mode,
> > >>ESI/EDI/ECX are zero extended if they are affected by the instruction
> > >>(incremented/decremented). Currently, the KVM emulator does not do so.
> > >>
> > >>In addition, although it is not well-documented, when address override prefix
> > >>is used with REP-string instruction, RCX high half is zeroed even if ECX was
> > >>zero on the first iteration. Therefore, the emulator should clear the upper
> > >>part of RCX in this case, as x86 CPUs do.
> > >>
> > >>Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> > >>---
> > >>:100644 100644 69e2636... a69ed67... M arch/x86/kvm/emulate.c
> > >> arch/x86/kvm/emulate.c | 4 ++++
> > >> 1 file changed, 4 insertions(+)
> > >>
> > >>diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> > >>index 69e2636..a69ed67 100644
> > >>--- a/arch/x86/kvm/emulate.c
> > >>+++ b/arch/x86/kvm/emulate.c
> > >>@@ -491,6 +491,8 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, in
> > >> else
> > >> mask = ad_mask(ctxt);
> > >> masked_increment(reg, mask, inc);
> > >>+ if (ctxt->ad_bytes == 4)
> > >>+ *reg &= 0xffffffff;
> > >*reg=(u32)*reg; and you can do it inside else part.
> > >
> > >register_address_increment() is used also by jmp_rel and loop instructions,
> > >is this correct for both of those too? Probably yes.
> > >
> > It appears to be so.
> > Results of 32-bit operations are implicitly zero extended to 64-bit
> > values, and this appears to apply to all 32 bit operations,
> > including implicit ones. Therefore it seems to apply to all these
> > operations.
> >
> > >> }
> > >>
> > >> static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
> > >>@@ -4567,6 +4569,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
> > >> if (ctxt->rep_prefix && (ctxt->d & String)) {
> > >> /* All REP prefixes have the same first termination condition */
> > >> if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
> > >>+ if (ctxt->ad_bytes == 4)
> > >>+ *reg_write(ctxt, VCPU_REGS_RCX) = 0;
> > >Does zero extension happens even if ECX was zero at the beginning on an instruction or only during
> > >ECX modification. If later it is already covered in register_address_increment, no?
> > The observed behaviour of the Sandy-Bridge I use, is that even if
> > ECX is zero on the first iteration, the high half of RCX is zeroed.
> > Therefore, this is a different case, which was not covered in
> > register_address_increment. I agree it is totally undocumented.
> > Following your previous comment - I may have missed the case in
> > which loop instruction is executed with ECX = 0 while RCX != 0 and
> > the address size is 32 bit. I will test this case soon (yet, it is
> > lower on my priority list).
>
> In 64-bit mode, the operand size for all near branches (CALL, RET, JCC,
> JCXZ, JMP, and LOOP) is forced to 64 bits.
>
> These instructions update the 64-bit RIP without the need for a REX
> operand-size prefix.
>
> The following aspects of near branches are controlled by the effective
> operand size:
> • Truncation of the size of the instruction pointer
> ...
>
> In 64-bit mode, all of the above actions are forced to 64 bits
> regardless of operand size prefixes (operand size
> prefixes are silently ignored). However, the displacement field for
> relative branches is still limited to 32 bits and the
> address size for near branches is not forced in 64-bit mode.
> Address sizes affect the size of RCX used for JCXZ and LOOP; they also
> impact the address calculation for memory
> indirect branches. Such addresses are 64 bits by default; but they can
> be overridden to 32 bits by an address size
> prefix.
>
> So it seems your patch incorrectly handles "rex call" for example.
Err, operand size is forced to 64-bits, not address size.
"The following aspects of near branches are controlled by the effective
operand size:
• Truncation of the size of the instruction pointer"
Still, "67h call" should not truncate EIP (which your patch does).
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec
2014-04-17 23:33 [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec Nadav Amit
2014-04-18 0:35 ` [PATCH 1/5] KVM: x86: Fix wrong/stuck PMU when guest does not use PMI Nadav Amit
2014-04-18 4:11 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Nadav Amit
@ 2014-04-23 20:47 ` Marcelo Tosatti
2 siblings, 0 replies; 16+ messages in thread
From: Marcelo Tosatti @ 2014-04-23 20:47 UTC (permalink / raw)
To: Nadav Amit; +Cc: gleb, pbonzini, tglx, mingo, hpa, x86, kvm, linux-kernel
On Fri, Apr 18, 2014 at 02:33:06AM +0300, Nadav Amit wrote:
> This series of patches fix various scenarios in which KVM behavior does not
> follow x86 specifications. Each patch actually deals with a separate bug.
> These bugs can cause the guest to get stuck (i.e., make no progress), encounter
> spurious injected exceptions, or cause guest code to misbehave. As a result
> guest OS can potentially fail.
>
> Thanks for reviewing the patches.
>
> Nadav Amit (5):
> KVM: x86: Fix wrong/stuck PMU when guest does not use PMI
> KVM: x86: Fix CR3 reserved bits
> KVM: x86: IN instruction emulation should ignore REP-prefix
> KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
> KVM: x86: Processor mode may be determined incorrectly
Applied all except patch 4.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
2014-04-23 20:11 ` Marcelo Tosatti
@ 2014-04-23 20:53 ` Nadav Amit
2014-04-23 21:01 ` H. Peter Anvin
0 siblings, 1 reply; 16+ messages in thread
From: Nadav Amit @ 2014-04-23 20:53 UTC (permalink / raw)
To: Marcelo Tosatti
Cc: Gleb Natapov, Nadav Amit, pbonzini, tglx, mingo, hpa, x86, kvm,
linux-kernel
On 4/23/14, 11:11 PM, Marcelo Tosatti wrote:
> On Wed, Apr 23, 2014 at 04:58:32PM -0300, Marcelo Tosatti wrote:
>> On Tue, Apr 22, 2014 at 09:04:45AM +0300, Nadav Amit wrote:
>>> Gleb,
>>>
>>> On 4/20/14, 12:26 PM, Gleb Natapov wrote:
>>>> On Fri, Apr 18, 2014 at 07:11:33AM +0300, Nadav Amit wrote:
>>>>> When using address-size override prefix with string instructions in long-mode,
>>>>> ESI/EDI/ECX are zero extended if they are affected by the instruction
>>>>> (incremented/decremented). Currently, the KVM emulator does not do so.
>>>>>
>>>>> In addition, although it is not well-documented, when address override prefix
>>>>> is used with REP-string instruction, RCX high half is zeroed even if ECX was
>>>>> zero on the first iteration. Therefore, the emulator should clear the upper
>>>>> part of RCX in this case, as x86 CPUs do.
>>>>>
>>>>> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
>>>>> ---
>>>>> :100644 100644 69e2636... a69ed67... M arch/x86/kvm/emulate.c
>>>>> arch/x86/kvm/emulate.c | 4 ++++
>>>>> 1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
>>>>> index 69e2636..a69ed67 100644
>>>>> --- a/arch/x86/kvm/emulate.c
>>>>> +++ b/arch/x86/kvm/emulate.c
>>>>> @@ -491,6 +491,8 @@ register_address_increment(struct x86_emulate_ctxt *ctxt, unsigned long *reg, in
>>>>> else
>>>>> mask = ad_mask(ctxt);
>>>>> masked_increment(reg, mask, inc);
>>>>> + if (ctxt->ad_bytes == 4)
>>>>> + *reg &= 0xffffffff;
>>>> *reg=(u32)*reg; and you can do it inside else part.
>>>>
>>>> register_address_increment() is used also by jmp_rel and loop instructions,
>>>> is this correct for both of those too? Probably yes.
>>>>
>>> It appears to be so.
>>> Results of 32-bit operations are implicitly zero extended to 64-bit
>>> values, and this appears to apply to all 32 bit operations,
>>> including implicit ones. Therefore it seems to apply to all these
>>> operations.
>>>
>>>>> }
>>>>>
>>>>> static void rsp_increment(struct x86_emulate_ctxt *ctxt, int inc)
>>>>> @@ -4567,6 +4569,8 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt)
>>>>> if (ctxt->rep_prefix && (ctxt->d & String)) {
>>>>> /* All REP prefixes have the same first termination condition */
>>>>> if (address_mask(ctxt, reg_read(ctxt, VCPU_REGS_RCX)) == 0) {
>>>>> + if (ctxt->ad_bytes == 4)
>>>>> + *reg_write(ctxt, VCPU_REGS_RCX) = 0;
>>>> Does zero extension happens even if ECX was zero at the beginning on an instruction or only during
>>>> ECX modification. If later it is already covered in register_address_increment, no?
>>> The observed behaviour of the Sandy-Bridge I use, is that even if
>>> ECX is zero on the first iteration, the high half of RCX is zeroed.
>>> Therefore, this is a different case, which was not covered in
>>> register_address_increment. I agree it is totally undocumented.
>>> Following your previous comment - I may have missed the case in
>>> which loop instruction is executed with ECX = 0 while RCX != 0 and
>>> the address size is 32 bit. I will test this case soon (yet, it is
>>> lower on my priority list).
>>
>> In 64-bit mode, the operand size for all near branches (CALL, RET, JCC,
>> JCXZ, JMP, and LOOP) is forced to 64 bits.
>>
>> These instructions update the 64-bit RIP without the need for a REX
>> operand-size prefix.
>>
>> The following aspects of near branches are controlled by the effective
>> operand size:
>> • Truncation of the size of the instruction pointer
>> ...
>>
>> In 64-bit mode, all of the above actions are forced to 64 bits
>> regardless of operand size prefixes (operand size
>> prefixes are silently ignored). However, the displacement field for
>> relative branches is still limited to 32 bits and the
>> address size for near branches is not forced in 64-bit mode.
>> Address sizes affect the size of RCX used for JCXZ and LOOP; they also
>> impact the address calculation for memory
>> indirect branches. Such addresses are 64 bits by default; but they can
>> be overridden to 32 bits by an address size
>> prefix.
>>
>> So it seems your patch incorrectly handles "rex call" for example.
>
> Err, operand size is forced to 64-bits, not address size.
>
> "The following aspects of near branches are controlled by the effective
> operand size:
> • Truncation of the size of the instruction pointer"
>
> Still, "67h call" should not truncate EIP (which your patch does).
>
Yes, I missed it.
But if I am not mistaken again, it means that the existing
implementation of jmp_rel is broken as well when address-size override
prefix is used. In this case, as I see it, the existing masking would
cause the carry from the add operation to the lower half of the rip not
to be added to the rip higher half.
I guess another patch is needed for that as well.
Nadav
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops
2014-04-23 20:53 ` Nadav Amit
@ 2014-04-23 21:01 ` H. Peter Anvin
0 siblings, 0 replies; 16+ messages in thread
From: H. Peter Anvin @ 2014-04-23 21:01 UTC (permalink / raw)
To: Nadav Amit, Marcelo Tosatti
Cc: Gleb Natapov, Nadav Amit, pbonzini, tglx, mingo, x86, kvm,
linux-kernel
On 04/23/2014 01:53 PM, Nadav Amit wrote:
>>
>> Err, operand size is forced to 64-bits, not address size.
>>
>> "The following aspects of near branches are controlled by the effective
>> operand size:
>> • Truncation of the size of the instruction pointer"
>>
>> Still, "67h call" should not truncate EIP (which your patch does).
>>
> Yes, I missed it.
> But if I am not mistaken again, it means that the existing
> implementation of jmp_rel is broken as well when address-size override
> prefix is used. In this case, as I see it, the existing masking would
> cause the carry from the add operation to the lower half of the rip not
> to be added to the rip higher half.
>
> I guess another patch is needed for that as well.
>
Yes, on x86 JMP really should be thought of as "MOV ...,IP/EIP/RIP". On
some other architectures, e.g. m68k, JMP acts as if it was
"LEA ...,PC", which causes some serious confusion for people familiar
with that model. However, on x86 considering JMP as a MOV to the IP
register really is very consistent and will give you the right mental model.
-hpa
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/5] KVM: x86: Fix CR3 reserved bits
2014-04-18 0:35 ` [PATCH 2/5] KVM: x86: Fix CR3 reserved bits Nadav Amit
@ 2014-05-10 7:13 ` Jan Kiszka
2014-05-10 7:24 ` [PATCH] KVM: x86: Fix CR3 reserved bits check in long mode Jan Kiszka
0 siblings, 1 reply; 16+ messages in thread
From: Jan Kiszka @ 2014-05-10 7:13 UTC (permalink / raw)
To: Nadav Amit, gleb, pbonzini; +Cc: tglx, mingo, hpa, x86, kvm, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 3501 bytes --]
On 2014-04-18 02:35, Nadav Amit wrote:
> According to Intel specifications, PAE and non-PAE does not have any reserved
> bits. In long-mode, regardless to PCIDE, only the high bits (above the
> physical address) are reserved.
>
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
> :100644 100644 7de069af.. e21aee9... M arch/x86/include/asm/kvm_host.h
> :100644 100644 205b17e... 1d60374... M arch/x86/kvm/emulate.c
> :100644 100644 8b8fc0b... f4d9839... M arch/x86/kvm/x86.c
> arch/x86/include/asm/kvm_host.h | 6 +-----
> arch/x86/kvm/emulate.c | 4 ----
> arch/x86/kvm/x86.c | 25 +++++--------------------
> 3 files changed, 6 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 7de069af..e21aee9 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -50,11 +50,7 @@
> | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP | X86_CR0_AM \
> | X86_CR0_NW | X86_CR0_CD | X86_CR0_PG))
>
> -#define CR3_PAE_RESERVED_BITS ((X86_CR3_PWT | X86_CR3_PCD) - 1)
> -#define CR3_NONPAE_RESERVED_BITS ((PAGE_SIZE-1) & ~(X86_CR3_PWT | X86_CR3_PCD))
> -#define CR3_PCID_ENABLED_RESERVED_BITS 0xFFFFFF0000000000ULL
> -#define CR3_L_MODE_RESERVED_BITS (CR3_NONPAE_RESERVED_BITS | \
> - 0xFFFFFF0000000000ULL)
> +#define CR3_L_MODE_RESERVED_BITS 0xFFFFFF0000000000ULL
> #define CR4_RESERVED_BITS \
> (~(unsigned long)(X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | X86_CR4_DE\
> | X86_CR4_PSE | X86_CR4_PAE | X86_CR4_MCE \
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 205b17e..1d60374 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -3386,10 +3386,6 @@ static int check_cr_write(struct x86_emulate_ctxt *ctxt)
> ctxt->ops->get_msr(ctxt, MSR_EFER, &efer);
> if (efer & EFER_LMA)
> rsvd = CR3_L_MODE_RESERVED_BITS;
> - else if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_PAE)
> - rsvd = CR3_PAE_RESERVED_BITS;
> - else if (ctxt->ops->get_cr(ctxt, 0) & X86_CR0_PG)
> - rsvd = CR3_NONPAE_RESERVED_BITS;
>
> if (new_val & rsvd)
> return emulate_gp(ctxt, 0);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8b8fc0b..f4d9839 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -701,26 +701,11 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> return 0;
> }
>
> - if (is_long_mode(vcpu)) {
> - if (kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE)) {
> - if (cr3 & CR3_PCID_ENABLED_RESERVED_BITS)
> - return 1;
> - } else
> - if (cr3 & CR3_L_MODE_RESERVED_BITS)
> - return 1;
> - } else {
> - if (is_pae(vcpu)) {
> - if (cr3 & CR3_PAE_RESERVED_BITS)
> - return 1;
> - if (is_paging(vcpu) &&
> - !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
> - return 1;
> - }
> - /*
> - * We don't check reserved bits in nonpae mode, because
> - * this isn't enforced, and VMware depends on this.
> - */
> - }
> + if (is_long_mode(vcpu) && (cr3 & CR3_L_MODE_RESERVED_BITS))
> + return 1;
> + if (is_pae(vcpu) && is_paging(vcpu) &&
> + !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
> + return 1;
This is wrong: is_pae returns true in long mode, but we don't have valid
pdptrs then. Crashes my Jailhouse guest.
I suppose we need a patch on top as this is already in kvm.next, right?
Jan
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] KVM: x86: Fix CR3 reserved bits check in long mode
2014-05-10 7:13 ` Jan Kiszka
@ 2014-05-10 7:24 ` Jan Kiszka
2014-05-12 10:46 ` Paolo Bonzini
0 siblings, 1 reply; 16+ messages in thread
From: Jan Kiszka @ 2014-05-10 7:24 UTC (permalink / raw)
To: Marcelo Tosatti, pbonzini
Cc: Nadav Amit, gleb, tglx, mingo, hpa, x86, kvm, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 956 bytes --]
From: Jan Kiszka <jan.kiszka@siemens.com>
Regression of 346874c9: PAE is set in long mode, but that does not mean
we have valid PDPTRs.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
arch/x86/kvm/x86.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c5582c3..198aac8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -701,10 +701,11 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
return 0;
}
- if (is_long_mode(vcpu) && (cr3 & CR3_L_MODE_RESERVED_BITS))
- return 1;
- if (is_pae(vcpu) && is_paging(vcpu) &&
- !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
+ if (is_long_mode(vcpu)) {
+ if (cr3 & CR3_L_MODE_RESERVED_BITS)
+ return 1;
+ } else if (is_pae(vcpu) && is_paging(vcpu) &&
+ !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
return 1;
vcpu->arch.cr3 = cr3;
--
1.8.1.1.298.ge7eed54
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 263 bytes --]
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] KVM: x86: Fix CR3 reserved bits check in long mode
2014-05-10 7:24 ` [PATCH] KVM: x86: Fix CR3 reserved bits check in long mode Jan Kiszka
@ 2014-05-12 10:46 ` Paolo Bonzini
0 siblings, 0 replies; 16+ messages in thread
From: Paolo Bonzini @ 2014-05-12 10:46 UTC (permalink / raw)
To: Jan Kiszka, Marcelo Tosatti
Cc: Nadav Amit, gleb, tglx, mingo, hpa, x86, kvm, linux-kernel
Il 10/05/2014 09:24, Jan Kiszka ha scritto:
> From: Jan Kiszka <jan.kiszka@siemens.com>
>
> Regression of 346874c9: PAE is set in long mode, but that does not mean
> we have valid PDPTRs.
>
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> ---
> arch/x86/kvm/x86.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c5582c3..198aac8 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -701,10 +701,11 @@ int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> return 0;
> }
>
> - if (is_long_mode(vcpu) && (cr3 & CR3_L_MODE_RESERVED_BITS))
> - return 1;
> - if (is_pae(vcpu) && is_paging(vcpu) &&
> - !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
> + if (is_long_mode(vcpu)) {
> + if (cr3 & CR3_L_MODE_RESERVED_BITS)
> + return 1;
> + } else if (is_pae(vcpu) && is_paging(vcpu) &&
> + !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
> return 1;
>
> vcpu->arch.cr3 = cr3;
>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
will apply tomorrow, thanks.
Paolo
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2014-05-12 10:46 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-17 23:33 [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec Nadav Amit
2014-04-18 0:35 ` [PATCH 1/5] KVM: x86: Fix wrong/stuck PMU when guest does not use PMI Nadav Amit
2014-04-18 0:35 ` [PATCH 2/5] KVM: x86: Fix CR3 reserved bits Nadav Amit
2014-05-10 7:13 ` Jan Kiszka
2014-05-10 7:24 ` [PATCH] KVM: x86: Fix CR3 reserved bits check in long mode Jan Kiszka
2014-05-12 10:46 ` Paolo Bonzini
2014-04-18 0:35 ` [PATCH 3/5] KVM: x86: IN instruction emulation should ignore REP-prefix Nadav Amit
2014-04-18 4:11 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Nadav Amit
2014-04-18 4:11 ` [PATCH 5/5] KVM: x86: Processor mode may be determined incorrectly Nadav Amit
2014-04-20 9:26 ` [PATCH 4/5] KVM: x86: RSI/RDI/RCX are zero-extended when affected by string ops Gleb Natapov
2014-04-22 6:04 ` Nadav Amit
2014-04-23 19:58 ` Marcelo Tosatti
2014-04-23 20:11 ` Marcelo Tosatti
2014-04-23 20:53 ` Nadav Amit
2014-04-23 21:01 ` H. Peter Anvin
2014-04-23 20:47 ` [PATCH 0/5] KVM: x86: Fix KVM behavior that does not follow spec Marcelo Tosatti
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).