* [PATCH 0/3] x86: XSA-111 follow-ups
@ 2015-01-08 15:15 Jan Beulich
2015-01-08 15:22 ` [PATCH 1/3] x86: streamline hypercall_create_continuation() Jan Beulich
` (3 more replies)
0 siblings, 4 replies; 16+ messages in thread
From: Jan Beulich @ 2015-01-08 15:15 UTC (permalink / raw)
To: xen-devel; +Cc: Keir Fraser
1: streamline hypercall_create_continuation()
2: HVM: clobber hypercall arguments just like for PV
3: HVM: make hvm_efer_valid() honor guest features
Signed-off-by: Jan Beulich <jbeulich@suse.com>
^ permalink raw reply [flat|nested] 16+ messages in thread* [PATCH 1/3] x86: streamline hypercall_create_continuation() 2015-01-08 15:15 [PATCH 0/3] x86: XSA-111 follow-ups Jan Beulich @ 2015-01-08 15:22 ` Jan Beulich 2015-01-08 16:01 ` Andrew Cooper 2015-01-08 15:22 ` [PATCH 2/3] x86/HVM: clobber hypercall arguments just like for PV Jan Beulich ` (2 subsequent siblings) 3 siblings, 1 reply; 16+ messages in thread From: Jan Beulich @ 2015-01-08 15:22 UTC (permalink / raw) To: xen-devel; +Cc: Keir Fraser [-- Attachment #1: Type: text/plain, Size: 2368 bytes --] - drop clearing of excessive multicall arguments in compat case (no longer needed now that hypercall_xlat_continuation() only checks the actual arguments) - latch current into a local variable - use the cached value of hvm_guest_x86_mode() instead of re-executing it - scope restrict "regs" - while at it, convert the remaining two argument checking BUG_ON()s in hypercall_xlat_continuation() to ASSERT()s Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1675,7 +1675,6 @@ unsigned long hypercall_create_continuat unsigned int op, const char *format, ...) { struct mc_state *mcs = ¤t->mc_state; - struct cpu_user_regs *regs; const char *p = format; unsigned long arg; unsigned int i; @@ -1689,26 +1688,23 @@ unsigned long hypercall_create_continuat for ( i = 0; *p != '\0'; i++ ) mcs->call.args[i] = next_arg(p, args); - if ( is_pv_32on64_domain(current->domain) ) - { - for ( ; i < 6; i++ ) - mcs->call.args[i] = 0; - } } else { - regs = guest_cpu_user_regs(); - regs->eax = op; + struct cpu_user_regs *regs = guest_cpu_user_regs(); + struct vcpu *curr = current; + + regs->eax = op; /* Ensure the hypercall trap instruction is re-executed. */ - if ( is_pv_vcpu(current) ) + if ( is_pv_vcpu(curr) ) regs->eip -= 2; /* re-execute 'syscall' / 'int $xx' */ else - current->arch.hvm_vcpu.hcall_preempted = 1; + curr->arch.hvm_vcpu.hcall_preempted = 1; - if ( is_pv_vcpu(current) ? - !is_pv_32on64_vcpu(current) : - (hvm_guest_x86_mode(current) == 8) ) + if ( is_pv_vcpu(curr) ? + !is_pv_32on64_vcpu(curr) : + curr->arch.hvm_vcpu.hcall_64bit ) { for ( i = 0; *p != '\0'; i++ ) { @@ -1759,9 +1755,8 @@ int hypercall_xlat_continuation(unsigned ASSERT(nr <= ARRAY_SIZE(mcs->call.args)); ASSERT(!(mask >> nr)); - - BUG_ON(id && *id >= nr); - BUG_ON(id && (mask & (1U << *id))); + ASSERT(!id || *id < nr); + ASSERT(!id || !(mask & (1U << *id))); va_start(args, mask); [-- Attachment #2: x86-hcc-streamline.patch --] [-- Type: text/plain, Size: 2413 bytes --] x86: streamline hypercall_create_continuation() - drop clearing of excessive multicall arguments in compat case (no longer needed now that hypercall_xlat_continuation() only checks the actual arguments) - latch current into a local variable - use the cached value of hvm_guest_x86_mode() instead of re-executing it - scope restrict "regs" - while at it, convert the remaining two argument checking BUG_ON()s in hypercall_xlat_continuation() to ASSERT()s Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -1675,7 +1675,6 @@ unsigned long hypercall_create_continuat unsigned int op, const char *format, ...) { struct mc_state *mcs = ¤t->mc_state; - struct cpu_user_regs *regs; const char *p = format; unsigned long arg; unsigned int i; @@ -1689,26 +1688,23 @@ unsigned long hypercall_create_continuat for ( i = 0; *p != '\0'; i++ ) mcs->call.args[i] = next_arg(p, args); - if ( is_pv_32on64_domain(current->domain) ) - { - for ( ; i < 6; i++ ) - mcs->call.args[i] = 0; - } } else { - regs = guest_cpu_user_regs(); - regs->eax = op; + struct cpu_user_regs *regs = guest_cpu_user_regs(); + struct vcpu *curr = current; + + regs->eax = op; /* Ensure the hypercall trap instruction is re-executed. */ - if ( is_pv_vcpu(current) ) + if ( is_pv_vcpu(curr) ) regs->eip -= 2; /* re-execute 'syscall' / 'int $xx' */ else - current->arch.hvm_vcpu.hcall_preempted = 1; + curr->arch.hvm_vcpu.hcall_preempted = 1; - if ( is_pv_vcpu(current) ? - !is_pv_32on64_vcpu(current) : - (hvm_guest_x86_mode(current) == 8) ) + if ( is_pv_vcpu(curr) ? + !is_pv_32on64_vcpu(curr) : + curr->arch.hvm_vcpu.hcall_64bit ) { for ( i = 0; *p != '\0'; i++ ) { @@ -1759,9 +1755,8 @@ int hypercall_xlat_continuation(unsigned ASSERT(nr <= ARRAY_SIZE(mcs->call.args)); ASSERT(!(mask >> nr)); - - BUG_ON(id && *id >= nr); - BUG_ON(id && (mask & (1U << *id))); + ASSERT(!id || *id < nr); + ASSERT(!id || !(mask & (1U << *id))); va_start(args, mask); [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/3] x86: streamline hypercall_create_continuation() 2015-01-08 15:22 ` [PATCH 1/3] x86: streamline hypercall_create_continuation() Jan Beulich @ 2015-01-08 16:01 ` Andrew Cooper 2015-01-08 16:11 ` Jan Beulich 0 siblings, 1 reply; 16+ messages in thread From: Andrew Cooper @ 2015-01-08 16:01 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Keir Fraser [-- Attachment #1.1: Type: text/plain, Size: 2983 bytes --] On 08/01/15 15:22, Jan Beulich wrote: > - drop clearing of excessive multicall arguments in compat case (no > longer needed now that hypercall_xlat_continuation() only checks the > actual arguments) > - latch current into a local variable > - use the cached value of hvm_guest_x86_mode() instead of re-executing > it > - scope restrict "regs" > - while at it, convert the remaining two argument checking BUG_ON()s in > hypercall_xlat_continuation() to ASSERT()s > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> There appear to other many other places which could benifit from a caching of guest_x86_mode() (especially in the nested virt case). Is it worth considering unconditionally calculating on vmexit and removing the function? > > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -1675,7 +1675,6 @@ unsigned long hypercall_create_continuat > unsigned int op, const char *format, ...) > { > struct mc_state *mcs = ¤t->mc_state; > - struct cpu_user_regs *regs; > const char *p = format; > unsigned long arg; > unsigned int i; > @@ -1689,26 +1688,23 @@ unsigned long hypercall_create_continuat > > for ( i = 0; *p != '\0'; i++ ) > mcs->call.args[i] = next_arg(p, args); > - if ( is_pv_32on64_domain(current->domain) ) > - { > - for ( ; i < 6; i++ ) > - mcs->call.args[i] = 0; > - } > } > else > { > - regs = guest_cpu_user_regs(); > - regs->eax = op; > + struct cpu_user_regs *regs = guest_cpu_user_regs(); > + struct vcpu *curr = current; > + > + regs->eax = op; > > /* Ensure the hypercall trap instruction is re-executed. */ > - if ( is_pv_vcpu(current) ) > + if ( is_pv_vcpu(curr) ) > regs->eip -= 2; /* re-execute 'syscall' / 'int $xx' */ > else > - current->arch.hvm_vcpu.hcall_preempted = 1; > + curr->arch.hvm_vcpu.hcall_preempted = 1; > > - if ( is_pv_vcpu(current) ? > - !is_pv_32on64_vcpu(current) : > - (hvm_guest_x86_mode(current) == 8) ) > + if ( is_pv_vcpu(curr) ? > + !is_pv_32on64_vcpu(curr) : > + curr->arch.hvm_vcpu.hcall_64bit ) > { > for ( i = 0; *p != '\0'; i++ ) > { > @@ -1759,9 +1755,8 @@ int hypercall_xlat_continuation(unsigned > > ASSERT(nr <= ARRAY_SIZE(mcs->call.args)); > ASSERT(!(mask >> nr)); > - > - BUG_ON(id && *id >= nr); > - BUG_ON(id && (mask & (1U << *id))); > + ASSERT(!id || *id < nr); > + ASSERT(!id || !(mask & (1U << *id))); > > va_start(args, mask); > > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel [-- Attachment #1.2: Type: text/html, Size: 3901 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 1/3] x86: streamline hypercall_create_continuation() 2015-01-08 16:01 ` Andrew Cooper @ 2015-01-08 16:11 ` Jan Beulich 0 siblings, 0 replies; 16+ messages in thread From: Jan Beulich @ 2015-01-08 16:11 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, Keir Fraser >>> On 08.01.15 at 17:01, <andrew.cooper3@citrix.com> wrote: > There appear to other many other places which could benifit from a > caching of guest_x86_mode() (especially in the nested virt case). Is it > worth considering unconditionally calculating on vmexit and removing the > function? If it's _only_ nested, then maybe not, But if there are at least a couple of uses in "normal" code, then maybe yes. Jan ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 2/3] x86/HVM: clobber hypercall arguments just like for PV 2015-01-08 15:15 [PATCH 0/3] x86: XSA-111 follow-ups Jan Beulich 2015-01-08 15:22 ` [PATCH 1/3] x86: streamline hypercall_create_continuation() Jan Beulich @ 2015-01-08 15:22 ` Jan Beulich 2015-01-08 17:20 ` Andrew Cooper 2015-01-08 15:23 ` [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features Jan Beulich 2015-01-08 15:53 ` [PATCH 0/3] x86: XSA-111 follow-ups Tim Deegan 3 siblings, 1 reply; 16+ messages in thread From: Jan Beulich @ 2015-01-08 15:22 UTC (permalink / raw) To: xen-devel; +Cc: Keir Fraser [-- Attachment #1: Type: text/plain, Size: 5286 bytes --] Unused arguments get clobbered before the call (not affecting caller visible state), while used arguments get clobbered afterwards unless a continuation is needed (affecting caller visible state). Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -4818,6 +4818,8 @@ static hvm_hypercall_t *const pvh_hyperc [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation }; +extern const uint8_t hypercall_args_table[], compat_hypercall_args_table[]; + int hvm_do_hypercall(struct cpu_user_regs *regs) { struct vcpu *curr = current; @@ -4856,36 +4858,95 @@ int hvm_do_hypercall(struct cpu_user_reg if ( mode == 8 ) { + unsigned long rdi = regs->rdi; + unsigned long rsi = regs->rsi; + unsigned long rdx = regs->rdx; + unsigned long r10 = regs->r10; + unsigned long r8 = regs->r8; + unsigned long r9 = regs->r9; + HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u(%lx, %lx, %lx, %lx, %lx, %lx)", - eax, regs->rdi, regs->rsi, regs->rdx, - regs->r10, regs->r8, regs->r9); + eax, rdi, rsi, rdx, r10, r8, r9); + +#ifndef NDEBUG + /* Deliberately corrupt parameter regs not used by this hypercall. */ + switch ( hypercall_args_table[eax] ) + { + case 0: rdi = 0xdeadbeefdeadf00dUL; + case 1: rsi = 0xdeadbeefdeadf00dUL; + case 2: rdx = 0xdeadbeefdeadf00dUL; + case 3: r10 = 0xdeadbeefdeadf00dUL; + case 4: r8 = 0xdeadbeefdeadf00dUL; + case 5: r9 = 0xdeadbeefdeadf00dUL; + } +#endif curr->arch.hvm_vcpu.hcall_64bit = 1; - if ( is_pvh_vcpu(curr) ) - regs->rax = pvh_hypercall64_table[eax](regs->rdi, regs->rsi, - regs->rdx, regs->r10, - regs->r8, regs->r9); - else - regs->rax = hvm_hypercall64_table[eax](regs->rdi, regs->rsi, - regs->rdx, regs->r10, - regs->r8, regs->r9); + regs->rax = (is_pvh_vcpu(curr) + ? pvh_hypercall64_table + : hvm_hypercall64_table)[eax](rdi, rsi, rdx, r10, r8, r9); curr->arch.hvm_vcpu.hcall_64bit = 0; + +#ifndef NDEBUG + if ( !curr->arch.hvm_vcpu.hcall_preempted ) + { + /* Deliberately corrupt parameter regs used by this hypercall. */ + switch ( hypercall_args_table[eax] ) + { + case 6: regs->r9 = 0xdeadbeefdeadf00dUL; + case 5: regs->r8 = 0xdeadbeefdeadf00dUL; + case 4: regs->r10 = 0xdeadbeefdeadf00dUL; + case 3: regs->edx = 0xdeadbeefdeadf00dUL; + case 2: regs->esi = 0xdeadbeefdeadf00dUL; + case 1: regs->edi = 0xdeadbeefdeadf00dUL; + } + } +#endif } else if ( unlikely(is_pvh_vcpu(curr)) ) regs->_eax = -ENOSYS; /* PVH 32bitfixme. */ else { + unsigned int ebx = regs->_ebx; + unsigned int ecx = regs->_ecx; + unsigned int edx = regs->_edx; + unsigned int esi = regs->_esi; + unsigned int edi = regs->_edi; + unsigned int ebp = regs->_ebp; + HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u(%x, %x, %x, %x, %x, %x)", eax, - (uint32_t)regs->ebx, (uint32_t)regs->ecx, - (uint32_t)regs->edx, (uint32_t)regs->esi, - (uint32_t)regs->edi, (uint32_t)regs->ebp); - - regs->eax = hvm_hypercall32_table[eax]((uint32_t)regs->ebx, - (uint32_t)regs->ecx, - (uint32_t)regs->edx, - (uint32_t)regs->esi, - (uint32_t)regs->edi, - (uint32_t)regs->ebp); + ebx, ecx, edx, esi, edi, ebp); + +#ifndef NDEBUG + /* Deliberately corrupt parameter regs not used by this hypercall. */ + switch ( compat_hypercall_args_table[eax] ) + { + case 0: ebx = 0xdeadf00d; + case 1: ecx = 0xdeadf00d; + case 2: edx = 0xdeadf00d; + case 3: esi = 0xdeadf00d; + case 4: edi = 0xdeadf00d; + case 5: ebp = 0xdeadf00d; + } +#endif + + regs->_eax = hvm_hypercall32_table[eax](ebx, ecx, edx, esi, edi, ebp); + +#ifndef NDEBUG + if ( !curr->arch.hvm_vcpu.hcall_preempted ) + { + /* Deliberately corrupt parameter regs used by this hypercall. */ + switch ( compat_hypercall_args_table[eax] ) + { + case 6: regs->ebp = 0xdeadf00d; + case 5: regs->edi = 0xdeadf00d; + case 4: regs->esi = 0xdeadf00d; + case 3: regs->edx = 0xdeadf00d; + case 2: regs->ecx = 0xdeadf00d; + case 1: regs->ebx = 0xdeadf00d; + } + } +#endif } HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u -> %lx", [-- Attachment #2: x86-HVM-hcall-clobber-args.patch --] [-- Type: text/plain, Size: 5339 bytes --] x86/HVM: clobber hypercall arguments just like for PV Unused arguments get clobbered before the call (not affecting caller visible state), while used arguments get clobbered afterwards unless a continuation is needed (affecting caller visible state). Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -4818,6 +4818,8 @@ static hvm_hypercall_t *const pvh_hyperc [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation }; +extern const uint8_t hypercall_args_table[], compat_hypercall_args_table[]; + int hvm_do_hypercall(struct cpu_user_regs *regs) { struct vcpu *curr = current; @@ -4856,36 +4858,95 @@ int hvm_do_hypercall(struct cpu_user_reg if ( mode == 8 ) { + unsigned long rdi = regs->rdi; + unsigned long rsi = regs->rsi; + unsigned long rdx = regs->rdx; + unsigned long r10 = regs->r10; + unsigned long r8 = regs->r8; + unsigned long r9 = regs->r9; + HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u(%lx, %lx, %lx, %lx, %lx, %lx)", - eax, regs->rdi, regs->rsi, regs->rdx, - regs->r10, regs->r8, regs->r9); + eax, rdi, rsi, rdx, r10, r8, r9); + +#ifndef NDEBUG + /* Deliberately corrupt parameter regs not used by this hypercall. */ + switch ( hypercall_args_table[eax] ) + { + case 0: rdi = 0xdeadbeefdeadf00dUL; + case 1: rsi = 0xdeadbeefdeadf00dUL; + case 2: rdx = 0xdeadbeefdeadf00dUL; + case 3: r10 = 0xdeadbeefdeadf00dUL; + case 4: r8 = 0xdeadbeefdeadf00dUL; + case 5: r9 = 0xdeadbeefdeadf00dUL; + } +#endif curr->arch.hvm_vcpu.hcall_64bit = 1; - if ( is_pvh_vcpu(curr) ) - regs->rax = pvh_hypercall64_table[eax](regs->rdi, regs->rsi, - regs->rdx, regs->r10, - regs->r8, regs->r9); - else - regs->rax = hvm_hypercall64_table[eax](regs->rdi, regs->rsi, - regs->rdx, regs->r10, - regs->r8, regs->r9); + regs->rax = (is_pvh_vcpu(curr) + ? pvh_hypercall64_table + : hvm_hypercall64_table)[eax](rdi, rsi, rdx, r10, r8, r9); curr->arch.hvm_vcpu.hcall_64bit = 0; + +#ifndef NDEBUG + if ( !curr->arch.hvm_vcpu.hcall_preempted ) + { + /* Deliberately corrupt parameter regs used by this hypercall. */ + switch ( hypercall_args_table[eax] ) + { + case 6: regs->r9 = 0xdeadbeefdeadf00dUL; + case 5: regs->r8 = 0xdeadbeefdeadf00dUL; + case 4: regs->r10 = 0xdeadbeefdeadf00dUL; + case 3: regs->edx = 0xdeadbeefdeadf00dUL; + case 2: regs->esi = 0xdeadbeefdeadf00dUL; + case 1: regs->edi = 0xdeadbeefdeadf00dUL; + } + } +#endif } else if ( unlikely(is_pvh_vcpu(curr)) ) regs->_eax = -ENOSYS; /* PVH 32bitfixme. */ else { + unsigned int ebx = regs->_ebx; + unsigned int ecx = regs->_ecx; + unsigned int edx = regs->_edx; + unsigned int esi = regs->_esi; + unsigned int edi = regs->_edi; + unsigned int ebp = regs->_ebp; + HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u(%x, %x, %x, %x, %x, %x)", eax, - (uint32_t)regs->ebx, (uint32_t)regs->ecx, - (uint32_t)regs->edx, (uint32_t)regs->esi, - (uint32_t)regs->edi, (uint32_t)regs->ebp); - - regs->eax = hvm_hypercall32_table[eax]((uint32_t)regs->ebx, - (uint32_t)regs->ecx, - (uint32_t)regs->edx, - (uint32_t)regs->esi, - (uint32_t)regs->edi, - (uint32_t)regs->ebp); + ebx, ecx, edx, esi, edi, ebp); + +#ifndef NDEBUG + /* Deliberately corrupt parameter regs not used by this hypercall. */ + switch ( compat_hypercall_args_table[eax] ) + { + case 0: ebx = 0xdeadf00d; + case 1: ecx = 0xdeadf00d; + case 2: edx = 0xdeadf00d; + case 3: esi = 0xdeadf00d; + case 4: edi = 0xdeadf00d; + case 5: ebp = 0xdeadf00d; + } +#endif + + regs->_eax = hvm_hypercall32_table[eax](ebx, ecx, edx, esi, edi, ebp); + +#ifndef NDEBUG + if ( !curr->arch.hvm_vcpu.hcall_preempted ) + { + /* Deliberately corrupt parameter regs used by this hypercall. */ + switch ( compat_hypercall_args_table[eax] ) + { + case 6: regs->ebp = 0xdeadf00d; + case 5: regs->edi = 0xdeadf00d; + case 4: regs->esi = 0xdeadf00d; + case 3: regs->edx = 0xdeadf00d; + case 2: regs->ecx = 0xdeadf00d; + case 1: regs->ebx = 0xdeadf00d; + } + } +#endif } HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u -> %lx", [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 2/3] x86/HVM: clobber hypercall arguments just like for PV 2015-01-08 15:22 ` [PATCH 2/3] x86/HVM: clobber hypercall arguments just like for PV Jan Beulich @ 2015-01-08 17:20 ` Andrew Cooper 0 siblings, 0 replies; 16+ messages in thread From: Andrew Cooper @ 2015-01-08 17:20 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Keir Fraser [-- Attachment #1.1: Type: text/plain, Size: 5997 bytes --] On 08/01/15 15:22, Jan Beulich wrote: > Unused arguments get clobbered before the call (not affecting caller > visible state), while used arguments get clobbered afterwards unless > a continuation is needed (affecting caller visible state). > > Signed-off-by: Jan Beulich <jbeulich@suse.com> After a long time pouring over the Microsoft register calling conventions documentation, and the Windows PV driver code, I am now convinced that they are performing appropriate parameter saving. Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> > > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -4818,6 +4818,8 @@ static hvm_hypercall_t *const pvh_hyperc > [ __HYPERVISOR_arch_1 ] = (hvm_hypercall_t *)paging_domctl_continuation > }; > > +extern const uint8_t hypercall_args_table[], compat_hypercall_args_table[]; > + > int hvm_do_hypercall(struct cpu_user_regs *regs) > { > struct vcpu *curr = current; > @@ -4856,36 +4858,95 @@ int hvm_do_hypercall(struct cpu_user_reg > > if ( mode == 8 ) > { > + unsigned long rdi = regs->rdi; > + unsigned long rsi = regs->rsi; > + unsigned long rdx = regs->rdx; > + unsigned long r10 = regs->r10; > + unsigned long r8 = regs->r8; > + unsigned long r9 = regs->r9; > + > HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u(%lx, %lx, %lx, %lx, %lx, %lx)", > - eax, regs->rdi, regs->rsi, regs->rdx, > - regs->r10, regs->r8, regs->r9); > + eax, rdi, rsi, rdx, r10, r8, r9); > + > +#ifndef NDEBUG > + /* Deliberately corrupt parameter regs not used by this hypercall. */ > + switch ( hypercall_args_table[eax] ) > + { > + case 0: rdi = 0xdeadbeefdeadf00dUL; > + case 1: rsi = 0xdeadbeefdeadf00dUL; > + case 2: rdx = 0xdeadbeefdeadf00dUL; > + case 3: r10 = 0xdeadbeefdeadf00dUL; > + case 4: r8 = 0xdeadbeefdeadf00dUL; > + case 5: r9 = 0xdeadbeefdeadf00dUL; > + } > +#endif > > curr->arch.hvm_vcpu.hcall_64bit = 1; > - if ( is_pvh_vcpu(curr) ) > - regs->rax = pvh_hypercall64_table[eax](regs->rdi, regs->rsi, > - regs->rdx, regs->r10, > - regs->r8, regs->r9); > - else > - regs->rax = hvm_hypercall64_table[eax](regs->rdi, regs->rsi, > - regs->rdx, regs->r10, > - regs->r8, regs->r9); > + regs->rax = (is_pvh_vcpu(curr) > + ? pvh_hypercall64_table > + : hvm_hypercall64_table)[eax](rdi, rsi, rdx, r10, r8, r9); > curr->arch.hvm_vcpu.hcall_64bit = 0; > + > +#ifndef NDEBUG > + if ( !curr->arch.hvm_vcpu.hcall_preempted ) > + { > + /* Deliberately corrupt parameter regs used by this hypercall. */ > + switch ( hypercall_args_table[eax] ) > + { > + case 6: regs->r9 = 0xdeadbeefdeadf00dUL; > + case 5: regs->r8 = 0xdeadbeefdeadf00dUL; > + case 4: regs->r10 = 0xdeadbeefdeadf00dUL; > + case 3: regs->edx = 0xdeadbeefdeadf00dUL; > + case 2: regs->esi = 0xdeadbeefdeadf00dUL; > + case 1: regs->edi = 0xdeadbeefdeadf00dUL; > + } > + } > +#endif > } > else if ( unlikely(is_pvh_vcpu(curr)) ) > regs->_eax = -ENOSYS; /* PVH 32bitfixme. */ > else > { > + unsigned int ebx = regs->_ebx; > + unsigned int ecx = regs->_ecx; > + unsigned int edx = regs->_edx; > + unsigned int esi = regs->_esi; > + unsigned int edi = regs->_edi; > + unsigned int ebp = regs->_ebp; > + > HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u(%x, %x, %x, %x, %x, %x)", eax, > - (uint32_t)regs->ebx, (uint32_t)regs->ecx, > - (uint32_t)regs->edx, (uint32_t)regs->esi, > - (uint32_t)regs->edi, (uint32_t)regs->ebp); > - > - regs->eax = hvm_hypercall32_table[eax]((uint32_t)regs->ebx, > - (uint32_t)regs->ecx, > - (uint32_t)regs->edx, > - (uint32_t)regs->esi, > - (uint32_t)regs->edi, > - (uint32_t)regs->ebp); > + ebx, ecx, edx, esi, edi, ebp); > + > +#ifndef NDEBUG > + /* Deliberately corrupt parameter regs not used by this hypercall. */ > + switch ( compat_hypercall_args_table[eax] ) > + { > + case 0: ebx = 0xdeadf00d; > + case 1: ecx = 0xdeadf00d; > + case 2: edx = 0xdeadf00d; > + case 3: esi = 0xdeadf00d; > + case 4: edi = 0xdeadf00d; > + case 5: ebp = 0xdeadf00d; > + } > +#endif > + > + regs->_eax = hvm_hypercall32_table[eax](ebx, ecx, edx, esi, edi, ebp); > + > +#ifndef NDEBUG > + if ( !curr->arch.hvm_vcpu.hcall_preempted ) > + { > + /* Deliberately corrupt parameter regs used by this hypercall. */ > + switch ( compat_hypercall_args_table[eax] ) > + { > + case 6: regs->ebp = 0xdeadf00d; > + case 5: regs->edi = 0xdeadf00d; > + case 4: regs->esi = 0xdeadf00d; > + case 3: regs->edx = 0xdeadf00d; > + case 2: regs->ecx = 0xdeadf00d; > + case 1: regs->ebx = 0xdeadf00d; > + } > + } > +#endif > } > > HVM_DBG_LOG(DBG_LEVEL_HCALL, "hcall%u -> %lx", > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel [-- Attachment #1.2: Type: text/html, Size: 6727 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-08 15:15 [PATCH 0/3] x86: XSA-111 follow-ups Jan Beulich 2015-01-08 15:22 ` [PATCH 1/3] x86: streamline hypercall_create_continuation() Jan Beulich 2015-01-08 15:22 ` [PATCH 2/3] x86/HVM: clobber hypercall arguments just like for PV Jan Beulich @ 2015-01-08 15:23 ` Jan Beulich 2015-01-08 15:56 ` Tim Deegan 2015-01-08 18:49 ` Andrew Cooper 2015-01-08 15:53 ` [PATCH 0/3] x86: XSA-111 follow-ups Tim Deegan 3 siblings, 2 replies; 16+ messages in thread From: Jan Beulich @ 2015-01-08 15:23 UTC (permalink / raw) To: xen-devel; +Cc: Keir Fraser [-- Attachment #1: Type: text/plain, Size: 3632 bytes --] Following the earlier similar change validating CR4 modifications. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1672,20 +1672,53 @@ static int hvm_save_cpu_ctxt(struct doma return 0; } -static bool_t hvm_efer_valid(struct domain *d, - uint64_t value, uint64_t efer_validbits) +static bool_t hvm_efer_valid(const struct vcpu *v, uint64_t value, + bool_t restore) { - if ( nestedhvm_enabled(d) && cpu_has_svm ) - efer_validbits |= EFER_SVME; + unsigned int ext1_ecx = 0, ext1_edx = 0; - return !((value & ~efer_validbits) || - ((sizeof(long) != 8) && (value & EFER_LME)) || - (!cpu_has_svm && (value & EFER_SVME)) || - (!cpu_has_nx && (value & EFER_NX)) || - (!cpu_has_syscall && (value & EFER_SCE)) || - (!cpu_has_lmsl && (value & EFER_LMSLE)) || - (!cpu_has_ffxsr && (value & EFER_FFXSE)) || - ((value & (EFER_LME|EFER_LMA)) == EFER_LMA)); + if ( !restore && !is_hardware_domain(v->domain) ) + { + unsigned int level; + + ASSERT(v == current); + hvm_cpuid(0x80000000, &level, NULL, NULL, NULL); + if ( level >= 0x80000001 ) + hvm_cpuid(0x80000001, NULL, NULL, &ext1_ecx, &ext1_edx); + } + else + { + ext1_edx = boot_cpu_data.x86_capability[X86_FEATURE_LM / 32]; + ext1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_SVM / 32]; + } + + if ( (value & EFER_SCE) && + !(ext1_edx & cpufeat_mask(X86_FEATURE_SYSCALL)) ) + return 0; + + if ( (value & (EFER_LME | EFER_LMA)) && + !(ext1_edx & cpufeat_mask(X86_FEATURE_LM)) ) + return 0; + + if ( (value & EFER_LMA) && !(value & EFER_LME) ) + return 0; + + if ( (value & EFER_NX) && !(ext1_edx & cpufeat_mask(X86_FEATURE_NX)) ) + return 0; + + if ( (value & EFER_SVME) && + (!(ext1_ecx & cpufeat_mask(X86_FEATURE_SVM)) || + !nestedhvm_enabled(v->domain)) ) + return 0; + + if ( (value & EFER_LMSLE) && !cpu_has_lmsl ) + return 0; + + if ( (value & EFER_FFXSE) && + !(ext1_edx & cpufeat_mask(X86_FEATURE_FFXSR)) ) + return 0; + + return 1; } /* These reserved bits in lower 32 remain 0 after any load of CR0 */ @@ -1763,7 +1796,6 @@ static int hvm_load_cpu_ctxt(struct doma struct vcpu *v; struct hvm_hw_cpu ctxt; struct segment_register seg; - uint64_t efer_validbits; /* Which vcpu is this? */ vcpuid = hvm_load_instance(h); @@ -1794,9 +1826,7 @@ static int hvm_load_cpu_ctxt(struct doma return -EINVAL; } - efer_validbits = EFER_FFXSE | EFER_LMSLE | EFER_LME | EFER_LMA - | EFER_NX | EFER_SCE; - if ( !hvm_efer_valid(d, ctxt.msr_efer, efer_validbits) ) + if ( !hvm_efer_valid(v, ctxt.msr_efer, 1) ) { printk(XENLOG_G_ERR "HVM%d restore: bad EFER %#" PRIx64 "\n", d->domain_id, ctxt.msr_efer); @@ -2936,12 +2966,10 @@ err: int hvm_set_efer(uint64_t value) { struct vcpu *v = current; - uint64_t efer_validbits; value &= ~EFER_LMA; - efer_validbits = EFER_FFXSE | EFER_LMSLE | EFER_LME | EFER_NX | EFER_SCE; - if ( !hvm_efer_valid(v->domain, value, efer_validbits) ) + if ( !hvm_efer_valid(v, value, 0) ) { gdprintk(XENLOG_WARNING, "Trying to set reserved bit in " "EFER: %#"PRIx64"\n", value); [-- Attachment #2: x86-HVM-refine-EFER-reserved-bits-checks.patch --] [-- Type: text/plain, Size: 3681 bytes --] x86/HVM: make hvm_efer_valid() honor guest features Following the earlier similar change validating CR4 modifications. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -1672,20 +1672,53 @@ static int hvm_save_cpu_ctxt(struct doma return 0; } -static bool_t hvm_efer_valid(struct domain *d, - uint64_t value, uint64_t efer_validbits) +static bool_t hvm_efer_valid(const struct vcpu *v, uint64_t value, + bool_t restore) { - if ( nestedhvm_enabled(d) && cpu_has_svm ) - efer_validbits |= EFER_SVME; + unsigned int ext1_ecx = 0, ext1_edx = 0; - return !((value & ~efer_validbits) || - ((sizeof(long) != 8) && (value & EFER_LME)) || - (!cpu_has_svm && (value & EFER_SVME)) || - (!cpu_has_nx && (value & EFER_NX)) || - (!cpu_has_syscall && (value & EFER_SCE)) || - (!cpu_has_lmsl && (value & EFER_LMSLE)) || - (!cpu_has_ffxsr && (value & EFER_FFXSE)) || - ((value & (EFER_LME|EFER_LMA)) == EFER_LMA)); + if ( !restore && !is_hardware_domain(v->domain) ) + { + unsigned int level; + + ASSERT(v == current); + hvm_cpuid(0x80000000, &level, NULL, NULL, NULL); + if ( level >= 0x80000001 ) + hvm_cpuid(0x80000001, NULL, NULL, &ext1_ecx, &ext1_edx); + } + else + { + ext1_edx = boot_cpu_data.x86_capability[X86_FEATURE_LM / 32]; + ext1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_SVM / 32]; + } + + if ( (value & EFER_SCE) && + !(ext1_edx & cpufeat_mask(X86_FEATURE_SYSCALL)) ) + return 0; + + if ( (value & (EFER_LME | EFER_LMA)) && + !(ext1_edx & cpufeat_mask(X86_FEATURE_LM)) ) + return 0; + + if ( (value & EFER_LMA) && !(value & EFER_LME) ) + return 0; + + if ( (value & EFER_NX) && !(ext1_edx & cpufeat_mask(X86_FEATURE_NX)) ) + return 0; + + if ( (value & EFER_SVME) && + (!(ext1_ecx & cpufeat_mask(X86_FEATURE_SVM)) || + !nestedhvm_enabled(v->domain)) ) + return 0; + + if ( (value & EFER_LMSLE) && !cpu_has_lmsl ) + return 0; + + if ( (value & EFER_FFXSE) && + !(ext1_edx & cpufeat_mask(X86_FEATURE_FFXSR)) ) + return 0; + + return 1; } /* These reserved bits in lower 32 remain 0 after any load of CR0 */ @@ -1763,7 +1796,6 @@ static int hvm_load_cpu_ctxt(struct doma struct vcpu *v; struct hvm_hw_cpu ctxt; struct segment_register seg; - uint64_t efer_validbits; /* Which vcpu is this? */ vcpuid = hvm_load_instance(h); @@ -1794,9 +1826,7 @@ static int hvm_load_cpu_ctxt(struct doma return -EINVAL; } - efer_validbits = EFER_FFXSE | EFER_LMSLE | EFER_LME | EFER_LMA - | EFER_NX | EFER_SCE; - if ( !hvm_efer_valid(d, ctxt.msr_efer, efer_validbits) ) + if ( !hvm_efer_valid(v, ctxt.msr_efer, 1) ) { printk(XENLOG_G_ERR "HVM%d restore: bad EFER %#" PRIx64 "\n", d->domain_id, ctxt.msr_efer); @@ -2936,12 +2966,10 @@ err: int hvm_set_efer(uint64_t value) { struct vcpu *v = current; - uint64_t efer_validbits; value &= ~EFER_LMA; - efer_validbits = EFER_FFXSE | EFER_LMSLE | EFER_LME | EFER_NX | EFER_SCE; - if ( !hvm_efer_valid(v->domain, value, efer_validbits) ) + if ( !hvm_efer_valid(v, value, 0) ) { gdprintk(XENLOG_WARNING, "Trying to set reserved bit in " "EFER: %#"PRIx64"\n", value); [-- Attachment #3: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-08 15:23 ` [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features Jan Beulich @ 2015-01-08 15:56 ` Tim Deegan 2015-01-08 16:04 ` Jan Beulich 2015-01-08 18:49 ` Andrew Cooper 1 sibling, 1 reply; 16+ messages in thread From: Tim Deegan @ 2015-01-08 15:56 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Keir Fraser At 15:23 +0000 on 08 Jan (1420727005), Jan Beulich wrote: > + if ( (value & EFER_LMSLE) && !cpu_has_lmsl ) > + return 0; I see this bit has no CPUID flag, and the docs don't seem to suggest that it would ever not be valid. Are there real CPUs where it can't be set? My reviewed-by: stands anyway as this is an improvement over the current code. Just wondering whether this needs to go on the list of interesting edge cazses for Andrew's feature-levelling rewrite. Cheers, Tim. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-08 15:56 ` Tim Deegan @ 2015-01-08 16:04 ` Jan Beulich 2015-01-08 18:57 ` Andrew Cooper 0 siblings, 1 reply; 16+ messages in thread From: Jan Beulich @ 2015-01-08 16:04 UTC (permalink / raw) To: Tim Deegan; +Cc: xen-devel, Keir Fraser >>> On 08.01.15 at 16:56, <tim@xen.org> wrote: > At 15:23 +0000 on 08 Jan (1420727005), Jan Beulich wrote: >> + if ( (value & EFER_LMSLE) && !cpu_has_lmsl ) >> + return 0; > > I see this bit has no CPUID flag, and the docs don't seem to suggest > that it would ever not be valid. Are there real CPUs where it can't > be set? Yes, I'm afraid AMD introduced this subsequent to SVM, iirc upon VMware request. > My reviewed-by: stands anyway as this is an improvement over the > current code. Just wondering whether this needs to go on the list > of interesting edge cazses for Andrew's feature-levelling rewrite. Yes, dealing with this ought to be interesting. Jan ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-08 16:04 ` Jan Beulich @ 2015-01-08 18:57 ` Andrew Cooper 0 siblings, 0 replies; 16+ messages in thread From: Andrew Cooper @ 2015-01-08 18:57 UTC (permalink / raw) To: Jan Beulich, Tim Deegan; +Cc: xen-devel, Keir Fraser On 08/01/15 16:04, Jan Beulich wrote: >>>> On 08.01.15 at 16:56, <tim@xen.org> wrote: >> At 15:23 +0000 on 08 Jan (1420727005), Jan Beulich wrote: >>> + if ( (value & EFER_LMSLE) && !cpu_has_lmsl ) >>> + return 0; >> I see this bit has no CPUID flag, and the docs don't seem to suggest >> that it would ever not be valid. Are there real CPUs where it can't >> be set? > Yes, I'm afraid AMD introduced this subsequent to SVM, iirc upon > VMware request. > >> My reviewed-by: stands anyway as this is an improvement over the >> current code. Just wondering whether this needs to go on the list >> of interesting edge cazses for Andrew's feature-levelling rewrite. > Yes, dealing with this ought to be interesting. :( I had pessimistically assumed I wasn't going to be able to get away with purely architectural state in the new shiny world. I will take care to accommodate this in the next iteration. ~Andrew ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-08 15:23 ` [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features Jan Beulich 2015-01-08 15:56 ` Tim Deegan @ 2015-01-08 18:49 ` Andrew Cooper 2015-01-09 11:20 ` Jan Beulich 1 sibling, 1 reply; 16+ messages in thread From: Andrew Cooper @ 2015-01-08 18:49 UTC (permalink / raw) To: Jan Beulich, xen-devel; +Cc: Keir Fraser [-- Attachment #1.1: Type: text/plain, Size: 5476 bytes --] On 08/01/15 15:23, Jan Beulich wrote: > Following the earlier similar change validating CR4 modifications. > > Signed-off-by: Jan Beulich <jbeulich@suse.com> > > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -1672,20 +1672,53 @@ static int hvm_save_cpu_ctxt(struct doma > return 0; > } > > -static bool_t hvm_efer_valid(struct domain *d, > - uint64_t value, uint64_t efer_validbits) > +static bool_t hvm_efer_valid(const struct vcpu *v, uint64_t value, > + bool_t restore) > { > - if ( nestedhvm_enabled(d) && cpu_has_svm ) > - efer_validbits |= EFER_SVME; > + unsigned int ext1_ecx = 0, ext1_edx = 0; > > - return !((value & ~efer_validbits) || > - ((sizeof(long) != 8) && (value & EFER_LME)) || > - (!cpu_has_svm && (value & EFER_SVME)) || > - (!cpu_has_nx && (value & EFER_NX)) || > - (!cpu_has_syscall && (value & EFER_SCE)) || > - (!cpu_has_lmsl && (value & EFER_LMSLE)) || > - (!cpu_has_ffxsr && (value & EFER_FFXSE)) || > - ((value & (EFER_LME|EFER_LMA)) == EFER_LMA)); > + if ( !restore && !is_hardware_domain(v->domain) ) > + { > + unsigned int level; > + > + ASSERT(v == current); > + hvm_cpuid(0x80000000, &level, NULL, NULL, NULL); > + if ( level >= 0x80000001 ) > + hvm_cpuid(0x80000001, NULL, NULL, &ext1_ecx, &ext1_edx); > + } > + else > + { > + ext1_edx = boot_cpu_data.x86_capability[X86_FEATURE_LM / 32]; > + ext1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_SVM / 32]; > + } > + > + if ( (value & EFER_SCE) && > + !(ext1_edx & cpufeat_mask(X86_FEATURE_SYSCALL)) ) > + return 0; > + > + if ( (value & (EFER_LME | EFER_LMA)) && > + !(ext1_edx & cpufeat_mask(X86_FEATURE_LM)) ) > + return 0; > + > + if ( (value & EFER_LMA) && !(value & EFER_LME) ) > + return 0; The LME/LMA handling more complicated than this. LMA is read-only on Intel, but specified as read-write on AMD, with the requirement that if it doesn't match the value generated by hardware, a #GP fault will occur. I believe this actually means it is read-only on AMD as well. LMA only gets set by hardware after paging is enabled and the processor switches properly into long mode, which means that there is a window between setting LME and setting CR0.PG where LMA should read as 0. I think hvm_efer_valid() also needs the current EFER and CR0 to work out what the current LMA should be, and reject any attempt to change it. > + > + if ( (value & EFER_NX) && !(ext1_edx & cpufeat_mask(X86_FEATURE_NX)) ) > + return 0; > + > + if ( (value & EFER_SVME) && > + (!(ext1_ecx & cpufeat_mask(X86_FEATURE_SVM)) || > + !nestedhvm_enabled(v->domain)) ) This is going to cause an issue for the restore case, as the HVM PARAMs follow the architectural state. I don't believe it is reasonable for nestedhvm_enabled() to disagree with cpufeat_mask(X86_FEATURE_{SVM,VMX}), or for the toolstack to be able to yank nestedhvm while the VM is active. Looking at the checks when setting HVM_PARAM_NESTEDHVM, neither of the guest feature flags are actually checked before setting up the nested infrastructure. Having said all of this, don't appear to be any migration records associated with nested state (hardware-loaded VMCS/VMCB pointers, currently nested?) which leads me to suspect that a domain actually using nested virt is not going to survive a migration, so it might be acceptable to fudge the checking of SVME for now. ~Andrew > + return 0; > + > + if ( (value & EFER_LMSLE) && !cpu_has_lmsl ) > + return 0; > + > + if ( (value & EFER_FFXSE) && > + !(ext1_edx & cpufeat_mask(X86_FEATURE_FFXSR)) ) > + return 0; > + > + return 1; > } > > /* These reserved bits in lower 32 remain 0 after any load of CR0 */ > @@ -1763,7 +1796,6 @@ static int hvm_load_cpu_ctxt(struct doma > struct vcpu *v; > struct hvm_hw_cpu ctxt; > struct segment_register seg; > - uint64_t efer_validbits; > > /* Which vcpu is this? */ > vcpuid = hvm_load_instance(h); > @@ -1794,9 +1826,7 @@ static int hvm_load_cpu_ctxt(struct doma > return -EINVAL; > } > > - efer_validbits = EFER_FFXSE | EFER_LMSLE | EFER_LME | EFER_LMA > - | EFER_NX | EFER_SCE; > - if ( !hvm_efer_valid(d, ctxt.msr_efer, efer_validbits) ) > + if ( !hvm_efer_valid(v, ctxt.msr_efer, 1) ) > { > printk(XENLOG_G_ERR "HVM%d restore: bad EFER %#" PRIx64 "\n", > d->domain_id, ctxt.msr_efer); > @@ -2936,12 +2966,10 @@ err: > int hvm_set_efer(uint64_t value) > { > struct vcpu *v = current; > - uint64_t efer_validbits; > > value &= ~EFER_LMA; > > - efer_validbits = EFER_FFXSE | EFER_LMSLE | EFER_LME | EFER_NX | EFER_SCE; > - if ( !hvm_efer_valid(v->domain, value, efer_validbits) ) > + if ( !hvm_efer_valid(v, value, 0) ) > { > gdprintk(XENLOG_WARNING, "Trying to set reserved bit in " > "EFER: %#"PRIx64"\n", value); > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel [-- Attachment #1.2: Type: text/html, Size: 6702 bytes --] [-- Attachment #2: Type: text/plain, Size: 126 bytes --] _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-08 18:49 ` Andrew Cooper @ 2015-01-09 11:20 ` Jan Beulich 2015-01-09 15:09 ` Andrew Cooper 0 siblings, 1 reply; 16+ messages in thread From: Jan Beulich @ 2015-01-09 11:20 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, Keir Fraser >>> On 08.01.15 at 19:49, <andrew.cooper3@citrix.com> wrote: > On 08/01/15 15:23, Jan Beulich wrote: >> Following the earlier similar change validating CR4 modifications. >> >> Signed-off-by: Jan Beulich <jbeulich@suse.com> >> >> --- a/xen/arch/x86/hvm/hvm.c >> +++ b/xen/arch/x86/hvm/hvm.c >> @@ -1672,20 +1672,53 @@ static int hvm_save_cpu_ctxt(struct doma >> return 0; >> } >> >> -static bool_t hvm_efer_valid(struct domain *d, >> - uint64_t value, uint64_t efer_validbits) >> +static bool_t hvm_efer_valid(const struct vcpu *v, uint64_t value, >> + bool_t restore) >> { >> - if ( nestedhvm_enabled(d) && cpu_has_svm ) >> - efer_validbits |= EFER_SVME; >> + unsigned int ext1_ecx = 0, ext1_edx = 0; >> >> - return !((value & ~efer_validbits) || >> - ((sizeof(long) != 8) && (value & EFER_LME)) || >> - (!cpu_has_svm && (value & EFER_SVME)) || >> - (!cpu_has_nx && (value & EFER_NX)) || >> - (!cpu_has_syscall && (value & EFER_SCE)) || >> - (!cpu_has_lmsl && (value & EFER_LMSLE)) || >> - (!cpu_has_ffxsr && (value & EFER_FFXSE)) || >> - ((value & (EFER_LME|EFER_LMA)) == EFER_LMA)); >> + if ( !restore && !is_hardware_domain(v->domain) ) >> + { >> + unsigned int level; >> + >> + ASSERT(v == current); >> + hvm_cpuid(0x80000000, &level, NULL, NULL, NULL); >> + if ( level >= 0x80000001 ) >> + hvm_cpuid(0x80000001, NULL, NULL, &ext1_ecx, &ext1_edx); >> + } >> + else >> + { >> + ext1_edx = boot_cpu_data.x86_capability[X86_FEATURE_LM / 32]; >> + ext1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_SVM / 32]; >> + } >> + >> + if ( (value & EFER_SCE) && >> + !(ext1_edx & cpufeat_mask(X86_FEATURE_SYSCALL)) ) >> + return 0; >> + >> + if ( (value & (EFER_LME | EFER_LMA)) && >> + !(ext1_edx & cpufeat_mask(X86_FEATURE_LM)) ) >> + return 0; >> + >> + if ( (value & EFER_LMA) && !(value & EFER_LME) ) >> + return 0; > > The LME/LMA handling more complicated than this. > > LMA is read-only on Intel, but specified as read-write on AMD, with the > requirement that if it doesn't match the value generated by hardware, a > #GP fault will occur. I believe this actually means it is read-only on > AMD as well. > > LMA only gets set by hardware after paging is enabled and the processor > switches properly into long mode, which means that there is a window > between setting LME and setting CR0.PG where LMA should read as 0. Did you note that hvm_set_efer() clears EFER_LMA before calling hvm_efer_valid(), thus avoiding all those issues? > I think hvm_efer_valid() also needs the current EFER and CR0 to work out > what the current LMA should be, and reject any attempt to change it. I.e. this would be needed only for the restore path (and it not being done hasn't caused problems for me because guests don't normally get migrated before they fully enabled long mode). >> + if ( (value & EFER_NX) && !(ext1_edx & cpufeat_mask(X86_FEATURE_NX)) ) >> + return 0; >> + >> + if ( (value & EFER_SVME) && >> + (!(ext1_ecx & cpufeat_mask(X86_FEATURE_SVM)) || >> + !nestedhvm_enabled(v->domain)) ) > > This is going to cause an issue for the restore case, as the HVM PARAMs > follow the architectural state. > > I don't believe it is reasonable for nestedhvm_enabled() to disagree > with cpufeat_mask(X86_FEATURE_{SVM,VMX}), or for the toolstack to be > able to yank nestedhvm while the VM is active. I effectively only translated the pre-existing (and kind of suspicious to me) nestedhvm_enabled() use here. > Looking at the checks when setting HVM_PARAM_NESTEDHVM, neither of the > guest feature flags are actually checked before setting up the nested > infrastructure. Which would be a separate issue. > Having said all of this, don't appear to be any migration records > associated with nested state (hardware-loaded VMCS/VMCB pointers, > currently nested?) which leads me to suspect that a domain actually > using nested virt is not going to survive a migration, so it might be > acceptable to fudge the checking of SVME for now. So am I getting you right that while these are all valid observations, you don't really ask for a change to the patch in this regard (i.e. taking care of above EFER_LMA issue is all that's needed, and even that only for the unusual case of a guest getting migrated in the middle of it enabling long mode on one of its CPUs)? Jan ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-09 11:20 ` Jan Beulich @ 2015-01-09 15:09 ` Andrew Cooper 2015-01-09 15:33 ` Jan Beulich 0 siblings, 1 reply; 16+ messages in thread From: Andrew Cooper @ 2015-01-09 15:09 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Keir Fraser On 09/01/15 11:20, Jan Beulich wrote: >>>> On 08.01.15 at 19:49, <andrew.cooper3@citrix.com> wrote: >> On 08/01/15 15:23, Jan Beulich wrote: >>> Following the earlier similar change validating CR4 modifications. >>> >>> Signed-off-by: Jan Beulich <jbeulich@suse.com> >>> >>> --- a/xen/arch/x86/hvm/hvm.c >>> +++ b/xen/arch/x86/hvm/hvm.c >>> @@ -1672,20 +1672,53 @@ static int hvm_save_cpu_ctxt(struct doma >>> return 0; >>> } >>> >>> -static bool_t hvm_efer_valid(struct domain *d, >>> - uint64_t value, uint64_t efer_validbits) >>> +static bool_t hvm_efer_valid(const struct vcpu *v, uint64_t value, >>> + bool_t restore) >>> { >>> - if ( nestedhvm_enabled(d) && cpu_has_svm ) >>> - efer_validbits |= EFER_SVME; >>> + unsigned int ext1_ecx = 0, ext1_edx = 0; >>> >>> - return !((value & ~efer_validbits) || >>> - ((sizeof(long) != 8) && (value & EFER_LME)) || >>> - (!cpu_has_svm && (value & EFER_SVME)) || >>> - (!cpu_has_nx && (value & EFER_NX)) || >>> - (!cpu_has_syscall && (value & EFER_SCE)) || >>> - (!cpu_has_lmsl && (value & EFER_LMSLE)) || >>> - (!cpu_has_ffxsr && (value & EFER_FFXSE)) || >>> - ((value & (EFER_LME|EFER_LMA)) == EFER_LMA)); >>> + if ( !restore && !is_hardware_domain(v->domain) ) >>> + { >>> + unsigned int level; >>> + >>> + ASSERT(v == current); >>> + hvm_cpuid(0x80000000, &level, NULL, NULL, NULL); >>> + if ( level >= 0x80000001 ) >>> + hvm_cpuid(0x80000001, NULL, NULL, &ext1_ecx, &ext1_edx); >>> + } >>> + else >>> + { >>> + ext1_edx = boot_cpu_data.x86_capability[X86_FEATURE_LM / 32]; >>> + ext1_ecx = boot_cpu_data.x86_capability[X86_FEATURE_SVM / 32]; >>> + } >>> + >>> + if ( (value & EFER_SCE) && >>> + !(ext1_edx & cpufeat_mask(X86_FEATURE_SYSCALL)) ) >>> + return 0; >>> + >>> + if ( (value & (EFER_LME | EFER_LMA)) && >>> + !(ext1_edx & cpufeat_mask(X86_FEATURE_LM)) ) >>> + return 0; >>> + >>> + if ( (value & EFER_LMA) && !(value & EFER_LME) ) >>> + return 0; >> The LME/LMA handling more complicated than this. >> >> LMA is read-only on Intel, but specified as read-write on AMD, with the >> requirement that if it doesn't match the value generated by hardware, a >> #GP fault will occur. I believe this actually means it is read-only on >> AMD as well. >> >> LMA only gets set by hardware after paging is enabled and the processor >> switches properly into long mode, which means that there is a window >> between setting LME and setting CR0.PG where LMA should read as 0. > Did you note that hvm_set_efer() clears EFER_LMA before calling > hvm_efer_valid(), thus avoiding all those issues? I failed to appreciate its intent. > >> I think hvm_efer_valid() also needs the current EFER and CR0 to work out >> what the current LMA should be, and reject any attempt to change it. > I.e. this would be needed only for the restore path (and it not being > done hasn't caused problems for me because guests don't normally > get migrated before they fully enabled long mode). Urgh. In the restore case we don't have a current EFER to use. In this case I think it is acceptable to check that new_lma is (new_lme && !cr4.pg), which allows us to declare that the restore state is consistent (presuming we have already validated cr4). For the non-restore case, the caller takes care of removing LMA from consideration. > >>> + if ( (value & EFER_NX) && !(ext1_edx & cpufeat_mask(X86_FEATURE_NX)) ) >>> + return 0; >>> + >>> + if ( (value & EFER_SVME) && >>> + (!(ext1_ecx & cpufeat_mask(X86_FEATURE_SVM)) || >>> + !nestedhvm_enabled(v->domain)) ) >> This is going to cause an issue for the restore case, as the HVM PARAMs >> follow the architectural state. >> >> I don't believe it is reasonable for nestedhvm_enabled() to disagree >> with cpufeat_mask(X86_FEATURE_{SVM,VMX}), or for the toolstack to be >> able to yank nestedhvm while the VM is active. > I effectively only translated the pre-existing (and kind of suspicious > to me) nestedhvm_enabled() use here. So you did. (My review focused a bit too much along the lines of "does this new function match what the manuals state") > >> Looking at the checks when setting HVM_PARAM_NESTEDHVM, neither of the >> guest feature flags are actually checked before setting up the nested >> infrastructure. > Which would be a separate issue. It is indeed. I apologize for rambling slightly - it was intended purely as an observation. > >> Having said all of this, don't appear to be any migration records >> associated with nested state (hardware-loaded VMCS/VMCB pointers, >> currently nested?) which leads me to suspect that a domain actually >> using nested virt is not going to survive a migration, so it might be >> acceptable to fudge the checking of SVME for now. > So am I getting you right that while these are all valid observations, > you don't really ask for a change to the patch in this regard (i.e. > taking care of above EFER_LMA issue is all that's needed, and even > that only for the unusual case of a guest getting migrated in the > middle of it enabling long mode on one of its CPUs)? As the nested checks are the same as before, I am happy with them staying as they are, and leaving work to whomever sees about fixing the NESTEDHVM semantics. ~Andrew ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-09 15:09 ` Andrew Cooper @ 2015-01-09 15:33 ` Jan Beulich 2015-01-09 15:36 ` Andrew Cooper 0 siblings, 1 reply; 16+ messages in thread From: Jan Beulich @ 2015-01-09 15:33 UTC (permalink / raw) To: Andrew Cooper; +Cc: xen-devel, Keir Fraser >>> On 09.01.15 at 16:09, <andrew.cooper3@citrix.com> wrote: > On 09/01/15 11:20, Jan Beulich wrote: >>>>> On 08.01.15 at 19:49, <andrew.cooper3@citrix.com> wrote: >>> I think hvm_efer_valid() also needs the current EFER and CR0 to work out >>> what the current LMA should be, and reject any attempt to change it. >> I.e. this would be needed only for the restore path (and it not being >> done hasn't caused problems for me because guests don't normally >> get migrated before they fully enabled long mode). > > Urgh. In the restore case we don't have a current EFER to use. In this > case I think it is acceptable to check that new_lma is (new_lme && > !cr4.pg), which allows us to declare that the restore state is > consistent (presuming we have already validated cr4). I suppose you mean cr0 here, and also new_lma == new_lme && cr0.pg? Jan ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features 2015-01-09 15:33 ` Jan Beulich @ 2015-01-09 15:36 ` Andrew Cooper 0 siblings, 0 replies; 16+ messages in thread From: Andrew Cooper @ 2015-01-09 15:36 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Keir Fraser On 09/01/15 15:33, Jan Beulich wrote: >>>> On 09.01.15 at 16:09, <andrew.cooper3@citrix.com> wrote: >> On 09/01/15 11:20, Jan Beulich wrote: >>>>>> On 08.01.15 at 19:49, <andrew.cooper3@citrix.com> wrote: >>>> I think hvm_efer_valid() also needs the current EFER and CR0 to work out >>>> what the current LMA should be, and reject any attempt to change it. >>> I.e. this would be needed only for the restore path (and it not being >>> done hasn't caused problems for me because guests don't normally >>> get migrated before they fully enabled long mode). >> Urgh. In the restore case we don't have a current EFER to use. In this >> case I think it is acceptable to check that new_lma is (new_lme && >> !cr4.pg), which allows us to declare that the restore state is >> consistent (presuming we have already validated cr4). > I suppose you mean cr0 here, and also new_lma == new_lme && cr0.pg? Oops yes - I do on both counts. ~Andrew ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH 0/3] x86: XSA-111 follow-ups 2015-01-08 15:15 [PATCH 0/3] x86: XSA-111 follow-ups Jan Beulich ` (2 preceding siblings ...) 2015-01-08 15:23 ` [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features Jan Beulich @ 2015-01-08 15:53 ` Tim Deegan 3 siblings, 0 replies; 16+ messages in thread From: Tim Deegan @ 2015-01-08 15:53 UTC (permalink / raw) To: Jan Beulich; +Cc: xen-devel, Keir Fraser At 15:15 +0000 on 08 Jan (1420726540), Jan Beulich wrote: > 1: streamline hypercall_create_continuation() > 2: HVM: clobber hypercall arguments just like for PV > 3: HVM: make hvm_efer_valid() honor guest features > > Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2015-01-09 15:36 UTC | newest] Thread overview: 16+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2015-01-08 15:15 [PATCH 0/3] x86: XSA-111 follow-ups Jan Beulich 2015-01-08 15:22 ` [PATCH 1/3] x86: streamline hypercall_create_continuation() Jan Beulich 2015-01-08 16:01 ` Andrew Cooper 2015-01-08 16:11 ` Jan Beulich 2015-01-08 15:22 ` [PATCH 2/3] x86/HVM: clobber hypercall arguments just like for PV Jan Beulich 2015-01-08 17:20 ` Andrew Cooper 2015-01-08 15:23 ` [PATCH 3/3] x86/HVM: make hvm_efer_valid() honor guest features Jan Beulich 2015-01-08 15:56 ` Tim Deegan 2015-01-08 16:04 ` Jan Beulich 2015-01-08 18:57 ` Andrew Cooper 2015-01-08 18:49 ` Andrew Cooper 2015-01-09 11:20 ` Jan Beulich 2015-01-09 15:09 ` Andrew Cooper 2015-01-09 15:33 ` Jan Beulich 2015-01-09 15:36 ` Andrew Cooper 2015-01-08 15:53 ` [PATCH 0/3] x86: XSA-111 follow-ups Tim Deegan
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.