From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH v4 04/16] xen: Add is_vmware_port_enabled Date: Thu, 18 Sep 2014 18:53:45 -0400 Message-ID: <541B6279.2070104@oracle.com> References: <1410460610-14759-1-git-send-email-dslutz@verizon.com> <1410460610-14759-5-git-send-email-dslutz@verizon.com> <5412F059.10003@oracle.com> <5419AF39.70403@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: "Slutz, Donald Christopher" Cc: Kevin Tian , Keir Fraser , Ian Campbell , Stefano Stabellini , George Dunlap , Eddie Dong , Tim Deegan , "xen-devel@lists.xen.org" , Jan Beulich , Aravind Gopalakrishnan , Jun Nakajima , Andrew Cooper , Suravee Suthikulpanit , Ian Jackson List-Id: xen-devel@lists.xenproject.org On 09/17/2014 02:23 PM, Slutz, Donald Christopher wrote: > On 09/17/14 11:56, Boris Ostrovsky wrote: >> On 09/16/2014 08:08 AM, Slutz, Donald Christopher wrote: >>> On 09/12/14 09:08, Boris Ostrovsky wrote: >>>> On 09/11/2014 02:36 PM, Don Slutz wrote: >>>>> int __get_instruction_length_from_list(struct vcpu *v, >>>>> - const enum instruction_index *list, unsigned int list_count) >>>>> + const enum instruction_index >>>>> *list, >>>>> + unsigned int list_count, >>>>> + bool_t err_rpt) >>>>> { >>>>> struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb; >>>>> unsigned int i, j, inst_len = 0; >>>>> @@ -211,10 +222,13 @@ int __get_instruction_length_from_list(struct >>>>> vcpu *v, >>>>> mismatch: ; >>>>> } >>>>> - gdprintk(XENLOG_WARNING, >>>>> - "%s: Mismatch between expected and actual instruction >>>>> bytes: " >>>>> - "eip = %lx\n", __func__, (unsigned long)vmcb->rip); >>>>> - hvm_inject_hw_exception(TRAP_gp_fault, 0); >>>>> + if ( err_rpt ) >>>>> + { >>>>> + gdprintk(XENLOG_WARNING, >>>>> + "%s: Mismatch between expected and actual >>>>> instruction bytes: " >>>>> + "eip = %lx\n", __func__, (unsigned long)vmcb->rip); >>>>> + hvm_inject_hw_exception(TRAP_gp_fault, 0); >>>>> + } >>>>> return 0; >>>>> done: >>>>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c >>>>> index b5188e6..9e14d2a 100644 >>>>> --- a/xen/arch/x86/hvm/svm/svm.c >>>>> +++ b/xen/arch/x86/hvm/svm/svm.c >>>>> @@ -59,6 +59,7 @@ >>>>> #include >>>>> #include >>>>> #include >>>>> +#include >>>>> #include >>>>> #include >>>>> #include >>>>> @@ -2065,6 +2066,38 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb, >>>>> return; >>>>> } >>>>> +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs, >>>>> + struct vcpu *v) >>>>> +{ >>>>> + struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb; >>>>> + unsigned long inst_len; >>>>> + unsigned long inst_addr = svm_rip2pointer(v); >>>>> + int rc; >>>>> + static const enum instruction_index list[] = { >>>>> + INSTR_INL_DX, INSTR_INB_DX, INSTR_OUTL_DX, INSTR_OUTB_DX >>>>> + }; >>>>> + >>>>> + inst_len = __get_instruction_length_from_list( >>>>> + v, list, ARRAY_SIZE(list), 0); >>>> I should have asked earlier but I don't think I understand why the >>>> last argument here is 0 (and therefore why you have this last argument >>>> at all). >>>> >>>> Because whether or not you are warning in >>>> __get_instruction_length_from_list() it will still return 0. And that, >>>> in turn, will cause vmport_gp_check() to return an error. And then you >>>> will print another warning in VMPORT_LOG. So there is a warning anyway. >>>> >>> A key part that you appear to have missed is that >>> __get_instruction_length_from_list() uses gdprintk(XENLOG_WARNING,... >>> but VMPORT_DBG_LOG is only available in debug=y builds. So the new >>> last argument is used to control this. Since this change includes >>> enabling #GP vmexits, it is now possible for ring 3 users to generate at >>> large volume of these which with gdprintk() can flood the console. >> Would it be possible to decide where and whether to print the warning >> inside __get_instruction_length_from_list() as opposed to passing a >> new parameter? E.g. if vmware_port_enabled is set and list includes >> IN/OUT and possibly something else? >> > It could be. However the test is complex and not something I am willing > to be part of this patch. So the only thing I have come up with is using > #ifndef NDEBUG around the gdprintk(). This leaves the > > > hvm_inject_hw_exception(TRAP_gp_fault, 0); > > Which is not correct in this case. Now are far as I can tell, calling this > twice does the wrong thing for this case. I.E. turns it into a double > fault. > For #GP I need to call it with vmcb->exitinfo1 instead of 0. > > (Note: v5 posted before I got this.) > > I just spent some time checking and found out that even with the cpu > reporting: > > (XEN) - Next-RIP Saved on #VMEXIT > > svm_nextrip_insn_length(v) is 0. As I see it, this could happen for one of three reasons: * !cpu_has_svm_nrips which can't be the case since (1) family 21 supports it and (2) you actually see in the log that it does have it * next RIP is before current RIP which I think can't be the case neither because we are not looking at branch instruction or something like that. * nextrip == rip. Which I don't see how it can be true. Can you check why svm_nextrip_insn_length(v) is 0? But regardless of that, how do you expect your code to work on CPUs that don't support NRIP? On those processors you *will* be decoding the instruction twice. > > (hyper-0-21-54:~>cat /proc/cpuinfo > processor : 0 > vendor_id : AuthenticAMD > cpu family : 21 > model : 2 > model name : AMD Opteron(tm) Processor 4365 EE > stepping : 0 > microcode : 0x600081c > ...) > > Which means to me that by using __get_instruction_length_from_list() > I am reading the instruction bytes 2 times. Once in > __get_instruction_length_from_list() and once in vmport_gp_check(). > This is not too bad since the rate of these is low. But this does > suggest to > me that the change to using __get_instruction_length_from_list() was > not the best. Having a routine that both checks the instruction and > reports on which one was found would be much better. > > So do I go with v5, or dropping the use of > __get_instruction_length_from_list() > in a v6? (The coding of the new routine will take some time.) Given what Jan said, it sounds like v5 is not going to work since you'd be decoding instruction twice, right? -boris > > -Don Slutz > > >> -boris >> >>> >>>> Second, since this handler appears to be handling #GP only for VMware >>>> guest we should make sure that it is not executed for any other guest. >>>> You do now condition intercept got #GP for such guests only but I >>>> still think having a check here is worth doing. Maybe a BUG() or >>>> ASSERT()? >>>> >>>> The same comments are applicable to VMX code, I suspect. >>>> >>> I will change the check in vmport_gp_check on is_vmware_port_enabled >>> into an ASSERT() so both SVM and VMX will be covered. >>> >>>>> + >>>>> + rc = vmport_gp_check(regs, v, inst_len, inst_addr, >>>>> + vmcb->exitinfo1, vmcb->exitinfo2); >>>>> + if ( !rc ) >>>>> + __update_guest_eip(regs, inst_len); >>>>> + else >>>>> + { >>>>> + VMPORT_DBG_LOG(VMPORT_LOG_GP_UNKNOWN, >>>>> + "gp: rc=%d ei1=0x%lx ei2=0x%lx ip=%"PRIx64 >>>>> + " (0x%lx,%ld) ax=%"PRIx64" bx=%"PRIx64" >>>>> cx=%"PRIx64 >>>>> + " dx=%"PRIx64" si=%"PRIx64" di=%"PRIx64, rc, >>>>> + (unsigned long)vmcb->exitinfo1, >>>>> + (unsigned long)vmcb->exitinfo2, regs->rip, >>>>> inst_addr, >>>>> + inst_len, regs->rax, regs->rbx, regs->rcx, >>>>> regs->rdx, >>>>> + regs->rsi, regs->rdi); >>>>> + hvm_inject_hw_exception(TRAP_gp_fault, regs->error_code); >>>>> + } >>>>> +} >>>>> + >>>> . > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel