From mboxrd@z Thu Jan  1 00:00:00 1970
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Subject: Re: [PATCH v4 04/16] xen: Add is_vmware_port_enabled
Date: Thu, 18 Sep 2014 18:53:45 -0400
Message-ID: <541B6279.2070104@oracle.com>
References: <1410460610-14759-1-git-send-email-dslutz@verizon.com>	<1410460610-14759-5-git-send-email-dslutz@verizon.com>	<5412F059.10003@oracle.com>	<A3CD31A5D207064088A18AC2AF7B5DC6B90012B3@MIA20725MBX891B.apps.tmrk.corp>	<5419AF39.70403@oracle.com>
	<A3CD31A5D207064088A18AC2AF7B5DC6BA5C08E2@MIA20725MBX891A.apps.tmrk.corp>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <A3CD31A5D207064088A18AC2AF7B5DC6BA5C08E2@MIA20725MBX891A.apps.tmrk.corp>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "Slutz, Donald Christopher" <dslutz@verizon.com>
Cc: Kevin Tian <kevin.tian@intel.com>, Keir Fraser <keir@xen.org>, Ian Campbell <ian.campbell@citrix.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, George Dunlap <George.Dunlap@eu.citrix.com>, Eddie Dong <eddie.dong@intel.com>, Tim Deegan <tim@xen.org>, "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>, Jan Beulich <jbeulich@suse.com>, Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>, Jun Nakajima <jun.nakajima@intel.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>, Ian Jackson <ian.jackson@eu.citrix.com>
List-Id: xen-devel@lists.xenproject.org

On 09/17/2014 02:23 PM, Slutz, Donald Christopher wrote:
> On 09/17/14 11:56, Boris Ostrovsky wrote:
>> On 09/16/2014 08:08 AM, Slutz, Donald Christopher wrote:
>>> On 09/12/14 09:08, Boris Ostrovsky wrote:
>>>> On 09/11/2014 02:36 PM, Don Slutz wrote:
>>>>>     int __get_instruction_length_from_list(struct vcpu *v,
>>>>> -        const enum instruction_index *list, unsigned int list_count)
>>>>> +                                       const enum instruction_index
>>>>> *list,
>>>>> +                                       unsigned int list_count,
>>>>> +                                       bool_t err_rpt)
>>>>>     {
>>>>>         struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
>>>>>         unsigned int i, j, inst_len = 0;
>>>>> @@ -211,10 +222,13 @@ int __get_instruction_length_from_list(struct
>>>>> vcpu *v,
>>>>>         mismatch: ;
>>>>>         }
>>>>>     -    gdprintk(XENLOG_WARNING,
>>>>> -             "%s: Mismatch between expected and actual instruction
>>>>> bytes: "
>>>>> -             "eip = %lx\n",  __func__, (unsigned long)vmcb->rip);
>>>>> -    hvm_inject_hw_exception(TRAP_gp_fault, 0);
>>>>> +    if ( err_rpt )
>>>>> +    {
>>>>> +        gdprintk(XENLOG_WARNING,
>>>>> +                 "%s: Mismatch between expected and actual
>>>>> instruction bytes: "
>>>>> +                 "eip = %lx\n",  __func__, (unsigned long)vmcb->rip);
>>>>> +        hvm_inject_hw_exception(TRAP_gp_fault, 0);
>>>>> +    }
>>>>>         return 0;
>>>>>        done:
>>>>> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
>>>>> index b5188e6..9e14d2a 100644
>>>>> --- a/xen/arch/x86/hvm/svm/svm.c
>>>>> +++ b/xen/arch/x86/hvm/svm/svm.c
>>>>> @@ -59,6 +59,7 @@
>>>>>     #include <public/sched.h>
>>>>>     #include <asm/hvm/vpt.h>
>>>>>     #include <asm/hvm/trace.h>
>>>>> +#include <asm/hvm/vmport.h>
>>>>>     #include <asm/hap.h>
>>>>>     #include <asm/apic.h>
>>>>>     #include <asm/debugger.h>
>>>>> @@ -2065,6 +2066,38 @@ svm_vmexit_do_vmsave(struct vmcb_struct *vmcb,
>>>>>         return;
>>>>>     }
>>>>>     +static void svm_vmexit_gp_intercept(struct cpu_user_regs *regs,
>>>>> +                                    struct vcpu *v)
>>>>> +{
>>>>> +    struct vmcb_struct *vmcb = v->arch.hvm_svm.vmcb;
>>>>> +    unsigned long inst_len;
>>>>> +    unsigned long inst_addr = svm_rip2pointer(v);
>>>>> +    int rc;
>>>>> +    static const enum instruction_index list[] = {
>>>>> +        INSTR_INL_DX, INSTR_INB_DX, INSTR_OUTL_DX, INSTR_OUTB_DX
>>>>> +    };
>>>>> +
>>>>> +    inst_len = __get_instruction_length_from_list(
>>>>> +        v, list, ARRAY_SIZE(list), 0);
>>>> I should have asked earlier but I don't think I understand why the
>>>> last argument here is 0 (and therefore why you have this last argument
>>>> at all).
>>>>
>>>> Because whether or not you are warning in
>>>> __get_instruction_length_from_list() it will still return 0. And that,
>>>> in turn, will cause vmport_gp_check() to return an error. And then you
>>>> will print another warning in VMPORT_LOG. So there is a warning anyway.
>>>>
>>> A key part that you appear to have missed is that
>>> __get_instruction_length_from_list() uses gdprintk(XENLOG_WARNING,...
>>> but VMPORT_DBG_LOG is only available in debug=y builds.  So the new
>>> last argument is used to control this.  Since this change includes
>>> enabling #GP vmexits, it is now possible for ring 3 users to generate at
>>> large volume of these which with gdprintk() can flood the console.
>> Would it be possible to decide where and whether to print the warning
>> inside __get_instruction_length_from_list() as opposed to passing a
>> new parameter? E.g. if vmware_port_enabled is set and list includes
>> IN/OUT and possibly something else?
>>
> It could be.  However the test is complex and not something I am willing
> to be part of this patch.  So the only thing I have come up with is using
> #ifndef NDEBUG around the gdprintk().  This leaves the
>
>
>           hvm_inject_hw_exception(TRAP_gp_fault, 0);
>
> Which is not correct in this case.  Now are far as I can tell, calling this
> twice does the wrong thing for this case.  I.E. turns it into a double
> fault.
> For #GP I need to call it with vmcb->exitinfo1 instead of 0.
>
> (Note: v5 posted before I got this.)
>
> I just spent some time checking and found out that even with the cpu
> reporting:
>
> (XEN)   - Next-RIP Saved on #VMEXIT
>
> svm_nextrip_insn_length(v) is 0.

As I see it, this could happen for one of three reasons:
* !cpu_has_svm_nrips which can't be the case since (1) family 21 
supports it and (2) you actually see in the log that it does have it
* next RIP is before current RIP which I think can't be the case neither 
because we are not looking at branch instruction or something like that.
* nextrip == rip. Which I don't see how it can be true.

Can you check why svm_nextrip_insn_length(v) is 0?

But regardless of that, how do you expect your code to work on CPUs that 
don't support NRIP? On those processors you *will* be decoding the 
instruction twice.

>
> (hyper-0-21-54:~>cat /proc/cpuinfo
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 21
> model           : 2
> model name      : AMD Opteron(tm) Processor 4365 EE
> stepping        : 0
> microcode       : 0x600081c
> ...)
>
> Which means to me that by using __get_instruction_length_from_list()
> I am reading the instruction bytes 2 times.  Once in
> __get_instruction_length_from_list() and once in vmport_gp_check().
> This is not too bad since the rate of these is low.  But this does
> suggest to
> me that the change to using __get_instruction_length_from_list() was
> not the best.  Having a routine that both checks the instruction and
> reports on which one was found would be much better.
>
> So do I go with v5, or dropping the use of
> __get_instruction_length_from_list()
> in a v6?  (The coding of the new routine will take some time.)

Given what Jan said, it sounds like v5 is not going to work since you'd 
be decoding instruction twice, right?

-boris

>
>      -Don Slutz
>
>
>> -boris
>>
>>>
>>>> Second, since this handler appears to be handling #GP only for VMware
>>>> guest we should make sure that it is not executed for any other guest.
>>>> You do now condition intercept got #GP for such guests only but I
>>>> still think having a check here is worth doing. Maybe a BUG() or
>>>> ASSERT()?
>>>>
>>>> The same comments are applicable to VMX code, I suspect.
>>>>
>>> I will change the check in vmport_gp_check on is_vmware_port_enabled
>>> into an ASSERT() so both SVM and VMX will be covered.
>>>
>>>>> +
>>>>> +    rc = vmport_gp_check(regs, v, inst_len, inst_addr,
>>>>> +                         vmcb->exitinfo1, vmcb->exitinfo2);
>>>>> +    if ( !rc )
>>>>> +        __update_guest_eip(regs, inst_len);
>>>>> +    else
>>>>> +    {
>>>>> +        VMPORT_DBG_LOG(VMPORT_LOG_GP_UNKNOWN,
>>>>> +                       "gp: rc=%d ei1=0x%lx ei2=0x%lx ip=%"PRIx64
>>>>> +                       " (0x%lx,%ld) ax=%"PRIx64" bx=%"PRIx64"
>>>>> cx=%"PRIx64
>>>>> +                       " dx=%"PRIx64" si=%"PRIx64" di=%"PRIx64, rc,
>>>>> +                       (unsigned long)vmcb->exitinfo1,
>>>>> +                       (unsigned long)vmcb->exitinfo2, regs->rip,
>>>>> inst_addr,
>>>>> +                       inst_len, regs->rax, regs->rbx, regs->rcx,
>>>>> regs->rdx,
>>>>> +                       regs->rsi, regs->rdi);
>>>>> +        hvm_inject_hw_exception(TRAP_gp_fault, regs->error_code);
>>>>> +    }
>>>>> +}
>>>>> +
>>>> .
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel