From mboxrd@z Thu Jan 1 00:00:00 1970 From: Don Slutz Subject: Re: [PATCH for-4.5 v6 06/16] xen: Convert vmware_port to xentrace usage Date: Mon, 29 Sep 2014 14:10:44 -0400 Message-ID: <5429A0A4.9070704@terremark.com> References: <1411236447-7435-1-git-send-email-dslutz@verizon.com> <1411236447-7435-7-git-send-email-dslutz@verizon.com> <5422FF1D.8000308@eu.citrix.com> <5423167A.1070901@terremark.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: George Dunlap , Don Slutz Cc: Tim Deegan , Kevin Tian , Keir Fraser , Ian Campbell , Stefano Stabellini , Ian Jackson , Eddie Dong , "xen-devel@lists.xen.org" , Jan Beulich , Aravind Gopalakrishnan , Jun Nakajima , Andrew Cooper , Boris Ostrovsky , Suravee Suthikulpanit List-Id: xen-devel@lists.xenproject.org On 09/25/14 11:14, George Dunlap wrote: > On Wed, Sep 24, 2014 at 8:07 PM, Don Slutz wrote: >> On 09/24/14 13:27, George Dunlap wrote: >>> On 09/20/2014 07:07 PM, Don Slutz wrote: >>>> Reduce the VMPORT_DBG_LOG calls. >>> >>> You should also have mentioned that you added hew HVMTRACE macros which >>> will log the TSC value. >>> >>> The reason the HVMTRACE macros don't log the TSC values is that for the >>> most part you can get all the timing information you need from the TSC on >>> the vmexit and vmenter. Looking at where you've added the TSC values, I >>> don't really see how it adds anything except bloat to the log. Is there a >>> reason you need to know exactly when these different things happened, >>> instead of just being able to bracket them between VMENTER and VMEXITs? >>> >> I did want a way to know how long the VMware code was taking. I am >> not sure this is required. >> >> For example: >> >> CPU1 2899550319282 (+ 4170) VMEXIT [ exitcode = 0x00000000, rIP = >> 0x00007fad13ffec8c ] >> CPU1 2899550320086 (+ 804) TRAP_GP [ inst_len = 1 edx = 0x00005658 >> exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ] >> CPU1 2899550325054 (+ 4968) VMPORT_READ_BEFORE [ eax = 0x564d5868 ebx >> = 0x000001b3 ecx = 0x0003001e edx = 0x00005658 esi = 0x00000000 edi = >> 0x000001b3 ] >> CPU1 2899550326050 (+ 996) VMPORT_READ_AFTER [ eax = 0x564d5868 ebx = >> 0x000001b3 ecx = 0x0001001e edx = 0x00005658 esi = 0x00000000 edi = >> 0x000001b3 ] >> CPU1 2899550326722 (+ 672) vlapic_accept_pic_intr [ i8259_target = 1, >> accept_pic_int = 0 ] >> CPU1 2899550327454 (+ 732) VMENTRY >> >> >> But I am happy to drop the new log TSC macros. > The tracing function itself is not free -- the trace_var() function > probably executes 5x the amount of code that is actually executed > between the READ_BEFORE and READ_AFTER (given it's a switch statement > and each one is basically a handful of variable assignments). It's > not unlikely that most of the 996 us there is in the trace function > itself. Ok, but the 83700 us is not all this. >>>> diff --git a/xen/arch/x86/hvm/vmware/vmport.c >>>> b/xen/arch/x86/hvm/vmware/vmport.c >>>> index 811c303..962ee32 100644 >>>> --- a/xen/arch/x86/hvm/vmware/vmport.c >>>> +++ b/xen/arch/x86/hvm/vmware/vmport.c >>> We normally don't log both BEFORE and AFTER states of things like >>> hypercalls -- just logging the outcome of what the hypervisor did should be >>> sufficient, shouldn't it? >> >> Not that clear with this poorly build hyper-call. >> >>> Do you really need to know the value of things that got clobbered? >> >> When checking on the complex state machine that is the "RPC" code >> it was very helpful. With that code moving to QEMU, the before and >> after is not so important. >> >> Here is a real example: >> >> CPU2 865821836508576 (+ 2562) VMEXIT [ exitcode = 0x00000000, rIP >> = 0x00007f68a8b17c8c ] >> CPU2 865821836509362 (+ 786) TRAP_GP [ inst_len = 1 edx = >> 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ] >> CPU2 865821836514132 (+ 4770) VMPORT_READ_BEFORE [ eax = 0x564d5868 >> ebx = 0x00000034 ecx = 0x0002001e edx = 0x00025658 esi = 0x00000000 edi = >> 0x000001be ] >> CPU2 865821836597832 (+ 83700) VMPORT_READ_AFTER [ eax = 0x564d5868 ebx >> = 0x00000034 ecx = 0x0001001e edx = 0x00025658 esi = 0x00000000 edi = >> 0x000001be ] >> CPU2 865821836598756 (+ 924) vlapic_accept_pic_intr [ i8259_target = >> 1, accept_pic_int = 0 ] >> CPU2 865821836605602 (+ 6846) vlapic_accept_pic_intr [ i8259_target = >> 1, accept_pic_int = 0 ] >> CPU2 865821836606436 (+ 834) VMENTRY >> CPU2 865821836609712 (+ 3276) VMEXIT [ exitcode = 0x00000000, rIP >> = 0x00007f68a8b17c8c ] >> CPU2 865821836610654 (+ 942) TRAP_GP [ inst_len = 1 edx = >> 0x00025658 exitinfo1 = 0x0000000000000000 exitinfo2 = 0x0000000000000000 ] >> CPU2 865821836616828 (+ 6174) VMPORT_READ_BEFORE [ eax = 0x564d5868 >> ebx = 0x00000034 ecx = 0x0003001e edx = 0x00025658 esi = 0x00000000 edi = >> 0x000001be ] >> CPU2 865821836617800 (+ 972) VMPORT_READ_AFTER [ eax = 0x564d5868 ebx >> = 0x00000011 ecx = 0x0003001e edx = 0x00015658 esi = 0x00000000 edi = >> 0x000001be ] >> CPU2 865821836618664 (+ 864) vlapic_accept_pic_intr [ i8259_target = >> 1, accept_pic_int = 0 ] >> CPU2 865821836619444 (+ 780) VMENTRY >> >> Note that in the one "RPC" call, >> >> >> grep VMPORT_READ_BEFORE ~/zz-xentrace-vmware3-0.out | wc >> 1592 39800 256312 >> >> It took 1592 #GP traps to handle it, and 9643628760 tsc cycles. > Right, so what I started to say yesterday: It looks like most of the > trace points you're adding here is to help you debug the functionality > of the hypervisor. That certainly makes sense for you to do during > development. But what we want in upstream is something that will help > us in production. For that to be useful, we need the logging to be as > efficient as possible. Every additional HVM trace point potentially > adds more data to someone else's HVM trace. So we don't want > extraneous information, and we don't want to log something that we can > infer from something else. > > In general, it's enough to give information about the decisions that > Xen is making to infer what the previous state is; and then giving > information about what Xen did in response (i.e., return values, > injection of traps, &c) to help figure out how the guest responded. > In this case, I'd probably trace: > 1. vmport hypercall, handled command > - cmd > - return values of modified registers. Ideally only the registers > that are modified, but just taking a big batch would be OK for now. > Note: No need for rip here, as it will be collected at the VMEXIT > 2. vmport hypercall, unhandled command > - Just the unimplemented fail > 3. In the vmport_gp_check(), if it successfully decodes an IO instruction: > - direction (read/write) > - size of the access > > I might consider logging something the failure path of > *_vmexit_gp_intercept(), with information that might help you figure > out why it didn't make it to the vmcall; but on the whole I think I'd > probably leave that off. > > Hopefully all that would give you enough information to figure out > where the problem was and how to reproduce the behavior locally; and > once you can reproduce it locally, you could add in debugging traces > (which wouldn't be upstreamed) to help you figure out why it wasn't > taking the path you expected. > > Does that make sense? Yes. Will attempt to have this subset. -Don Slutz > If you really want more traces, I might consider allowing them in the > code but off by default; but I think you probably won't need more > information from running production systems. > > -George