From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH v12 for-xen-4.5 16/20] x86/VPMU: Handle PMU interrupts for PV guests Date: Wed, 01 Oct 2014 08:53:18 -0400 Message-ID: <542BF93E.7040505@oracle.com> References: <1411673336-32736-1-git-send-email-boris.ostrovsky@oracle.com> <1411673336-32736-17-git-send-email-boris.ostrovsky@oracle.com> <542A81B6020000780003ADD4@mail.emea.novell.com> <542AC71C.6020804@oracle.com> <542AEC00020000780003B224@mail.emea.novell.com> <542ADC57.1070708@oracle.com> <542BC00A020000780003B510@mail.emea.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <542BC00A020000780003B510@mail.emea.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich Cc: kevin.tian@intel.com, keir@xen.org, suravee.suthikulpanit@amd.com, andrew.cooper3@citrix.com, tim@xen.org, dietmar.hahn@ts.fujitsu.com, xen-devel@lists.xen.org, Aravind.Gopalakrishnan@amd.com, jun.nakajima@intel.com, dgdegra@tycho.nsa.gov List-Id: xen-devel@lists.xenproject.org On 10/01/2014 02:49 AM, Jan Beulich wrote: >>>> On 30.09.14 at 18:37, wrote: >> On 09/30/2014 11:44 AM, Jan Beulich wrote: >>>>>> + { >>>>>> + r->cs = cur_regs->cs; >>>>>> + if ( sampled->arch.flags & TF_kernel_mode ) >>>>>> + r->cs &= ~3; >>>>> And once again I wonder how the consumer of this data is to tell >>>>> apart guest kernel and hypervisor addresses. >>>> Based on the RIP --- perf, for example, searches through various symbol >>>> tables. >>> That doesn't help when profiling HVM/PVH guests - addresses are >>> ambiguous in that case. >> Hypervisor traces are only sent to dom0, which is currently PV only. The >> key here, of course, is the word 'currently'. > So you completely ignore PVH Dom0? Experimental or not, I don't > think that's the way to go. As I mentioned in an earlier reply, I will set domain_id in the reported structure to DOMID_XEN when we are reporting hypervisor sample. > Furthermore the check around this is > once again using sampled, not sampling. Which check are you referring to? > > Looking at the separation of hypervisor vs guest context to report > again > > /* Non-privileged domains are always in XENPMU_MODE_SELF mode */ > if ( (vpmu_mode & XENPMU_MODE_SELF) || > (!is_hardware_domain(sampled->domain) && > !is_idle_vcpu(sampled)) ) > cur_regs = guest_cpu_user_regs(); > else > cur_regs = regs; > > I now additionally wonder why the condition here isn't just the SELF > check: If the interrupt happened while in the hypervisor, why would > you override this unconditionally to report a guest sample instead? > Shouldn't the profiling domain tell you what it wants in that case > (global vs guest local view)? The second part of the check (!is_hardware_domain(sampled->domain) && !is_idle_vcpu(sampled)) is to prevent sending hypervisor sample to a non-privileged guest. vpmu_mode may be, for example, XENPMU_MODE_HV but that only means that dom0 can get hypervisor samples. Perhaps the comment is confusing in that it may imply that each domain has its own XENPMU_MODE. Which is not true --- vpmu_mode is a global. I should have said that "Non-privileged domains are *effectively* always in XENPMU_MODE_SELF mode". > >>>> I suppose I can set xenpmu_data->domain_id below to either DOMID_SELF >>>> for guest and DOMID_XEN for the hypervisor. >>> That's an option, but I'm really having reservations against simulating >>> ring-0 execution in PV guests here. It would certainly be better if we >>> could report reality here, but I can see reservations on the consumer >>> (perf) side against us doing so. >> Yes, perf will probably not like it --- as I mentioned in an earlier >> message, it calls user_mode(regs) which is essentially !!(regs->cs & 3). > So you're crippling the Xen implementation in order to please one > of potentially many consumers... Along the lines of what I said > above, I think this ought to be controlled by the consumer of the > interface, defaulting to not doing any masking here. I can add a return value (flags, for example) to indicate whether we are in user or kernel mode. I don't want to provide another control interface for this. -boris