From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ingo Molnar
Subject: Re: KVM PMU virtualization
Date: Fri, 26 Feb 2010 11:56:21 +0100
Message-ID: <20100226105621.GD7463@elte.hu>
References: <4B86917C.4070102@redhat.com> <20100225173423.GB4246@8bytes.org> <1267152917.1726.82.camel@localhost> <20100226085105.GC4246@8bytes.org> <20100226091732.GI15885@elte.hu> <20100226104218.GE4246@8bytes.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "Zhang, Yanmin", Jes Sorensen, KVM General, Peter Zijlstra, Avi Kivity, Zachary Amsden, Gleb Natapov, ming.m.lin@intel.com
To: Joerg Roedel
Return-path:
Received: from mx2.mail.elte.hu ([157.181.151.9]:39869 "EHLO mx2.mail.elte.hu" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933370Ab0BZK4d (ORCPT ); Fri, 26 Feb 2010 05:56:33 -0500
Content-Disposition: inline
In-Reply-To: <20100226104218.GE4246@8bytes.org>
Sender: kvm-owner@vger.kernel.org
List-ID:

* Joerg Roedel wrote:

> On Fri, Feb 26, 2010 at 10:17:32AM +0100, Ingo Molnar wrote:
> > My suggestion, as always, would be to start very simple and very minimal:
> >
> > Enable 'perf kvm top' to show guest overhead. Use the exact same kernel image
> > both as a host and as guest (for testing), to not have to deal with the symbol
> > space transport problem initially. Enable 'perf kvm record' to only record
> > guest events by default. Etc.
> >
> > This alone will be a quite useful result already - and gives a basis for
> > further work. No need to spend months to do the big grand design straight
> > away, all of this can be done gradually and in the order of usefulness - and
> > you'll always have something that actually works (and helps your other KVM
> > projects) along the way.
> >
> > [ And, as so often, once you walk that path, that grand scheme you are
> > thinking about right now might easily become last year's really bad idea ;-) ]
> >
> > So please start walking the path and experience the challenges first-hand.
>
> That sounds like a good approach for the 'measure-guest-from-host'
> problem. It is also not very hard to implement. Where does perf fetch
> the rip of the nmi from, stack only or is this configurable?

The host semantics are that it takes the stack from the regs, and with
call-graph recording (perf record -g) it will walk down the exception stack,
irq stack, kernel stack, and user-space stack as well. (It walks as long as
the pages are present and stops at the first non-present page. An app that is
being profiled has its stack present, so it's not an issue in practice.)

I'd suggest leaving out call-graph sampling initially, and just getting
'perf kvm top' to work with guest RIPs, simply sampled from the VM exit state.

See arch/x86/kernel/cpu/perf_event.c:

static void
perf_callchain_kernel(struct pt_regs *regs, struct perf_callchain_entry *entry)
{
	callchain_store(entry, PERF_CONTEXT_KERNEL);
	callchain_store(entry, regs->ip);

	dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
}

If you have easy access to the VM state from NMI context right there, then
just hack in the guest RIP and you should have a prototype that samples the
guest. (This assumes you use the same kernel image for both the host and the
guest.)

This would be the easiest way to prototype it all.

	Ingo
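
P.S. To make the "hack in the guest RIP" idea concrete, here is a rough
user-space sketch, not kernel code. Everything prefixed sketch_ or SKETCH_ is
an assumption made up for illustration: the cut-down regs struct, the
VM-exit-state struct, the choice of a separate guest marker, and the marker
values themselves. It only models the decision "if the sample hit guest mode,
store the guest RIP and skip the stack walk":

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder marker values for this sketch only (assumption). */
#define SKETCH_CONTEXT_KERNEL        ((uint64_t)-128)
#define SKETCH_CONTEXT_GUEST_KERNEL  ((uint64_t)-2176)
#define SKETCH_MAX_STACK_DEPTH       255

/* Cut-down mock of the register frame: only the fields used here. */
struct sketch_regs {
	unsigned long ip;
	unsigned long bp;
};

/* Simplified model of the callchain buffer. */
struct perf_callchain_entry {
	uint64_t nr;
	uint64_t ip[SKETCH_MAX_STACK_DEPTH];
};

/* Hypothetical: guest state saved at the last VM exit. */
struct sketch_vm_exit_state {
	int           in_guest;   /* non-zero if the NMI hit guest mode */
	unsigned long guest_rip;  /* guest instruction pointer at exit */
};

/* Append one entry, silently dropping it if the buffer is full. */
static void callchain_store(struct perf_callchain_entry *entry, uint64_t ip)
{
	if (entry->nr < SKETCH_MAX_STACK_DEPTH)
		entry->ip[entry->nr++] = ip;
}

/*
 * Guest case: store a guest marker plus the guest RIP, no stack walk.
 * Host case: store the kernel marker plus regs->ip; the real function
 * would go on to walk the host stack with dump_trace().
 */
static void
sketch_callchain_kernel(const struct sketch_regs *regs,
			const struct sketch_vm_exit_state *vm,
			struct perf_callchain_entry *entry)
{
	if (vm != NULL && vm->in_guest) {
		callchain_store(entry, SKETCH_CONTEXT_GUEST_KERNEL);
		callchain_store(entry, vm->guest_rip);
		return;
	}

	callchain_store(entry, SKETCH_CONTEXT_KERNEL);
	callchain_store(entry, regs->ip);
}
```

With the same kernel image on both sides, the stored guest RIP resolves
against the host's own symbol table, which is why this is enough for a first
prototype. How the VM exit state is actually reached from NMI context is left
open here.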