From mboxrd@z Thu Jan 1 00:00:00 1970
From: Peter Zijlstra
Subject: Re: [PATCH v1 1/8] perf/x86: add support to mask counters from host
Date: Mon, 5 Nov 2018 17:56:57 +0100
Message-ID: <20181105165657.GD22431@hirez.programming.kicks-ass.net>
References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com>
 <1541066648-40690-2-git-send-email-wei.w.wang@intel.com>
 <20181101145257.GD3178@hirez.programming.kicks-ass.net>
 <5BDC140F.6060303@intel.com>
 <20181105093413.GO3178@hirez.programming.kicks-ass.net>
 <5BE02725.3010707@intel.com>
 <20181105121413.GC22431@hirez.programming.kicks-ass.net>
 <286AC319A985734F985F78AFA26841F73DE3AC8B@shsmsx102.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "linux-kernel@vger.kernel.org", "kvm@vger.kernel.org",
 "pbonzini@redhat.com", "ak@linux.intel.com", "mingo@redhat.com",
 "rkrcmar@redhat.com", "Xu, Like"
To: "Wang, Wei W"
Return-path:
Content-Disposition: inline
In-Reply-To: <286AC319A985734F985F78AFA26841F73DE3AC8B@shsmsx102.ccr.corp.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

Please; don't send malformed emails like this. Lines wrap at 78 chars.

On Mon, Nov 05, 2018 at 03:37:24PM +0000, Wang, Wei W wrote:
> On Monday, November 5, 2018 8:14 PM, Peter Zijlstra wrote:
> > That can only work if the host counter has perf_event_attr::exclude_guest=1,
> > any counter without that must also count when the guest is running.
> >
> > (and, IIRC, normal perf tool events do not have that set by default)
>
> Probably no. Please see Line 81 at
> https://github.com/torvalds/linux/blob/master/tools/perf/util/util.c
> perf_guest by default is false, which makes "attr->exclude_guest = 1"

Then you're in luck. But if the host creates an event that has
exclude_guest=0 set, it should still work.

> > The thing is; you cannot do blind pass-through of the PMU, some of its
> > features simply do not work in a guest. Also, the host perf driver expects
> > certain functionality that must be respected.
>
> Actually we are not blindly assigning the perf counters. Guest works
> with its own complete perf stack (like the one on the host) which also
> has its own constraints.

But it knows nothing of the host state.

> The counter is also not passed through to the guest, guest accesses to
> the assigned counter will still exit to the hypervisor, and the
> hypervisor helps update the counter.

Yes, you have to; because the PMU doesn't properly virtualize, also
because the HV -- linux in our case -- already claimed the PMU.

So the network passthrough case you mentioned simply doesn't apply at
all. Don't bother looking at it for inspiration.

> > Those are the constraints you have to work with.
> >
> > Back when we all started down this virt rathole, I proposed people do
> > paravirt perf, where events would be handed to the host kernel and let the
> > host kernel do its normal thing. But people wanted to do the MSR based
> > thing because of !linux guests.
>
> IMHO, it is worthwhile to care more about the real use case. When a
> user gets a virtual machine from a vendor, all he can do is to run
> perf inside the guest. The above contention concerns would not happen,
> because the user wouldn't be able to come to the host to run perf on
> the virtualization software (e.g. ./perf qemu..) and in the meantime
> running perf in the guest to cause the contention.

That's your job. Mine is to make sure that whatever you propose fits in
the existing model and doesn't make a giant mess of things.

And for Linux guests on Linux hosts, paravirt perf still makes the most
sense to me; then you get the host scheduling all the events and
providing the guest with the proper counts/runtimes/state.

> On the other hand, when we improve the user experience of running perf
> inside the guest by reducing the virtualization overhead, that would
> bring real benefits to the real use case.

You can start to improve things by doing a less stupid implementation of the existing code.