linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	 Arnaldo Carvalho de Melo <acme@kernel.org>,
	Kan Liang <kan.liang@linux.intel.com>,
	 Like Xu <likexu@tencent.com>,
	Mark Rutland <mark.rutland@arm.com>,
	 Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,  Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	 Adrian Hunter <adrian.hunter@intel.com>,
	kvm@vger.kernel.org,  linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	 Zhenyu Wang <zhenyuw@linux.intel.com>,
	Zhang Xiong <xiong.y.zhang@intel.com>,
	 Lv Zhiyuan <zhiyuan.lv@intel.com>,
	Yang Weijiang <weijiang.yang@intel.com>,
	 Dapeng Mi <dapeng1.mi@intel.com>,
	Jim Mattson <jmattson@google.com>,
	 David Dunn <daviddunn@google.com>,
	Mingwei Zhang <mizhang@google.com>,
	 Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>
Subject: Re: [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event
Date: Fri, 29 Sep 2023 15:46:55 +0000	[thread overview]
Message-ID: <ZRbxb15Opa2_AusF@google.com> (raw)
In-Reply-To: <20230929115344.GE6282@noisy.programming.kicks-ass.net>

On Fri, Sep 29, 2023, Peter Zijlstra wrote:
> On Wed, Sep 27, 2023 at 10:27:07AM -0700, Sean Christopherson wrote:
> > Jumping the gun a bit (we're in the *super* early stages of scraping together a
> > rough PoC), but I think we should effectively put KVM's current vPMU support into
> > maintenance-only mode, i.e. stop adding new features unless they are *very* simple
> > to enable, and instead pursue an implementation that (a) lets userspace (and/or
> > the kernel builder) completely disable host perf (or possibly just host perf usage
> > of the hardware PMU) and (b) let KVM passthrough the entire hardware PMU when it
> > has been turned off in the host.
> 
> I don't think you need to go that far, host can use PMU just fine as
> long as it doesn't overlap with a vCPU. Basically, if you force
> perf_attr::exclude_guest on everything your vCPU can haz the full thing.

Complexity aside, my understanding is that the overhead of trapping and emulating
all of the guest counter and MSR accesses results in unacceptably degraded functionality
for the guest.  And we haven't even gotten to things like arch LBRs where context
switching MSRs between the guest and host is going to be quite costly.

> > Note, a similar idea was floated and rejected in the past[*], but that failed
> > proposal tried to retain host perf+PMU functionality by making the behavior dynamic,
> > which I agree would create an awful ABI for the host.  If we make the "knob" a
> > Kconfig 
> 
> Must not be Kconfig, distros would have no sane choice.

Or not only a Kconfig?  E.g. similar to how the kernel has
CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS and nopku.

> > or kernel param, i.e. require the platform owner to opt-out of using perf
> > no later than at boot time, then I think we can provide a sane ABI, keep the
> > implementation simple, all without breaking existing users that utilize perf in
> > the host to profile guests.
> 
> It's a shit choice to have to make. At the same time I'm not sure I have
> a better proposal.
> 
> It does mean a host cannot profile one guest and have pass-through on the
> other. Eg. have a development and production guest on the same box. This
> is pretty crap.
> 
> Making it a guest-boot-option would allow that, but then the host gets
> complicated again. I think I can make it trivially work for per-task
> events, simply error the creation of events without exclude_guest for
> affected vCPU tasks. But the CPU events are tricky.
> 
> 
> I will firmly reject anything that takes the PMU away from the host
> entirely through.

Why?  What is so wrong with supporting use cases where the platform owner *wants*
to give up host PMU and NMI watchdog functionality?  If disabling host PMU usage
were complex, highly invasive, and/or difficult to maintain, then I can understand
the pushback.  

But if we simply allow hiding hardware PMU support, then isn't the cost to perf
just a few lines in init_hw_perf_events()?  And if we put a stake in the ground
and say that exposing "advanced" PMU features to KVM guests requires a passthrough
PMU, i.e. the PMU to be hidden from the host, that will significantly reduce our
maintenance and complexity.

The kernel allows disabling almost literally every other feature that is even
remotely optional, I don't understand why the hardware PMU is special.

  parent reply	other threads:[~2023-09-29 15:47 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-27  3:31 [Patch v4 00/13] Enable fixed counter 3 and topdown perf metrics for vPMU Dapeng Mi
2023-09-27  3:31 ` [Patch v4 01/13] KVM: x86/pmu: Add Intel CPUID-hinted TopDown slots event Dapeng Mi
2023-09-27  3:31 ` [Patch v4 02/13] KVM: x86/pmu: Support PMU fixed counter 3 Dapeng Mi
2023-09-27  3:31 ` [Patch v4 03/13] perf/core: Add function perf_event_group_leader_check() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 04/13] perf/core: Add function perf_event_move_group() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 05/13] perf/core: Add *group_leader for perf_event_create_group_kernel_counters() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 06/13] perf/x86: Fix typos and inconsistent indents in perf_event header Dapeng Mi
2023-09-27  3:31 ` [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event Dapeng Mi
2023-09-27 11:33   ` Peter Zijlstra
2023-09-27 17:27     ` Sean Christopherson
2023-09-28  9:24       ` Mi, Dapeng
2023-09-29 11:53       ` Peter Zijlstra
2023-09-29 15:20         ` Ravi Bangoria
2023-10-02 12:29           ` Peter Zijlstra
2023-10-03  6:36             ` Ravi Bangoria
2023-09-29 15:46         ` Sean Christopherson [this message]
2023-09-30  3:29           ` Jim Mattson
2023-10-01  0:31             ` Namhyung Kim
2023-10-02 11:57           ` Peter Zijlstra
2023-10-02 13:30             ` Ingo Molnar
2023-10-02 15:23               ` David Dunn
2023-10-02 19:02                 ` Mingwei Zhang
2023-10-02 15:56               ` Sean Christopherson
2023-10-02 19:50                 ` Liang, Kan
2023-10-02 20:52                   ` Peter Zijlstra
2023-10-02 20:40                 ` Peter Zijlstra
2023-10-03  0:56                   ` Sean Christopherson
2023-10-03  8:16                     ` Peter Zijlstra
2023-10-03 15:23                       ` Sean Christopherson
2023-10-03 18:21                         ` Jim Mattson
2023-10-04 11:32                           ` Peter Zijlstra
2023-10-04 11:21                         ` Peter Zijlstra
2023-10-04 19:51                           ` Mingwei Zhang
2023-10-04 21:50                             ` Sean Christopherson
2023-10-04 22:05                               ` Sean Christopherson
2023-10-08 10:04                                 ` Like Xu
2023-10-09 17:03                                   ` Manali Shukla
2023-10-11 14:15                                     ` Peter Zijlstra
2023-10-17 10:24                                       ` Manali Shukla
2023-10-17 16:58                                         ` Mingwei Zhang
2023-10-18  0:01                                           ` Mingwei Zhang
2023-10-11 14:20                               ` Peter Zijlstra
2023-10-13 17:02                                 ` Sean Christopherson
2023-10-03 17:31                       ` Manali Shukla
2023-10-03 22:02                     ` Mingwei Zhang
2023-10-04 20:43                       ` Sean Christopherson
2023-09-27  3:31 ` [Patch v4 08/13] perf/core: Add new function perf_event_topdown_metrics() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 09/13] perf/x86/intel: Handle KVM virtual metrics event in perf system Dapeng Mi
2023-09-27  3:31 ` [Patch v4 10/13] KVM: x86/pmu: Extend pmc_reprogram_counter() to create group events Dapeng Mi
2023-09-27  3:31 ` [Patch v4 11/13] KVM: x86/pmu: Support topdown perf metrics feature Dapeng Mi
2023-09-27  3:31 ` [Patch v4 12/13] KVM: x86/pmu: Handle PERF_METRICS overflow Dapeng Mi
2023-09-27  3:31 ` [Patch v4 13/13] KVM: x86/pmu: Expose Topdown in MSR_IA32_PERF_CAPABILITIES Dapeng Mi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZRbxb15Opa2_AusF@google.com \
    --to=seanjc@google.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=dapeng1.mi@intel.com \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=daviddunn@google.com \
    --cc=irogers@google.com \
    --cc=jmattson@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=likexu@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@kernel.org \
    --cc=mizhang@google.com \
    --cc=namhyung@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=weijiang.yang@intel.com \
    --cc=xiong.y.zhang@intel.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).