Re: [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Like Xu <like.xu.linux@gmail.com>
To: Sean Christopherson <seanjc@google.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>,
	Dapeng Mi <dapeng1.mi@linux.intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Kan Liang <kan.liang@linux.intel.com>,
	Like Xu <likexu@tencent.com>, Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	kvm@vger.kernel.org, linux-perf-users@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Zhang Xiong <xiong.y.zhang@intel.com>,
	Lv Zhiyuan <zhiyuan.lv@intel.com>,
	Yang Weijiang <weijiang.yang@intel.com>,
	Dapeng Mi <dapeng1.mi@intel.com>,
	David Dunn <daviddunn@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Mingwei Zhang <mizhang@google.com>,
	Jim Mattson <jmattson@google.com>
Subject: Re: [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event
Date: Sun, 8 Oct 2023 18:04:28 +0800	[thread overview]
Message-ID: <c69a1eb1-e07a-8270-ca63-54949ded433d@gmail.com> (raw)
In-Reply-To: <ZR3hx9s1yJBR0WRJ@google.com>

Hi all,

On 5/10/2023 6:05 am, Sean Christopherson wrote:
> So I'll add a self-NAK to the idea of completely disabling the host PMU, I think
> that would burn us quite badly at some point.

I seem to have missed a party, so allow me to add a few more comments
to better facilitate future discussions in this direction:

(1) PMU counters on TEE

The SGX/SEV is already part of the upstream, but what kind of performance
data will be obtained by sampling enclaves or sev-guest with hardware pmu
counters on host (will the perf-report show these data missing holes or
pure encrypted data), we don't have a clear idea nor have we established
the right expectations. But on AMD profiling a SEV-SNP guest is supported:

"Fingerprinting attack protection is also not supported in the current
generation of these technologies. Fingerprinting attacks attempt to
determine what code the VM is running by monitoring its access patterns,
performance counter information, etc." (AMD SEV-SNP White Paper, 2020)

(2) PMU Guest/Host Co-existence Development

The introduction of pt_mode in the KVM was misleading, leading subsequent
developers to believe that static slicing of pmu facility usage was allowed.

On user scenarios, the host/perf should treat pmu resource requests from
vCPUs with regularity (which can be unequal under the host's authority IMO)
while allowing the host to be able to profile any software entity (including
hypervisor and guest-code, including TEE code in debug mode). Functionality
takes precedence over performance.

The semantics of exclude_guest/host should be tied to the hw-event isolation
settings on the hardware interfaces, not to the human-defined sw-context.
The perf subsystem is the arbiter of pmu resource allocation on the host,
and any attempt to change the status quo (or maintenance scope) will not
succeed. Therefore, vPMU developers are required to be familiar with the
implementation details of both perf and kvm, and try not to add perf APIs
dedicated to serving KVM blindly.

Getting host and guests to share limited PMU resources harmoniously is not
particularly difficult compared to real rocket science in the kernel, so
please don't be intimidated.

(3) Performance Concern in Co-existence

I wonder if it would be possible to add a knob to turn off the perf counter
multiplexing mechanism on the host, so that in coexistence scenarios, the
number of VM exits on the vCPU would not be increased by counter rotations
due to timer expiration.

For normal counters shared between guest and host, the number of counter
msr switches requiring a vm-entry level will be relatively small.
(The number of counters is growing; for LBR, it is possible to share LBR
select values to avoid frequent switching, but of course this requires the
implementation of a software filtering mechanism when the host/guest read
the LBR records, and some additional PMI; for DS-based PEBS, host and guest
PEBS buffers are automatically segregated based on linear address).

There is a lot of room for optimisation here, and in real scenarios where
triggering a large number of register switches in the host/guest PMU is
to be expected and observed easily (accompanied by a large number of pmi
appearances).

If we are really worried about the virtualisation overhead of vPMU, then
virtio-pmu might be an option. In this technology direction, the back-end
pmu can add more performance events of interest to the VM (including host
un-core and off-core events, host-side software events, etc.) In terms of
implementation, the semantics of the MSRLIST instruction can be re-used,
along with compatibility with the different PMU hardware interfaces on ARM
and Risc-v, which is also very friendly to production environments based on
its virtio nature.

(4) New vPMU Feature Development

We should not put KVM's current vPMU support into maintenance-only mode.
Users want more PMU features in the guest, like AMD vIBS, Intel pmu higher
versions, Intel topdown and Arch lbr, more on the way. The maturity of
different features' patch sets aren't the same, but we can't ignore these
real needs because of available time for key maintainers, apathy towards
contributors, mindset avoidance and laziness, and preference for certain
technology stacks. These technical challenges will attract an influx of
open source heroes to push the technology forward, which is good in the
long run.

(5) More to think about

Similar to the guest PMU feature, the debugging feature may face the same
state. For example, what happens when you debug code inside the host and
guest at the same time (host debugs hypevisor/guest code and guest debugs
guest code only) ?

Forgive my ignorance and offence, but we don't want to see a KVM subsystem
controlled and driven by Google's demands.

Please feel free to share comments to move forward.

Thanks,
Like Xu

next prev parent reply	other threads:[~2023-10-08 10:04 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-27  3:31 [Patch v4 00/13] Enable fixed counter 3 and topdown perf metrics for vPMU Dapeng Mi
2023-09-27  3:31 ` [Patch v4 01/13] KVM: x86/pmu: Add Intel CPUID-hinted TopDown slots event Dapeng Mi
2023-09-27  3:31 ` [Patch v4 02/13] KVM: x86/pmu: Support PMU fixed counter 3 Dapeng Mi
2023-09-27  3:31 ` [Patch v4 03/13] perf/core: Add function perf_event_group_leader_check() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 04/13] perf/core: Add function perf_event_move_group() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 05/13] perf/core: Add *group_leader for perf_event_create_group_kernel_counters() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 06/13] perf/x86: Fix typos and inconsistent indents in perf_event header Dapeng Mi
2023-09-27  3:31 ` [Patch v4 07/13] perf/x86: Add constraint for guest perf metrics event Dapeng Mi
2023-09-27 11:33   ` Peter Zijlstra
2023-09-27 17:27     ` Sean Christopherson
2023-09-28  9:24       ` Mi, Dapeng
2023-09-29 11:53       ` Peter Zijlstra
2023-09-29 15:20         ` Ravi Bangoria
2023-10-02 12:29           ` Peter Zijlstra
2023-10-03  6:36             ` Ravi Bangoria
2023-09-29 15:46         ` Sean Christopherson
2023-09-30  3:29           ` Jim Mattson
2023-10-01  0:31             ` Namhyung Kim
2023-10-02 11:57           ` Peter Zijlstra
2023-10-02 13:30             ` Ingo Molnar
2023-10-02 15:23               ` David Dunn
2023-10-02 19:02                 ` Mingwei Zhang
2023-10-02 15:56               ` Sean Christopherson
2023-10-02 19:50                 ` Liang, Kan
2023-10-02 20:52                   ` Peter Zijlstra
2023-10-02 20:40                 ` Peter Zijlstra
2023-10-03  0:56                   ` Sean Christopherson
2023-10-03  8:16                     ` Peter Zijlstra
2023-10-03 15:23                       ` Sean Christopherson
2023-10-03 18:21                         ` Jim Mattson
2023-10-04 11:32                           ` Peter Zijlstra
2023-10-04 11:21                         ` Peter Zijlstra
2023-10-04 19:51                           ` Mingwei Zhang
2023-10-04 21:50                             ` Sean Christopherson
2023-10-04 22:05                               ` Sean Christopherson
2023-10-08 10:04                                 ` Like Xu [this message]
2023-10-09 17:03                                   ` Manali Shukla
2023-10-11 14:15                                     ` Peter Zijlstra
2023-10-17 10:24                                       ` Manali Shukla
2023-10-17 16:58                                         ` Mingwei Zhang
2023-10-18  0:01                                           ` Mingwei Zhang
2023-10-11 14:20                               ` Peter Zijlstra
2023-10-13 17:02                                 ` Sean Christopherson
2023-10-03 17:31                       ` Manali Shukla
2023-10-03 22:02                     ` Mingwei Zhang
2023-10-04 20:43                       ` Sean Christopherson
2023-09-27  3:31 ` [Patch v4 08/13] perf/core: Add new function perf_event_topdown_metrics() Dapeng Mi
2023-09-27  3:31 ` [Patch v4 09/13] perf/x86/intel: Handle KVM virtual metrics event in perf system Dapeng Mi
2023-09-27  3:31 ` [Patch v4 10/13] KVM: x86/pmu: Extend pmc_reprogram_counter() to create group events Dapeng Mi
2023-09-27  3:31 ` [Patch v4 11/13] KVM: x86/pmu: Support topdown perf metrics feature Dapeng Mi
2023-09-27  3:31 ` [Patch v4 12/13] KVM: x86/pmu: Handle PERF_METRICS overflow Dapeng Mi
2023-09-27  3:31 ` [Patch v4 13/13] KVM: x86/pmu: Expose Topdown in MSR_IA32_PERF_CAPABILITIES Dapeng Mi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c69a1eb1-e07a-8270-ca63-54949ded433d@gmail.com \
    --to=like.xu.linux@gmail.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=dapeng1.mi@intel.com \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=daviddunn@google.com \
    --cc=irogers@google.com \
    --cc=jmattson@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=likexu@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@kernel.org \
    --cc=mizhang@google.com \
    --cc=namhyung@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=weijiang.yang@intel.com \
    --cc=xiong.y.zhang@intel.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox