public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mingwei Zhang <mizhang@google.com>,
	Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Xiong Zhang <xiong.y.zhang@intel.com>,
	Dapeng Mi <dapeng1.mi@linux.intel.com>,
	Kan Liang <kan.liang@intel.com>,
	Zhenyu Wang <zhenyuw@linux.intel.com>,
	Manali Shukla <manali.shukla@amd.com>,
	Sandipan Das <sandipan.das@amd.com>,
	Jim Mattson <jmattson@google.com>,
	Stephane Eranian <eranian@google.com>,
	Ian Rogers <irogers@google.com>,
	Namhyung Kim <namhyung@kernel.org>,
	gce-passthrou-pmu-dev@google.com,
	Samantha Alt <samantha.alt@intel.com>,
	Zhiyuan Lv <zhiyuan.lv@intel.com>,
	Yanfei Xu <yanfei.xu@intel.com>, maobibo <maobibo@loongson.cn>,
	Like Xu <like.xu.linux@gmail.com>,
	kvm@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v2 07/54] perf: Add generic exclude_guest support
Date: Tue, 11 Jun 2024 14:06:41 +0200	[thread overview]
Message-ID: <20240611120641.GF8774@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <902c40cc-6e0b-4b2f-826c-457f533a0a76@linux.intel.com>

On Mon, Jun 10, 2024 at 01:23:04PM -0400, Liang, Kan wrote:
> On 2024-05-07 4:58 a.m., Peter Zijlstra wrote:

> > Bah, this is a ton of copy-paste from the normal scheduling code with
> > random changes. Why ?
> > 
> > Why can't this use ctx_sched_{in,out}() ? Surely the whole
> > CAP_PASSTHROUGHT thing is but a flag away.
> >
> 
> Not just a flag. The time has to be updated as well, since the ctx->time
> is shared among PMUs. Perf shouldn't stop it while other PMUs is still
> running.

Obviously the original changelog didn't mention any of that.... :/

> A timeguest will be introduced to track the start time of a guest.
> The event->tstamp of an exclude_guest event should always keep
> ctx->timeguest while a guest is running.
> When a guest is leaving, update the event->tstamp to now, so the guest
> time can be deducted.
> 
> The below patch demonstrate how the timeguest works.
> (It's an incomplete patch. Just to show the idea.)
> 
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 22d3e56682c9..2134e6886e22 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -953,6 +953,7 @@ struct perf_event_context {
>  	u64				time;
>  	u64				timestamp;
>  	u64				timeoffset;
> +	u64				timeguest;
> 
>  	/*
>  	 * These fields let us detect when two contexts have both
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 14fd881e3e1d..2aed56671a24 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -690,12 +690,31 @@ __perf_update_times(struct perf_event *event, u64
> now, u64 *enabled, u64 *runnin
>  		*running += delta;
>  }
> 
> +static void perf_event_update_time_guest(struct perf_event *event)
> +{
> +	/*
> +	 * If a guest is running, use the timestamp while entering the guest.
> +	 * If the guest is leaving, reset the event timestamp.
> +	 */
> +	if (!__this_cpu_read(perf_in_guest))
> +		event->tstamp = event->ctx->time;
> +	else
> +		event->tstamp = event->ctx->timeguest;
> +}

This conditional seems inverted, without a good reason. Also, in another
thread you talk about some PMUs stopping time in a guest, while other
PMUs would keep ticking. I don't think the above captures that.

>  static void perf_event_update_time(struct perf_event *event)
>  {
> -	u64 now = perf_event_time(event);
> +	u64 now;
> +
> +	/* Never count the time of an active guest into an exclude_guest event. */
> +	if (event->ctx->timeguest &&
> +	    event->pmu->capabilities & PERF_PMU_CAP_PASSTHROUGH_VPMU)
> +		return perf_event_update_time_guest(event);

Urgh, weird split. The PMU check is here. Please just inline the above
here, this seems to be the sole caller anyway.

> 
> +	now = perf_event_time(event);
>  	__perf_update_times(event, now, &event->total_time_enabled,
>  					&event->total_time_running);
> +
>  	event->tstamp = now;
>  }
> 
> @@ -3398,7 +3417,14 @@ ctx_sched_out(struct perf_event_context *ctx,
> enum event_type_t event_type)
>  			cpuctx->task_ctx = NULL;
>  	}
> 
> -	is_active ^= ctx->is_active; /* changed bits */
> +	if (event_type & EVENT_GUEST) {

Patch doesn't introduce EVENT_GUEST, lost a hunk somewhere?

> +		/*
> +		 * Schedule out all !exclude_guest events of PMU
> +		 * with PERF_PMU_CAP_PASSTHROUGH_VPMU.
> +		 */
> +		is_active = EVENT_ALL;
> +	} else
> +		is_active ^= ctx->is_active; /* changed bits */
> 
>  	list_for_each_entry(pmu_ctx, &ctx->pmu_ctx_list, pmu_ctx_entry) {
>  		if (perf_skip_pmu_ctx(pmu_ctx, event_type))

> @@ -5894,14 +5933,18 @@ void perf_guest_enter(u32 guest_lvtpc)
>  	}
> 
>  	perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
> -	ctx_sched_out(&cpuctx->ctx, EVENT_ALL | EVENT_GUEST);
> +	ctx_sched_out(&cpuctx->ctx, EVENT_GUEST);
> +	/* Set the guest start time */
> +	cpuctx->ctx.timeguest = cpuctx->ctx.time;
>  	perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);
>  	if (cpuctx->task_ctx) {
>  		perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);
> -		task_ctx_sched_out(cpuctx->task_ctx, EVENT_ALL | EVENT_GUEST);
> +		task_ctx_sched_out(cpuctx->task_ctx, EVENT_GUEST);
> +		cpuctx->task_ctx->timeguest = cpuctx->task_ctx->time;
>  		perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
>  	}
> 
>  	__this_cpu_write(perf_in_guest, true);
> @@ -5925,14 +5968,17 @@ void perf_guest_exit(void)
> 
>  	__this_cpu_write(perf_in_guest, false);
> 
>  	perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
> -	ctx_sched_in(&cpuctx->ctx, EVENT_ALL | EVENT_GUEST);
> +	ctx_sched_in(&cpuctx->ctx, EVENT_GUEST);
> +	cpuctx->ctx.timeguest = 0;
>  	perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);
>  	if (cpuctx->task_ctx) {
>  		perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);
> -		ctx_sched_in(cpuctx->task_ctx, EVENT_ALL | EVENT_GUEST);
> +		ctx_sched_in(cpuctx->task_ctx, EVENT_GUEST);
> +		cpuctx->task_ctx->timeguest = 0;
>  		perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
>  	}

I'm thinking EVENT_GUEST should cause the ->timeguest updates, no point
in having them explicitly duplicated here, hmm?

  reply	other threads:[~2024-06-11 12:06 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-06  5:29 [PATCH v2 00/54] Mediated Passthrough vPMU 2.0 for x86 Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 01/54] KVM: x86/pmu: Set enable bits for GP counters in PERF_GLOBAL_CTRL at "RESET" Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 02/54] KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 03/54] KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 04/54] x86/msr: Define PerfCntrGlobalStatusSet register Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 05/54] x86/msr: Introduce MSR_CORE_PERF_GLOBAL_STATUS_SET Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 06/54] perf: Support get/put passthrough PMU interfaces Mingwei Zhang
2024-05-07  8:31   ` Peter Zijlstra
2024-05-08  4:13     ` Zhang, Xiong Y
2024-05-07  8:41   ` Peter Zijlstra
2024-05-08  4:54     ` Zhang, Xiong Y
2024-05-08  8:32       ` Peter Zijlstra
2024-05-06  5:29 ` [PATCH v2 07/54] perf: Add generic exclude_guest support Mingwei Zhang
2024-05-07  8:58   ` Peter Zijlstra
2024-06-10 17:23     ` Liang, Kan
2024-06-11 12:06       ` Peter Zijlstra [this message]
2024-06-11 13:27         ` Liang, Kan
2024-06-12 11:17           ` Peter Zijlstra
2024-06-12 13:38             ` Liang, Kan
2024-06-13  9:15               ` Peter Zijlstra
2024-06-13 13:37                 ` Liang, Kan
2024-06-13 18:04                   ` Liang, Kan
2024-06-17  7:51                     ` Peter Zijlstra
2024-06-17 13:34                       ` Liang, Kan
2024-06-17 15:00                         ` Peter Zijlstra
2024-06-17 15:45                           ` Liang, Kan
2024-05-06  5:29 ` [PATCH v2 08/54] perf/x86/intel: Support PERF_PMU_CAP_PASSTHROUGH_VPMU Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 09/54] perf: core/x86: Register a new vector for KVM GUEST PMI Mingwei Zhang
2024-05-07  9:12   ` Peter Zijlstra
2024-05-08 10:06     ` Yanfei Xu
2024-05-06  5:29 ` [PATCH v2 10/54] KVM: x86: Extract x86_set_kvm_irq_handler() function Mingwei Zhang
2024-05-07  9:18   ` Peter Zijlstra
2024-05-08  8:57     ` Zhang, Xiong Y
2024-05-06  5:29 ` [PATCH v2 11/54] KVM: x86/pmu: Register guest pmi handler for emulated PMU Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 12/54] perf: x86: Add x86 function to switch PMI handler Mingwei Zhang
2024-05-07  9:22   ` Peter Zijlstra
2024-05-08  6:58     ` Zhang, Xiong Y
2024-05-08  8:37       ` Peter Zijlstra
2024-05-09  7:30         ` Zhang, Xiong Y
2024-05-07 21:40   ` Chen, Zide
2024-05-08  3:44     ` Mi, Dapeng
2024-05-30  5:12     ` Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 13/54] perf: core/x86: Forbid PMI handler when guest own PMU Mingwei Zhang
2024-05-07  9:33   ` Peter Zijlstra
2024-05-09  7:39     ` Zhang, Xiong Y
2024-05-06  5:29 ` [PATCH v2 14/54] perf: core/x86: Plumb passthrough PMU capability from x86_pmu to x86_pmu_cap Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 15/54] KVM: x86/pmu: Introduce enable_passthrough_pmu module parameter Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 16/54] KVM: x86/pmu: Plumb through pass-through PMU to vcpu for Intel CPUs Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 17/54] KVM: x86/pmu: Always set global enable bits in passthrough mode Mingwei Zhang
2024-05-08  4:18   ` Mi, Dapeng
2024-05-08  4:36     ` Mingwei Zhang
2024-05-08  6:27       ` Mi, Dapeng
2024-05-08 14:13         ` Sean Christopherson
2024-05-09  0:13           ` Mingwei Zhang
2024-05-09  0:30             ` Mi, Dapeng
2024-05-09  0:38           ` Mi, Dapeng
2024-05-06  5:29 ` [PATCH v2 18/54] KVM: x86/pmu: Add a helper to check if passthrough PMU is enabled Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 19/54] KVM: x86/pmu: Add host_perf_cap and initialize it in kvm_x86_vendor_init() Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 20/54] KVM: x86/pmu: Allow RDPMC pass through when all counters exposed to guest Mingwei Zhang
2024-05-08 21:55   ` Chen, Zide
2024-05-30  5:20     ` Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 21/54] KVM: x86/pmu: Introduce macro PMU_CAP_PERF_METRICS Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 22/54] KVM: x86/pmu: Introduce PMU operator to check if rdpmc passthrough allowed Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 23/54] KVM: x86/pmu: Manage MSR interception for IA32_PERF_GLOBAL_CTRL Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 24/54] KVM: x86/pmu: Create a function prototype to disable MSR interception Mingwei Zhang
2024-05-08 22:03   ` Chen, Zide
2024-05-30  5:24     ` Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 25/54] KVM: x86/pmu: Add intel_passthrough_pmu_msrs() to pass-through PMU MSRs Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 26/54] KVM: x86/pmu: Avoid legacy vPMU code when accessing global_ctrl in passthrough vPMU Mingwei Zhang
2024-05-08 21:48   ` Chen, Zide
2024-05-09  0:43     ` Mi, Dapeng
2024-05-09  1:29       ` Chen, Zide
2024-05-09  2:58         ` Mi, Dapeng
2024-05-30  5:28           ` Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 27/54] KVM: x86/pmu: Exclude PMU MSRs in vmx_get_passthrough_msr_slot() Mingwei Zhang
2024-05-14  7:33   ` Mi, Dapeng
2024-05-06  5:29 ` [PATCH v2 28/54] KVM: x86/pmu: Add counter MSR and selector MSR index into struct kvm_pmc Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 29/54] KVM: x86/pmu: Introduce PMU operation prototypes for save/restore PMU context Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 30/54] KVM: x86/pmu: Implement the save/restore of PMU state for Intel CPU Mingwei Zhang
2024-05-14  8:08   ` Mi, Dapeng
2024-05-30  5:34     ` Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 31/54] KVM: x86/pmu: Make check_pmu_event_filter() an exported function Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 32/54] KVM: x86/pmu: Allow writing to event selector for GP counters if event is allowed Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 33/54] KVM: x86/pmu: Allow writing to fixed counter selector if counter is exposed Mingwei Zhang
2024-05-06  5:29 ` [PATCH v2 34/54] KVM: x86/pmu: Switch IA32_PERF_GLOBAL_CTRL at VM boundary Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 35/54] KVM: x86/pmu: Exclude existing vLBR logic from the passthrough PMU Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 36/54] KVM: x86/pmu: Switch PMI handler at KVM context switch boundary Mingwei Zhang
2024-07-10  8:37   ` Sandipan Das
2024-07-10 10:01     ` Zhang, Xiong Y
2024-07-10 12:30       ` Sandipan Das
2024-05-06  5:30 ` [PATCH v2 37/54] KVM: x86/pmu: Grab x86 core PMU for passthrough PMU VM Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 38/54] KVM: x86/pmu: Call perf_guest_enter() at PMU context switch Mingwei Zhang
2024-05-07  9:39   ` Peter Zijlstra
2024-05-08  4:22     ` Mi, Dapeng
2024-05-30  4:34     ` Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 39/54] KVM: x86/pmu: Add support for PMU context switch at VM-exit/enter Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 40/54] KVM: x86/pmu: Introduce PMU operator to increment counter Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 41/54] KVM: x86/pmu: Introduce PMU operator for setting counter overflow Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 42/54] KVM: x86/pmu: Implement emulated counter increment for passthrough PMU Mingwei Zhang
2024-05-08 18:28   ` Chen, Zide
2024-05-09  1:11     ` Mi, Dapeng
2024-05-30  4:20       ` Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 43/54] KVM: x86/pmu: Update pmc_{read,write}_counter() to disconnect perf API Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 44/54] KVM: x86/pmu: Disconnect counter reprogram logic from passthrough PMU Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 45/54] KVM: nVMX: Add nested virtualization support for " Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 46/54] perf/x86/amd/core: Set passthrough capability for host Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 47/54] KVM: x86/pmu/svm: Set passthrough capability for vcpus Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 48/54] KVM: x86/pmu/svm: Set enable_passthrough_pmu module parameter Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 49/54] KVM: x86/pmu/svm: Allow RDPMC pass through when all counters exposed to guest Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 50/54] KVM: x86/pmu/svm: Implement callback to disable MSR interception Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 51/54] KVM: x86/pmu/svm: Set GuestOnly bit and clear HostOnly bit when guest write to event selectors Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 52/54] KVM: x86/pmu/svm: Add registers to direct access list Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 53/54] KVM: x86/pmu/svm: Implement handlers to save and restore context Mingwei Zhang
2024-05-06  5:30 ` [PATCH v2 54/54] KVM: x86/pmu/svm: Wire up PMU filtering functionality for passthrough PMU Mingwei Zhang
2024-05-28  2:35 ` [PATCH v2 00/54] Mediated Passthrough vPMU 2.0 for x86 Ma, Yongwei
2024-05-30  4:28   ` Mingwei Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240611120641.GF8774@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=dapeng1.mi@linux.intel.com \
    --cc=eranian@google.com \
    --cc=gce-passthrou-pmu-dev@google.com \
    --cc=irogers@google.com \
    --cc=jmattson@google.com \
    --cc=kan.liang@intel.com \
    --cc=kan.liang@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=like.xu.linux@gmail.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=manali.shukla@amd.com \
    --cc=maobibo@loongson.cn \
    --cc=mizhang@google.com \
    --cc=namhyung@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=samantha.alt@intel.com \
    --cc=sandipan.das@amd.com \
    --cc=seanjc@google.com \
    --cc=xiong.y.zhang@intel.com \
    --cc=yanfei.xu@intel.com \
    --cc=zhenyuw@linux.intel.com \
    --cc=zhiyuan.lv@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox