From: Sean Christopherson <seanjc@google.com>
To: Xiong Zhang <xiong.y.zhang@linux.intel.com>
Cc: pbonzini@redhat.com, peterz@infradead.org, mizhang@google.com,
kan.liang@intel.com, zhenyuw@linux.intel.com,
dapeng1.mi@linux.intel.com, jmattson@google.com,
kvm@vger.kernel.org, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org, zhiyuan.lv@intel.com,
eranian@google.com, irogers@google.com, samantha.alt@intel.com,
like.xu.linux@gmail.com, chao.gao@intel.com,
Kan Liang <kan.liang@linux.intel.com>
Subject: Re: [RFC PATCH 02/41] perf: Support guest enter/exit interfaces
Date: Thu, 11 Apr 2024 11:06:37 -0700
Message-ID: <ZhgmrczGpccfU-cI@google.com>
In-Reply-To: <20240126085444.324918-3-xiong.y.zhang@linux.intel.com>

On Fri, Jan 26, 2024, Xiong Zhang wrote:
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 683dc086ef10..59471eeec7e4 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -3803,6 +3803,8 @@ static inline void group_update_userpage(struct perf_event *group_event)
> event_update_userpage(event);
> }
>
> +static DEFINE_PER_CPU(bool, __perf_force_exclude_guest);
> +
> static int merge_sched_in(struct perf_event *event, void *data)
> {
> struct perf_event_context *ctx = event->ctx;
> @@ -3814,6 +3816,14 @@ static int merge_sched_in(struct perf_event *event, void *data)
> if (!event_filter_match(event))
> return 0;
>
> + /*
> + * The __perf_force_exclude_guest indicates entering the guest.
> + * No events of the passthrough PMU should be scheduled.
> + */
> + if (__this_cpu_read(__perf_force_exclude_guest) &&
> + has_vpmu_passthrough_cap(event->pmu))

As mentioned in the previous reply, I think perf should WARN and reject any attempt
to trigger a "passthrough" context switch if such a switch isn't supported by
perf, not silently let it go through and then skip things later.

> + return 0;
> +
> if (group_can_go_on(event, *can_add_hw)) {
> if (!group_sched_in(event, ctx))
> list_add_tail(&event->active_list, get_event_list(event));
...
> +/*
> + * When a guest enters, force all active events of the PMU, which supports
> + * the VPMU_PASSTHROUGH feature, to be scheduled out. The events of other
> + * PMUs, such as uncore PMU, should not be impacted. The guest can
> + * temporarily own all counters of the PMU.
> + * During the period, all the creation of the new event of the PMU with
> + * !exclude_guest are error out.
> + */
> +void perf_guest_enter(void)
> +{
> + struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
> +
> + lockdep_assert_irqs_disabled();
> +
> + if (__this_cpu_read(__perf_force_exclude_guest))

This should be a WARN_ON_ONCE, no?

> + return;
> +
> + perf_ctx_lock(cpuctx, cpuctx->task_ctx);
> +
> + perf_force_exclude_guest_enter(&cpuctx->ctx);
> + if (cpuctx->task_ctx)
> + perf_force_exclude_guest_enter(cpuctx->task_ctx);
> +
> + perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
> +
> + __this_cpu_write(__perf_force_exclude_guest, true);
> +}
> +EXPORT_SYMBOL_GPL(perf_guest_enter);
> +
> +static void perf_force_exclude_guest_exit(struct perf_event_context *ctx)
> +{
> + struct perf_event_pmu_context *pmu_ctx;
> + struct pmu *pmu;
> +
> + update_context_time(ctx);
> + list_for_each_entry(pmu_ctx, &ctx->pmu_ctx_list, pmu_ctx_entry) {
> + pmu = pmu_ctx->pmu;
> + if (!has_vpmu_passthrough_cap(pmu))
> + continue;

I don't see how we can sanely support a CPU that doesn't support writable
PERF_GLOBAL_STATUS across all PMUs.

> +
> + perf_pmu_disable(pmu);
> + pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu);
> + pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu);
> + perf_pmu_enable(pmu);
> + }
> +}
> +
> +void perf_guest_exit(void)
> +{
> + struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
> +
> + lockdep_assert_irqs_disabled();
> +
> + if (!__this_cpu_read(__perf_force_exclude_guest))

WARN_ON_ONCE here too?

> + return;
> +
> + __this_cpu_write(__perf_force_exclude_guest, false);
> +
> + perf_ctx_lock(cpuctx, cpuctx->task_ctx);
> +
> + perf_force_exclude_guest_exit(&cpuctx->ctx);
> + if (cpuctx->task_ctx)
> + perf_force_exclude_guest_exit(cpuctx->task_ctx);
> +
> + perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
> +}
> +EXPORT_SYMBOL_GPL(perf_guest_exit);
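
To make the two WARN_ON_ONCE suggestions above concrete, here is a rough user-space sketch, not kernel code. Everything in it is a hypothetical stand-in: `warn_on_once()` only imitates the kernel macro's "report once per call site, always return the condition" behaviour, and a single global flag models the per-CPU `__perf_force_exclude_guest`. The point is only that an unbalanced enter or exit gets reported as a caller bug instead of being silently tolerated.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical single-CPU model of __perf_force_exclude_guest. */
static bool force_exclude_guest;
static int warnings_emitted;

/*
 * Stand-in for WARN_ON_ONCE(): report the first time cond is true at a
 * given call site (tracked through *warned) and always return cond.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		warnings_emitted++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

static void guest_enter_sketch(void)
{
	static bool warned;

	/* Entering twice without an exit is a caller bug, not a normal path. */
	if (warn_on_once(force_exclude_guest, &warned, "nested guest enter"))
		return;

	/* ... host events would be scheduled out here ... */
	force_exclude_guest = true;
}

static void guest_exit_sketch(void)
{
	static bool warned;

	/* Exiting without a matching enter is likewise a caller bug. */
	if (warn_on_once(!force_exclude_guest, &warned, "guest exit without enter"))
		return;

	force_exclude_guest = false;
	/* ... host events would be scheduled back in here ... */
}
```

In this model, repeated bad calls still bail out early every time, but only the first one at each call site produces a report.
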
> +
> +static inline int perf_force_exclude_guest_check(struct perf_event *event,
> + int cpu, struct task_struct *task)
> +{
> + bool *force_exclude_guest = NULL;
> +
> + if (!has_vpmu_passthrough_cap(event->pmu))
> + return 0;
> +
> + if (event->attr.exclude_guest)
> + return 0;
> +
> + if (cpu != -1) {
> + force_exclude_guest = per_cpu_ptr(&__perf_force_exclude_guest, cpu);
> + } else if (task && (task->flags & PF_VCPU)) {
> + /*
> + * Just need to check the running CPU in the event creation. If the
> + * task is moved to another CPU which supports the force_exclude_guest.
> + * The event will filtered out and be moved to the error stage. See
> + * merge_sched_in().
> + */
> + force_exclude_guest = per_cpu_ptr(&__perf_force_exclude_guest, task_cpu(task));
> + }

These checks are extremely racy; I don't see how this can possibly do the
right thing. PF_VCPU isn't a "this is a vCPU task", it's a "this task is about
to do VM-Enter, or just took a VM-Exit" (the "I'm a virtual CPU" comment in
include/linux/sched.h is wildly misleading, as it's _only_ valid when accounting
time slices).

Digging deeper, I think __perf_force_exclude_guest has similar problems, e.g.
perf_event_create_kernel_counter() calls perf_event_alloc() before acquiring the
per-CPU context mutex.

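
The window described above can be walked through with a small user-space model. This is only a sketch: `struct cpu_sketch` and every function name here are hypothetical, and the interleaving is driven by hand rather than by real concurrency. It shows a creation-time check that passes, a guest entry that happens afterwards, and an install that trusts the stale check. The `install_rechecked()` variant re-validates at the point of use and is assumed to run under the lock that also serializes guest entry. It is one possible shape for closing the window, not something proposed in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: one CPU, one flag, one slot for an installed event. */
struct cpu_sketch {
	bool guest_owns_pmu;    /* stands in for __perf_force_exclude_guest */
	bool host_event_active; /* a !exclude_guest event is on the PMU */
};

/* Creation-time check, done without a lock that also covers the install. */
static int create_check(const struct cpu_sketch *cpu)
{
	return cpu->guest_owns_pmu ? -EBUSY : 0;
}

/* Guest entry: host events are scheduled out and the guest takes the PMU. */
static void guest_enter(struct cpu_sketch *cpu)
{
	cpu->host_event_active = false;
	cpu->guest_owns_pmu = true;
}

/* Install that trusts the earlier, possibly stale, check. */
static int install_unchecked(struct cpu_sketch *cpu)
{
	cpu->host_event_active = true;
	return 0;
}

/* Install that re-validates at the point of use (assumed to hold the lock). */
static int install_rechecked(struct cpu_sketch *cpu)
{
	if (cpu->guest_owns_pmu)
		return -EBUSY;
	cpu->host_event_active = true;
	return 0;
}
```

Calling `create_check()`, then `guest_enter()`, then `install_unchecked()` leaves a host event active while the guest owns the PMU. The same sequence ending in `install_rechecked()` fails with -EBUSY instead.
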
> + if (force_exclude_guest && *force_exclude_guest)
> + return -EBUSY;
> + return 0;
> +}
> +
> /*
> * Holding the top-level event's child_mutex means that any
> * descendant process that has inherited this event will block
> @@ -11973,6 +12142,11 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
> goto err_ns;
> }
>
> + if (perf_force_exclude_guest_check(event, cpu, task)) {

This should be:

	err = perf_force_exclude_guest_check(event, cpu, task);
	if (err)
		goto err_pmu;

i.e. shouldn't effectively ignore/override the return result.

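
For illustration, the same pattern as a compilable user-space sketch. `check_sketch()` and `alloc_sketch()` are hypothetical stand-ins for perf_force_exclude_guest_check() and the relevant part of perf_event_alloc(); the only thing being shown is that the caller forwards whatever error code the helper picked rather than hardcoding its own.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for perf_force_exclude_guest_check(). */
static int check_sketch(bool guest_owns_pmu, bool exclude_guest)
{
	if (exclude_guest)
		return 0;
	return guest_owns_pmu ? -EBUSY : 0;
}

/* The caller keeps the helper's error code instead of overriding it. */
static int alloc_sketch(bool guest_owns_pmu, bool exclude_guest)
{
	int err;

	err = check_sketch(guest_owns_pmu, exclude_guest);
	if (err)
		goto err_pmu;

	return 0;

err_pmu:
	/* ... PMU setup would be unwound here ... */
	return err;
}
```

If the helper ever grows a second failure mode with a different error code, this form passes it through with no change in the caller.
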
> + err = -EBUSY;
> + goto err_pmu;
> + }
> +
> /*
> * Disallow uncore-task events. Similarly, disallow uncore-cgroup
> * events (they don't make sense as the cgroup will be different
> --
> 2.34.1
>