From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Luo Gengkun <luogengkun@huaweicloud.com>, peterz@infradead.org
Cc: mingo@redhat.com, acme@kernel.org, namhyung@kernel.org,
mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
jolsa@kernel.org, irogers@google.com, adrian.hunter@intel.com,
ravi.bangoria@amd.com, linux-perf-users@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] perf/core: Fix warning warning due to unordred pmu_ctx_list
Date: Mon, 20 Jan 2025 15:49:09 -0500
Message-ID: <fac5bb75-8bce-4181-8278-c971d35b8b3a@linux.intel.com>
In-Reply-To: <20250120114344.632474-1-luogengkun@huaweicloud.com>
A redundant "warning" is in the title.
On 2025-01-20 6:43 a.m., Luo Gengkun wrote:
> Syzkaller triggers a warning due to prev_epc->pmu != next_epc->pmu in
> perf_event_swap_task_ctx_data. vmcore shows that two lists have the same
> perf_event_pmu_context, but not in the same order.
>
> The problem is that when inheritance is performed, it traverses the ordered
> groups of events, and inserts the new perf_event_pmu_context into
> child_ctx->pmu_ctx_list which is unordered. So the order of pmu_ctx_list in
> the parent and child may be different.
I think the order of the parent's pmu_ctx_list is determined by the time
at which each event/pmu is added, while the order of the child's list is
determined by the event order in the pinned_groups and flexible_groups.
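
To illustrate the point, here is a minimal user-space sketch, not kernel
code. It assumes head insertion (as list_add() does) and uses a
hypothetical singly linked "struct epc" with integer PMU ids in place of
struct pmu pointers. The order in which the child meets the PMUs (2, 1, 3)
is an assumption chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for perf_event_pmu_context: one node per PMU. */
struct epc {
	int pmu;		/* stand-in for the struct pmu pointer */
	struct epc *next;
};

/* Head insertion, mimicking list_add(&epc->pmu_ctx_entry, &ctx->pmu_ctx_list). */
static void add_head(struct epc **list, struct epc *node)
{
	assert(node != NULL);
	node->next = *list;
	*list = node;
}

/* Walk both lists in lockstep; return the first index where the PMUs differ, or -1. */
static int first_mismatch(const struct epc *a, const struct epc *b)
{
	int i;

	for (i = 0; a && b; a = a->next, b = b->next, i++)
		if (a->pmu != b->pmu)
			return i;
	return -1;
}

/* Build a parent and a child list from the same three PMUs and compare them. */
static int demo_mismatch(void)
{
	struct epc p[3] = { {1, NULL}, {2, NULL}, {3, NULL} };
	struct epc c[3] = { {1, NULL}, {2, NULL}, {3, NULL} };
	struct epc *parent = NULL, *child = NULL;

	/* Parent: PMUs arrive in the order the events were opened (1, 2, 3). */
	add_head(&parent, &p[0]);
	add_head(&parent, &p[1]);
	add_head(&parent, &p[2]);

	/* Child: assumed order in which inheritance meets the PMUs (2, 1, 3). */
	add_head(&child, &c[1]);
	add_head(&child, &c[0]);
	add_head(&child, &c[2]);

	return first_mismatch(parent, child);
}
```

Both lists hold the same three PMUs, yet the lockstep walk finds a
difference, which is the situation the prev_epc->pmu != next_epc->pmu
check reports.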
>
> The following testcase can trigger the above warning:
>
> # perf record -e cycles --call-graph lbr -- taskset -c 3 ./a.out &
> # perf stat -e cpu-clock,cs -p xxx // xxx is the pid of a.out
>
> test.c
>
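> #include <stdio.h>
> #include <sys/types.h>
> #include <unistd.h>
>
> int main(void)
> {
> 	unsigned int count = 0;	/* unsigned, so wraparound is well defined */
> 	pid_t pid;
>
> 	printf("%d running\n", getpid());
> 	sleep(30);		/* leaves time to attach perf stat -p */
> 	printf("running\n");
>
> 	pid = fork();
> 	if (pid == -1) {
> 		printf("fork error\n");
> 		return 1;
> 	}
> 	if (pid == 0) {
> 		/* child: spin forever */
> 		while (1) {
> 			count++;
> 		}
> 	} else {
> 		/* parent: spin forever */
> 		while (1) {
> 			count++;
> 		}
> 	}
> }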
>
> The testcase first opens an LBR event, which allocates task_ctx_data, and
> then opens tracepoint and software events, so the parent ctx ends up with 3
> different perf_event_pmu_contexts. During inheritance, the child ctx inserts
> its perf_event_pmu_contexts in a different order, which triggers the
> warning.
>
> To fix this problem, add pmu_ctx_insertion_sort to make sure the
> pmu_ctx_list is ordered.
>
> Fixes: bd2756811766 ("perf: Rewrite core context handling")
> Signed-off-by: Luo Gengkun <luogengkun@huaweicloud.com>
> ---
> kernel/events/core.c | 22 ++++++++++++++++++++--
> 1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 95b01a51139d..1bdff3ef0ce2 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4953,6 +4953,24 @@ find_get_context(struct task_struct *task, struct perf_event *event)
> return ERR_PTR(err);
> }
>
> +/*
> + * This function ensures that ctx->pmu_ctx_list is ordered, so that no warning
> + * is triggered due to prev_epc->pmu != next_epc->pmu.
> + */
> +static void pmu_ctx_insertion_sort(struct perf_event_pmu_context *new,
> + struct perf_event_context *ctx)
> +{
> + struct perf_event_pmu_context *epc;
> +
> + lockdep_assert_held(&ctx->lock);
> +
> + list_for_each_entry(epc, &ctx->pmu_ctx_list, pmu_ctx_entry) {
> + if (epc->pmu > new->pmu)
> + break;
> + }
> + list_add(&new->pmu_ctx_entry, epc->pmu_ctx_entry.prev);
> +}
> +
> static struct perf_event_pmu_context *
> find_get_pmu_context(struct pmu *pmu, struct perf_event_context *ctx,
> struct perf_event *event)
> @@ -4974,7 +4992,7 @@ find_get_pmu_context(struct pmu *pmu, struct perf_event_context *ctx,
> if (!epc->ctx) {
> atomic_set(&epc->refcount, 1);
> epc->embedded = 1;
> - list_add(&epc->pmu_ctx_entry, &ctx->pmu_ctx_list);
> + pmu_ctx_insertion_sort(epc, ctx);
CPU events and per-task events should use different ctxs.
The warning should only be triggered for per-task events, right?
If so, I don't think a sort is required here.
> epc->ctx = ctx;
> } else {
> WARN_ON_ONCE(epc->ctx != ctx);
> @@ -5021,7 +5039,7 @@ find_get_pmu_context(struct pmu *pmu, struct perf_event_context *ctx,
> printk(KERN_INFO
> "lgk: ctx %p insert pmu ctx %p, pmu is %p!\n", ctx, epc, epc->pmu);
This looks like your debug code. Please send a clean patch.
>
> - list_add(&epc->pmu_ctx_entry, &ctx->pmu_ctx_list);
> + pmu_ctx_insertion_sort(epc, ctx);
I think the pmu_ctx_list has already been traversed right before this to
find a matching pmu, so the traversal in pmu_ctx_insertion_sort() can be
avoided.
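
A rough user-space sketch of that idea, not a kernel patch: one scan both
looks for a matching PMU and remembers where a new entry would go. It
assumes the list is kept in ascending PMU order, and it uses a
hypothetical singly linked "struct epc" with integer PMU ids rather than
the kernel's list_head and struct pmu pointers. The helper name
find_or_insert() is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for perf_event_pmu_context. */
struct epc {
	int pmu;		/* stand-in for the struct pmu pointer */
	struct epc *next;
};

/*
 * Scan a list kept in ascending pmu order exactly once.
 * Returns the existing node if `pmu` is already present; otherwise links
 * `fresh` at the position where the scan stopped and returns it.
 */
static struct epc *find_or_insert(struct epc **list, int pmu, struct epc *fresh)
{
	struct epc **pos;

	assert(fresh != NULL);

	for (pos = list; *pos; pos = &(*pos)->next) {
		if ((*pos)->pmu == pmu)
			return *pos;	/* match found, nothing to insert */
		if ((*pos)->pmu > pmu)
			break;		/* insertion point found */
	}

	/* No match: reuse the position the lookup already reached. */
	fresh->pmu = pmu;
	fresh->next = *pos;
	*pos = fresh;
	return fresh;
}
```

Because the lookup and the insertion share one walk, no second pass over
the list is needed.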
Thanks,
Kan
> epc->ctx = ctx;
>
> found_epc: