public inbox for linux-kernel@vger.kernel.org
From: Peter Zijlstra <peterz@infradead.org>
To: Yang Jihong <yangjihong1@huawei.com>
Cc: rostedt@goodmis.org, mingo@redhat.com, acme@kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@kernel.org, namhyung@kernel.org,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: Re: [PATCH v3] perf/core: Fix reentry problem in perf_output_read_group
Date: Tue, 16 Aug 2022 16:13:32 +0200
Message-ID: <YvumDL1qz1NjpfEC@worktop.programming.kicks-ass.net>
In-Reply-To: <20220816091103.257702-1-yangjihong1@huawei.com>

On Tue, Aug 16, 2022 at 05:11:03PM +0800, Yang Jihong wrote:
> perf_output_read_group may respond to IPI request of other cores and invoke
> __perf_install_in_context function. As a result, the hwc configuration is
> modified, causing inconsistency and unexpected consequences.

>  read_pmevcntrn+0x1e4/0x1ec arch/arm64/kernel/perf_event.c:423
>  armv8pmu_read_evcntr arch/arm64/kernel/perf_event.c:467 [inline]
>  armv8pmu_read_hw_counter arch/arm64/kernel/perf_event.c:475 [inline]
>  armv8pmu_read_counter+0x10c/0x1f0 arch/arm64/kernel/perf_event.c:528
>  armpmu_event_update+0x9c/0x1bc drivers/perf/arm_pmu.c:247
>  armpmu_read+0x24/0x30 drivers/perf/arm_pmu.c:264
>  perf_output_read_group+0x4cc/0x71c kernel/events/core.c:6806
>  perf_output_read+0x78/0x1c4 kernel/events/core.c:6845
>  perf_output_sample+0xafc/0x1000 kernel/events/core.c:6892
>  __perf_event_output kernel/events/core.c:7273 [inline]
>  perf_event_output_forward+0xd8/0x130 kernel/events/core.c:7287
>  __perf_event_overflow+0xbc/0x20c kernel/events/core.c:8943
>  perf_swevent_overflow kernel/events/core.c:9019 [inline]
>  perf_swevent_event+0x274/0x2c0 kernel/events/core.c:9047
>  do_perf_sw_event kernel/events/core.c:9160 [inline]
>  ___perf_sw_event+0x150/0x1b4 kernel/events/core.c:9191
>  __perf_sw_event+0x58/0x7c kernel/events/core.c:9203
>  perf_sw_event include/linux/perf_event.h:1177 [inline]

> Interrupts is not disabled when perf_output_read_group reads PMU counter.

s/is/are/ due to 'interrupts' being plural

Anyway, yes, I suppose this is indeed so. That code expects to run with
IRQs disabled but in the case of software events that isn't so.

> In this case, IPI request may be received from other cores.
> As a result, PMU configuration is modified and an error occurs when
> reading PMU counter:
> 
>                    CPU0                                         CPU1
>                                                     __se_sys_perf_event_open
>                                                       perf_install_in_context
> perf_output_read_group                                  smp_call_function_single
>   for_each_sibling_event(sub, leader) {                   generic_exec_single
>     if ((sub != event) &&                                   remote_function
>         (sub->state == PERF_EVENT_STATE_ACTIVE))                    |
> <enter IPI handler: __perf_install_in_context>   <----RAISE IPI-----+
> __perf_install_in_context
>   ctx_resched
>     event_sched_out
>       armpmu_del
>         ...
>         hwc->idx = -1; // event->hw.idx is set to -1
> ...
> <exit IPI>
>             sub->pmu->read(sub);
>               armpmu_read
>                 armv8pmu_read_counter
>                   armv8pmu_read_hw_counter
>                     int idx = event->hw.idx; // idx = -1
>                     u64 val = armv8pmu_read_evcntr(idx);
>                       u32 counter = ARMV8_IDX_TO_COUNTER(idx); // invalid counter = 30
>                       read_pmevcntrn(counter) // undefined instruction
> 
> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> ---
>  kernel/events/core.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 4e718b93442b..776fe24adcbd 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6895,6 +6895,13 @@ static void perf_output_read_group(struct perf_output_handle *handle,
>  	u64 read_format = event->attr.read_format;
>  	u64 values[6];
>  	int n = 0;
> +	unsigned long flags;
> +
> +	/*
> +	 * Disabling interrupts avoids all counter scheduling
> +	 * (context switches, timer based rotation and IPIs).
> +	 */
> +	local_irq_save(flags);
>  
>  	values[n++] = 1 + leader->nr_siblings;
>  
> @@ -6931,6 +6938,8 @@ static void perf_output_read_group(struct perf_output_handle *handle,
>  
>  		__output_copy(handle, values, n * sizeof(u64));
>  	}
> +
> +	local_irq_restore(flags);
>  }
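
For reference, the index arithmetic at the end of the race diagram above can
be checked in user space. This is only a minimal sketch: it assumes
ARMV8_IDX_COUNTER0 is 1 and ARMV8_PMU_COUNTER_MASK is 0x1f, as I believe the
arm64 PMU driver defined them at the time, and idx_to_counter() is a
hypothetical helper that does not exist in the kernel.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/* Assumed values, modelled on the arm64 PMUv3 driver of that era. */
#define ARMV8_IDX_COUNTER0	1
#define ARMV8_PMU_COUNTER_MASK	0x1f
#define ARMV8_IDX_TO_COUNTER(x)	\
	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_PMU_COUNTER_MASK)

/* Hypothetical helper: map a hwc index to an event counter number. */
static uint32_t idx_to_counter(int idx)
{
	return (uint32_t)ARMV8_IDX_TO_COUNTER(idx);
}

/* A stale index of -1 gives (-1 - 1) & 0x1f, i.e. counter 30. */
static_assert(ARMV8_IDX_TO_COUNTER(-1) == 30,
	      "stale idx -1 maps to counter 30");
```

Under those assumptions a valid index of 1 maps to counter 0, while the stale
-1 left behind by armpmu_del() maps to counter 30, matching the value in the
report.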

Specifically I think it is for_each_sibling_event() itself that requires
the context to be stable. Perhaps we should add an assertion there as
well.

Something like so on top, I suppose.. Does that yield more problem
sites?

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index ee8b9ecdc03b..d4d53b9ba71e 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -631,7 +631,12 @@ struct pmu_event_list {
 	struct list_head	list;
 };
 
+/*
+ * Iterating the sibling list requires this list to be stable; by ensuring IRQs
+ * are disabled IPIs from perf_{install_in,remove_from}_context() are held off.
+ */
 #define for_each_sibling_event(sibling, event)			\
+	lockdep_assert_irqs_disabled();				\
 	if ((event)->group_leader == (event))			\
 		list_for_each_entry((sibling), &(event)->sibling_list, sibling_list)
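
To illustrate the shape of the macro above outside the kernel, here is a
user-space model. It is a sketch under loud assumptions: every model_* name is
hypothetical, the IRQ state is a plain flag rather than anything lockdep
tracks, the sibling list is a simplified singly linked list, and a failed
check bumps a counter instead of issuing a warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the IRQ state and for lockdep's warning. */
static bool model_irqs_disabled;
static int model_violations;

/* Model of lockdep_assert_irqs_disabled(): count instead of warn. */
#define model_assert_irqs_disabled()				\
	do {							\
		if (!model_irqs_disabled)			\
			model_violations++;			\
	} while (0)

struct model_event {
	struct model_event *group_leader;
	struct model_event *next_sibling;	/* simplified sibling list */
	long count;
};

/* Same shape as the proposal: assertion statement, then if, then loop. */
#define model_for_each_sibling(sib, ev)				\
	model_assert_irqs_disabled();				\
	if ((ev)->group_leader == (ev))				\
		for ((sib) = (ev)->next_sibling; (sib);		\
		     (sib) = (sib)->next_sibling)

/* Sum the leader and all siblings, as a group read would. */
static long model_sum_group(struct model_event *leader)
{
	struct model_event *sub;
	long sum = leader->count;

	model_for_each_sibling(sub, leader)
		sum += sub->count;

	return sum;
}

/* Build a leader with two siblings, read it once, report violations. */
static int model_read_group(bool irqs_off, long *sum)
{
	struct model_event s2 = { .count = 3 };
	struct model_event s1 = { .next_sibling = &s2, .count = 2 };
	struct model_event leader = { .next_sibling = &s1, .count = 1 };

	assert(sum != NULL);
	leader.group_leader = s1.group_leader = s2.group_leader = &leader;
	model_violations = 0;
	model_irqs_disabled = irqs_off;
	*sum = model_sum_group(&leader);

	return model_violations;
}
```

In this model, model_read_group(false, &sum) reports one violation and
model_read_group(true, &sum) reports none; the sum is the same either way,
since the check only flags the caller. Note that a macro of this shape expands
to two statements, so placing it directly under an unbraced if would guard
only the assertion.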
 


