linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling.
@ 2025-08-14 11:06 Chengdong Li
  2025-08-19 10:26 ` Mark Rutland
  2025-08-20  7:17 ` kernel test robot
  0 siblings, 2 replies; 5+ messages in thread
From: Chengdong Li @ 2025-08-14 11:06 UTC (permalink / raw)
  To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
	jolsa, irogers, adrian.hunter, kan.liang
  Cc: linux-perf-users, linux-kernel, Chengdong Li

It's not allowed to mmap() of inherited per-task counters with CPU ==
-1, this would create a performance issue. But it is not friendly to
developers as current implementation postponed the arguments checking to
perf_mmap(), developer can get an -EINVAL from mmap() but without
any previous error returned from perf_event_open().

This patch improves it by moving the arguments checking from perf_mmap()
to perf_event_open().

Signed-off-by: Chengdong Li <chengdongli@optimatist.com>
---
 kernel/events/core.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 8060c2857bb2..f102adb395ec 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6941,14 +6941,6 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
 	int ret, flags = 0;
 	mapped_f mapped;
 
-	/*
-	 * Don't allow mmap() of inherited per-task counters. This would
-	 * create a performance issue due to all children writing to the
-	 * same rb.
-	 */
-	if (event->cpu == -1 && event->attr.inherit)
-		return -EINVAL;
-
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
@@ -13392,6 +13384,18 @@ SYSCALL_DEFINE5(perf_event_open,
 			return -EACCES;
 	}
 
+	/*
+	 * Don't allow perf_event_open() of inherited per-task counters
+	 * with cpu == 1 when sampling. Otherwise, this would create a
+	 * performance issue due to all children writing to the same mmap()
+	 * created ring buffer.
+	 *
+	 * We recommend to call perf_event_open() for all cpus when sampling on
+	 * inherited per-task counters.
+	 */
+	if (attr.sample_freq && attr.inherit && cpu == -1)
+		return -EINVAL;
+
 	if (attr.freq) {
 		if (attr.sample_freq > sysctl_perf_event_sample_rate)
 			return -EINVAL;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling.
  2025-08-14 11:06 [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling Chengdong Li
@ 2025-08-19 10:26 ` Mark Rutland
  2025-08-19 13:31   ` Chengdong Li(李成栋)
  2025-08-20  7:17 ` kernel test robot
  1 sibling, 1 reply; 5+ messages in thread
From: Mark Rutland @ 2025-08-19 10:26 UTC (permalink / raw)
  To: Chengdong Li
  Cc: peterz, mingo, acme, namhyung, alexander.shishkin, jolsa, irogers,
	adrian.hunter, kan.liang, linux-perf-users, linux-kernel

On Thu, Aug 14, 2025 at 07:06:25PM +0800, Chengdong Li wrote:
> It's not allowed to mmap() of inherited per-task counters with CPU ==
> -1, this would create a performance issue. But it is not friendly to
> developers as current implementation postponed the arguments checking to
> perf_mmap(), developer can get an -EINVAL from mmap() but without
> any previous error returned from perf_event_open().
> 
> This patch improves it by moving the arguments checking from perf_mmap()
> to perf_event_open().

Why is that an improvement?

IIUC before this patch, it would be possible to read() the event,
whereas now the event cannot be opened at all.

AFAICT this is removing functionality people could legitimately use, so
this doesn't seem like an improvement.

Mark.

> 
> Signed-off-by: Chengdong Li <chengdongli@optimatist.com>
> ---
>  kernel/events/core.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8060c2857bb2..f102adb395ec 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -6941,14 +6941,6 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
>  	int ret, flags = 0;
>  	mapped_f mapped;
>  
> -	/*
> -	 * Don't allow mmap() of inherited per-task counters. This would
> -	 * create a performance issue due to all children writing to the
> -	 * same rb.
> -	 */
> -	if (event->cpu == -1 && event->attr.inherit)
> -		return -EINVAL;
> -
>  	if (!(vma->vm_flags & VM_SHARED))
>  		return -EINVAL;
>  
> @@ -13392,6 +13384,18 @@ SYSCALL_DEFINE5(perf_event_open,
>  			return -EACCES;
>  	}
>  
> +	/*
> +	 * Don't allow perf_event_open() of inherited per-task counters
> +	 * with cpu == 1 when sampling. Otherwise, this would create a
> +	 * performance issue due to all children writing to the same mmap()
> +	 * created ring buffer.
> +	 *
> +	 * We recommend to call perf_event_open() for all cpus when sampling on
> +	 * inherited per-task counters.
> +	 */
> +	if (attr.sample_freq && attr.inherit && cpu == -1)
> +		return -EINVAL;
> +
>  	if (attr.freq) {
>  		if (attr.sample_freq > sysctl_perf_event_sample_rate)
>  			return -EINVAL;
> -- 
> 2.43.0
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling.
  2025-08-19 10:26 ` Mark Rutland
@ 2025-08-19 13:31   ` Chengdong Li(李成栋)
  2025-08-27 10:44     ` Mark Rutland
  0 siblings, 1 reply; 5+ messages in thread
From: Chengdong Li(李成栋) @ 2025-08-19 13:31 UTC (permalink / raw)
  To: Mark Rutland
  Cc: peterz, mingo, acme, namhyung, alexander.shishkin, jolsa, irogers,
	adrian.hunter, kan.liang, linux-perf-users, linux-kernel

Hi Mark,

On 2025/8/19, 18:26, "Mark Rutland" <mark.rutland@arm.com <mailto:mark.rutland@arm.com>> wrote:


>On Thu, Aug 14, 2025 at 07:06:25PM +0800, Chengdong Li wrote:
>> It's not allowed to mmap() of inherited per-task counters with CPU ==
>> -1, this would create a performance issue. But it is not friendly to
>> developers as current implementation postponed the arguments checking to
>> perf_mmap(), developer can get an -EINVAL from mmap() but without
>> any previous error returned from perf_event_open().
>>
>> This patch improves it by moving the arguments checking from perf_mmap()
>> to perf_event_open().
>
>
>Why is that an improvement?
>
>
>IIUC before this patch, it would be possible to read() the event,
>whereas now the event cannot be opened at all.

That's true, could you provide a use case that using sampling mode but without
ring buffer? From my best knowledge, I think counting mode is more suitable
for read() only.

>
>
>AFAICT this is removing functionality people could legitimately use, so
>this doesn't seem like an improvement.

The problem is that using inherit per-task counter with sampling mode would lead
developer hard to debug why mmap() returns -EINVAL. There is not error returned from
perf_event_open(), everything done seems right but failed at mmap(), and there is
no clue in man open_event_open() as well.

Thanks,
Chengdong

>
>
>Mark.
>
>
>>
>> Signed-off-by: Chengdong Li <chengdongli@optimatist.com <mailto:chengdongli@optimatist.com>>
>> ---
>> kernel/events/core.c | 20 ++++++++++++--------
>> 1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 8060c2857bb2..f102adb395ec 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -6941,14 +6941,6 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)
>> int ret, flags = 0;
>> mapped_f mapped;
>>
>> - /*
>> - * Don't allow mmap() of inherited per-task counters. This would
>> - * create a performance issue due to all children writing to the
>> - * same rb.
>> - */
>> - if (event->cpu == -1 && event->attr.inherit)
>> - return -EINVAL;
>> -
>> if (!(vma->vm_flags & VM_SHARED))
>> return -EINVAL;
>>
>> @@ -13392,6 +13384,18 @@ SYSCALL_DEFINE5(perf_event_open,
>> return -EACCES;
>> }
>>
>> + /*
>> + * Don't allow perf_event_open() of inherited per-task counters
>> + * with cpu == 1 when sampling. Otherwise, this would create a
>> + * performance issue due to all children writing to the same mmap()
>> + * created ring buffer.
>> + *
>> + * We recommend to call perf_event_open() for all cpus when sampling on
>> + * inherited per-task counters.
>> + */
>> + if (attr.sample_freq && attr.inherit && cpu == -1)
>> + return -EINVAL;
>> +
>> if (attr.freq) {
>> if (attr.sample_freq > sysctl_perf_event_sample_rate)
>> return -EINVAL;
>> --
>> 2.43.0
>>



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling.
  2025-08-14 11:06 [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling Chengdong Li
  2025-08-19 10:26 ` Mark Rutland
@ 2025-08-20  7:17 ` kernel test robot
  1 sibling, 0 replies; 5+ messages in thread
From: kernel test robot @ 2025-08-20  7:17 UTC (permalink / raw)
  To: Chengdong Li
  Cc: oe-lkp, lkp, linux-perf-users, linux-kernel, peterz, mingo, acme,
	namhyung, mark.rutland, alexander.shishkin, jolsa, irogers,
	adrian.hunter, kan.liang, Chengdong Li, oliver.sang



Hello,

kernel test robot noticed "perf-sanity-tests.Sigtrap.fail" on:

commit: d692062da890da0c9f0c6bb2d7818c635eecd2d7 ("[PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling.")
url: https://github.com/intel-lab-lkp/linux/commits/Chengdong-Li/perf-core-Improve-arguments-checking-of-inherited-per-task-counters-when-sampling/20250814-190823
base: https://git.kernel.org/cgit/linux/kernel/git/perf/perf-tools-next.git perf-tools-next
patch link: https://lore.kernel.org/all/20250814110625.84622-1-chengdongli@optimatist.com/
patch subject: [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling.

in testcase: perf-sanity-tests
version: 
with following parameters:

	perf_compiler: gcc
	group: group-01



config: x86_64-rhel-9.4-bpf
compiler: gcc-12
test machine: 20 threads 1 sockets (Commet Lake) with 16G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202508200832.1aec09e3-lkp@intel.com


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250820/202508200832.1aec09e3-lkp@intel.com


2025-08-18 01:23:06 sudo /usr/src/linux-perf-x86_64-rhel-9.4-bpf-d692062da890da0c9f0c6bb2d7818c635eecd2d7/tools/perf/perf test 65 -v
 65: Sigtrap                                                         : Running (1 active)
--- start ---
test child forked, pid 7414
FAILED sys_perf_event_open(): Invalid argument
---- end(-1) ----
 65: Sigtrap                                                         : FAILED!

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling.
  2025-08-19 13:31   ` Chengdong Li(李成栋)
@ 2025-08-27 10:44     ` Mark Rutland
  0 siblings, 0 replies; 5+ messages in thread
From: Mark Rutland @ 2025-08-27 10:44 UTC (permalink / raw)
  To: Chengdong Li(李成栋)
  Cc: peterz, mingo, acme, namhyung, alexander.shishkin, jolsa, irogers,
	adrian.hunter, kan.liang, linux-perf-users, linux-kernel

On Tue, Aug 19, 2025 at 09:31:46PM +0800, Chengdong Li(李成栋) wrote:
> Hi Mark,
> 
> On 2025/8/19, 18:26, "Mark Rutland" <mark.rutland@arm.com <mailto:mark.rutland@arm.com>> wrote:
> 
> 
> >On Thu, Aug 14, 2025 at 07:06:25PM +0800, Chengdong Li wrote:
> >> It's not allowed to mmap() of inherited per-task counters with CPU ==
> >> -1, this would create a performance issue. But it is not friendly to
> >> developers as current implementation postponed the arguments checking to
> >> perf_mmap(), developer can get an -EINVAL from mmap() but without
> >> any previous error returned from perf_event_open().
> >>
> >> This patch improves it by moving the arguments checking from perf_mmap()
> >> to perf_event_open().
> >
> >Why is that an improvement?
> >
> >IIUC before this patch, it would be possible to read() the event,
> >whereas now the event cannot be opened at all.
> 
> That's true, could you provide a use case that using sampling mode but without
> ring buffer? From my best knowledge, I think counting mode is more suitable
> for read() only.

I could be mistaken, but IIRC the RR folk were using sampling mode to
get notifications upon each sample/overflow, but they don't really care
about the sample itself. They might not care about inherited counters
for that workload, but I'm fairly sure that sampling without mmap is a
real (albeit unusual) use-case.

> >AFAICT this is removing functionality people could legitimately use, so
> >this doesn't seem like an improvement.
> 
> The problem is that using inherit per-task counter with sampling mode would lead
> developer hard to debug why mmap() returns -EINVAL. There is not error returned from
> perf_event_open(), everything done seems right but failed at mmap(), and there is
> no clue in man open_event_open() as well.

Ok, but presumably they do that mmap() shortly after doing the relavant
perf_event_open() calls?

When do people hit this in practice? Do they actually see this when
deploying a workload, or just when developing code that uses sampling?

Mark.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2025-08-27 10:45 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-14 11:06 [PATCH] perf/core: Improve arguments checking of inherited per-task counters when sampling Chengdong Li
2025-08-19 10:26 ` Mark Rutland
2025-08-19 13:31   ` Chengdong Li(李成栋)
2025-08-27 10:44     ` Mark Rutland
2025-08-20  7:17 ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).