linux-perf-users.vger.kernel.org archive mirror
* [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
@ 2024-11-30  7:03 kernel test robot
  2024-12-02 20:32 ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: kernel test robot @ 2024-11-30  7:03 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: oe-lkp, lkp, linux-kernel, James Clark, Ravi Bangoria, Kan Liang,
	James Clark, Atish Patra, Mingwei Zhang, Kajol Jain,
	Thomas Richter, Palmer Dabbelt, linux-perf-users, oliver.sang



Hello,

kernel test robot noticed "perf-sanity-tests.Test_data_symbol.fail" on:

commit: af954f76eea56453713ae657f6812d4063f9bc57 ("perf tools: Check fallback error and order")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[test failed on linus/master      7af08b57bcb9ebf78675c50069c54125c0a8b795]
[test failed on linux-next/master f486c8aa16b8172f63bddc70116a0c897a7f3f02]

in testcase: perf-sanity-tests
version: 
with following parameters:

	perf_compiler: gcc



config: x86_64-rhel-8.3-bpf
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202411301431.799e5531-lkp@intel.com



2024-11-28 08:31:19 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/perf test 121
121: Test data symbol                                                : FAILED!



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241130/202411301431.799e5531-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki



* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-11-30  7:03 [linus:master] [perf tools] af954f76ee: perf-sanity-tests.Test_data_symbol.fail kernel test robot
@ 2024-12-02 20:32 ` Namhyung Kim
  2024-12-04 14:04   ` Oliver Sang
  0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2024-12-02 20:32 UTC (permalink / raw)
  To: kernel test robot
  Cc: oe-lkp, lkp, linux-kernel, James Clark, Ravi Bangoria, Kan Liang,
	James Clark, Atish Patra, Mingwei Zhang, Kajol Jain,
	Thomas Richter, Palmer Dabbelt, linux-perf-users

Hello,

On Sat, Nov 30, 2024 at 03:03:10PM +0800, kernel test robot wrote:
> 
> 
> Hello,
> 
> kernel test robot noticed "perf-sanity-tests.Test_data_symbol.fail" on:
> 
> commit: af954f76eea56453713ae657f6812d4063f9bc57 ("perf tools: Check fallback error and order")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> [test failed on linus/master      7af08b57bcb9ebf78675c50069c54125c0a8b795]
> [test failed on linux-next/master f486c8aa16b8172f63bddc70116a0c897a7f3f02]
> 
> in testcase: perf-sanity-tests
> version: 
> with following parameters:
> 
> 	perf_compiler: gcc
> 
> 
> 
> config: x86_64-rhel-8.3-bpf
> compiler: gcc-12
> test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202411301431.799e5531-lkp@intel.com
> 
> 
> 
> 2024-11-28 08:31:19 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/perf test 121
> 121: Test data symbol                                                : FAILED!

Thanks for the report.  But I have a request.

Can you please run the perf test with the -v option so that we can see the
detailed error messages when it fails?

Thanks,
Namhyung



* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-12-02 20:32 ` Namhyung Kim
@ 2024-12-04 14:04   ` Oliver Sang
  2024-12-04 21:44     ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Oliver Sang @ 2024-12-04 14:04 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: oe-lkp, lkp, linux-kernel, James Clark, Ravi Bangoria, Kan Liang,
	James Clark, Atish Patra, Mingwei Zhang, Kajol Jain,
	Thomas Richter, Palmer Dabbelt, linux-perf-users, oliver.sang

hi, Namhyung Kim,

On Mon, Dec 02, 2024 at 12:32:16PM -0800, Namhyung Kim wrote:
> Hello,
> 
> On Sat, Nov 30, 2024 at 03:03:10PM +0800, kernel test robot wrote:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed "perf-sanity-tests.Test_data_symbol.fail" on:
> > 
> > commit: af954f76eea56453713ae657f6812d4063f9bc57 ("perf tools: Check fallback error and order")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > [test failed on linus/master      7af08b57bcb9ebf78675c50069c54125c0a8b795]
> > [test failed on linux-next/master f486c8aa16b8172f63bddc70116a0c897a7f3f02]
> > 
> > in testcase: perf-sanity-tests
> > version: 
> > with following parameters:
> > 
> > 	perf_compiler: gcc
> > 
> > 
> > 
> > config: x86_64-rhel-8.3-bpf
> > compiler: gcc-12
> > test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory
> > 
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202411301431.799e5531-lkp@intel.com
> > 
> > 
> > 
> > 2024-11-28 08:31:19 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/perf test 121
> > 121: Test data symbol                                                : FAILED!
> 
> Thanks for the report.  But I have a request.
> 
> Can you please run the perf test with the -v option so that we can see the
> detailed error messages when it fails?

below is the log with '-v'

2024-12-03 11:20:32 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/perf test 121 -v
121: Test data symbol:
--- start ---
test child forked, pid 143127
 294e400-294e439 l buf1
perf does have symbol 'buf1'
Recording workload...
Waiting for "perf record has started" message
/usr/src/perf_selftests-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/tests/shell/test_data_symbol.sh: line 74: kill: (143139) - No such process
Cleaning up files...
---- end(-1) ----
121: Test data symbol                                                : FAILED!




* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-12-04 14:04   ` Oliver Sang
@ 2024-12-04 21:44     ` Namhyung Kim
  2024-12-04 22:21       ` Namhyung Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2024-12-04 21:44 UTC (permalink / raw)
  To: Oliver Sang
  Cc: oe-lkp, lkp, linux-kernel, James Clark, Ravi Bangoria, Kan Liang,
	James Clark, Atish Patra, Mingwei Zhang, Kajol Jain,
	Thomas Richter, Palmer Dabbelt, linux-perf-users

On Wed, Dec 04, 2024 at 10:04:46PM +0800, Oliver Sang wrote:
> hi, Namhyung Kim,
> 
> On Mon, Dec 02, 2024 at 12:32:16PM -0800, Namhyung Kim wrote:
> > Hello,
> > 
> > On Sat, Nov 30, 2024 at 03:03:10PM +0800, kernel test robot wrote:
> > > 
> > > 
> > > Hello,
> > > 
> > > kernel test robot noticed "perf-sanity-tests.Test_data_symbol.fail" on:
> > > 
> > > commit: af954f76eea56453713ae657f6812d4063f9bc57 ("perf tools: Check fallback error and order")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > [test failed on linus/master      7af08b57bcb9ebf78675c50069c54125c0a8b795]
> > > [test failed on linux-next/master f486c8aa16b8172f63bddc70116a0c897a7f3f02]
> > > 
> > > in testcase: perf-sanity-tests
> > > version: 
> > > with following parameters:
> > > 
> > > 	perf_compiler: gcc
> > > 
> > > 
> > > 
> > > config: x86_64-rhel-8.3-bpf
> > > compiler: gcc-12
> > > test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory
> > > 
> > > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > > 
> > > 
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > > | Closes: https://lore.kernel.org/oe-lkp/202411301431.799e5531-lkp@intel.com
> > > 
> > > 
> > > 
> > > 2024-11-28 08:31:19 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/perf test 121
> > > 121: Test data symbol                                                : FAILED!
> > 
> > Thanks for the report.  But I have a request.
> > 
> > Can you please run the perf test with the -v option so that we can see the
> > detailed error messages when it fails?
> 
> below is the log with '-v'
> 
> 2024-12-03 11:20:32 sudo /usr/src/linux-perf-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/perf test 121 -v
> 121: Test data symbol:
> --- start ---
> test child forked, pid 143127
>  294e400-294e439 l buf1
> perf does have symbol 'buf1'
> Recording workload...
> Waiting for "perf record has started" message
> /usr/src/perf_selftests-x86_64-rhel-8.3-bpf-af954f76eea56453713ae657f6812d4063f9bc57/tools/perf/tests/shell/test_data_symbol.sh: line 74: kill: (143139) - No such process
> Cleaning up files...
> ---- end(-1) ----
> 121: Test data symbol                                                : FAILED!

Thanks for the log.  I think it failed to run perf mem record at all.

I've set up a Sapphire Rapids machine and ran perf mem record there.  It said:

  # perf mem record -avv -C0 true
  DEBUGINFOD_URLS=
  nr_cblocks: 0
  affinity: SYS
  mmap flush: 1
  comp level: 0
  ------------------------------------------------------------
  perf_event_attr:
    type                             4 (cpu)
    size                             136
    config                           0x8203 (mem-loads-aux)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT
    read_format                      ID|LOST
    disabled                         1
    freq                             1
    precise_ip                       3
    sample_id_all                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
  ------------------------------------------------------------
  perf_event_attr:
    type                             4 (cpu)
    size                             136
    config                           0x1cd (mem-loads)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT
    read_format                      ID|LOST
    freq                             1
    precise_ip                       3
    sample_id_all                    1
    { bp_addr, config1 }             0x1f
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd 5  flags 0x8
  sys_perf_event_open failed, error -22
  Using PERF_SAMPLE_READ / :S modifier is not compatible with inherit, falling back to no-inherit.
  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/mem-loads,ldlat=30/).
  "dmesg | grep -i perf" may provide additional information.

There's an issue with fallback on the inherit bit with the sample read.
I'll take a look.

Thanks,
Namhyung



* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-12-04 21:44     ` Namhyung Kim
@ 2024-12-04 22:21       ` Namhyung Kim
  2024-12-05 15:30         ` Arnaldo Carvalho de Melo
  2024-12-06  2:10         ` Oliver Sang
  0 siblings, 2 replies; 9+ messages in thread
From: Namhyung Kim @ 2024-12-04 22:21 UTC (permalink / raw)
  To: Oliver Sang
  Cc: oe-lkp, lkp, linux-kernel, Arnaldo Carvalho de Melo, James Clark,
	Ravi Bangoria, Kan Liang, James Clark, Atish Patra, Mingwei Zhang,
	Kajol Jain, Thomas Richter, Palmer Dabbelt, linux-perf-users

On Wed, Dec 04, 2024 at 01:44:06PM -0800, Namhyung Kim wrote:
[SNIP]
>   perf_event_attr:
>     type                             4 (cpu)
>     size                             136
>     config                           0x1cd (mem-loads)
>     { sample_period, sample_freq }   4000
>     sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT
>     read_format                      ID|LOST
>     freq                             1
>     precise_ip                       3
>     sample_id_all                    1
>     { bp_addr, config1 }             0x1f
>   ------------------------------------------------------------
>   sys_perf_event_open: pid -1  cpu 0  group_fd 5  flags 0x8
>   sys_perf_event_open failed, error -22
>   Using PERF_SAMPLE_READ / :S modifier is not compatible with inherit, falling back to no-inherit.
>   Error:
>   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/mem-loads,ldlat=30/).
>   "dmesg | grep -i perf" may provide additional information.
> 
> There's an issue with fallback on the inherit bit with the sample read.
> I'll take a look.

Hmm, no.  It has neither SAMPLE_READ nor inherit, so the error
message was misleading.  Maybe it should be printed only when it actually
clears the bits.

Anyway, I've tested with the old code and realized that it might be due
to precise_ip being 3.  I expected it'd return EOPNOTSUPP for that case
but it seems to return EINVAL sometimes.  So the precise_ip fallback should
be checked after the missing features, like below.  Can you please test?
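
To make the ordering argument concrete, here is a minimal toy model in C.
It is not the perf code: toy_evsel, toy_open() and the other toy_* helpers
are hypothetical names made up for this sketch, and the assumption that the
PMU rejects precise_ip above some level with -EINVAL comes only from the
observation above.  It compares the old check order with the proposed one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an evsel: only the fields this sketch needs. */
struct toy_evsel {
	int  precise_ip;   /* requested precision level, 0..3 */
	bool precise_max;  /* true when ":P" asked for the best available level */
};

/* Assumed PMU behaviour: levels above max_ok are rejected with -EINVAL. */
int toy_open(const struct toy_evsel *ev, int max_ok)
{
	return ev->precise_ip > max_ok ? -EINVAL : 0;
}

/* Toy fallback: step precise_ip down by one if that is allowed. */
bool toy_precise_ip_fallback(struct toy_evsel *ev)
{
	if (!ev->precise_max || ev->precise_ip == 0)
		return false;
	ev->precise_ip--;
	return true;
}

/* Nothing is "missing" in this model, so detection never asks for a retry. */
bool toy_detect_missing_features(const struct toy_evsel *ev)
{
	(void)ev;
	return false;
}

/* Old order: the precise_ip fallback is tried only for -EOPNOTSUPP. */
int open_old_order(struct toy_evsel *ev, int max_ok)
{
	assert(ev != NULL);
	for (;;) {
		int err = toy_open(ev, max_ok);

		if (err == 0)
			return 0;
		if (err == -EOPNOTSUPP && toy_precise_ip_fallback(ev))
			continue;
		if (err == -EINVAL && toy_detect_missing_features(ev))
			continue;
		return err;
	}
}

/* New order: missing features first, then the fallback for any error. */
int open_new_order(struct toy_evsel *ev, int max_ok)
{
	assert(ev != NULL);
	for (;;) {
		int err = toy_open(ev, max_ok);

		if (err == 0)
			return 0;
		if (err == -EINVAL && toy_detect_missing_features(ev))
			continue;
		if (toy_precise_ip_fallback(ev))
			continue;
		return err;
	}
}
```

With precise_ip = 3, precise_max = true and max_ok = 2, open_old_order()
gives up with -EINVAL and leaves precise_ip at 3, while open_new_order()
retries once and succeeds with precise_ip = 2.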

Thanks,
Namhyung


---8<---
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index f745723d486ba962..d22c5df1701eccc5 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -2571,12 +2571,12 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (err == -EMFILE && rlimit__increase_nofile(&set_rlimit))
 		goto retry_open;
 
-	if (err == -EOPNOTSUPP && evsel__precise_ip_fallback(evsel))
-		goto retry_open;
-
 	if (err == -EINVAL && evsel__detect_missing_features(evsel))
 		goto fallback_missing_features;
 
+	if (evsel__precise_ip_fallback(evsel))
+		goto retry_open;
+
 	if (evsel__handle_error_quirks(evsel, err))
 		goto retry_open;
 


* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-12-04 22:21       ` Namhyung Kim
@ 2024-12-05 15:30         ` Arnaldo Carvalho de Melo
  2024-12-11 17:27           ` Namhyung Kim
  2024-12-06  2:10         ` Oliver Sang
  1 sibling, 1 reply; 9+ messages in thread
From: Arnaldo Carvalho de Melo @ 2024-12-05 15:30 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, James Clark,
	Ravi Bangoria, Kan Liang, James Clark, Atish Patra, Mingwei Zhang,
	Kajol Jain, Thomas Richter, Palmer Dabbelt, linux-perf-users

On Wed, Dec 04, 2024 at 02:21:06PM -0800, Namhyung Kim wrote:
> On Wed, Dec 04, 2024 at 01:44:06PM -0800, Namhyung Kim wrote:
> [SNIP]
> >   perf_event_attr:
> >     type                             4 (cpu)
> >     size                             136
> >     config                           0x1cd (mem-loads)
> >     { sample_period, sample_freq }   4000
> >     sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT
> >     read_format                      ID|LOST
> >     freq                             1
> >     precise_ip                       3
> >     sample_id_all                    1
> >     { bp_addr, config1 }             0x1f
> >   ------------------------------------------------------------
> >   sys_perf_event_open: pid -1  cpu 0  group_fd 5  flags 0x8
> >   sys_perf_event_open failed, error -22
> >   Using PERF_SAMPLE_READ / :S modifier is not compatible with inherit, falling back to no-inherit.
> >   Error:
> >   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/mem-loads,ldlat=30/).
> >   "dmesg | grep -i perf" may provide additional information.
> > 
> > There's an issue with fallback on the inherit bit with the sample read.
> > I'll take a look.
> 
> Hmm, no.  It has neither SAMPLE_READ nor inherit, so the error
> message was misleading.  Maybe it should be printed only when it actually
> clears the bits.
> 
> Anyway, I've tested with the old code and realized that it might be due
> to precise_ip being 3.  I expected it'd return EOPNOTSUPP for that case
> but it seems to return EINVAL sometimes.  So the precise_ip fallback should
> be checked after the missing features, like below.  Can you please test?

Before:

root@number:/tmp# perf mem record -a sleep 1s
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/mem-loads,ldlat=30/).
"dmesg | grep -i perf" may provide additional information.

root@number:/tmp# 

With your patch:

root@number:/tmp# perf mem record -a sleep 1s
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 11.211 MB perf.data (14616 samples) ]
root@number:/tmp# perf evlist
cpu_atom/mem-loads,ldlat=30/P
cpu_atom/mem-stores/P
cpu_core/mem-loads-aux/
cpu_core/mem-loads,ldlat=30/
cpu_core/mem-stores/P
dummy:u
# Tip: use 'perf evlist -g' to show group information
root@number:/tmp# perf evlist -v
cpu_atom/mem-loads,ldlat=30/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x1f
cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
cpu_core/mem-loads-aux/: type: 4 (cpu_core), size: 136, config: 0x8203 (mem-loads-aux), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
cpu_core/mem-loads,ldlat=30/: type: 4 (cpu_core), size: 136, config: 0x1cd (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, freq: 1, precise_ip: 2, sample_id_all: 1, { bp_addr, config1 }: 0x1f
cpu_core/mem-stores/P: type: 4 (cpu_core), size: 136, config: 0x2cd (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
# Tip: use 'perf evlist -g' to show group information
root@number:/tmp#

But there is something strange: 'cpu_core/mem-loads-aux/' doesn't have
/P, i.e. it shouldn't try to set precise_ip to 3, but according to 'perf
evlist -v' it is set to 3.

I thought maybe it could be related to groups, but:

root@number:/tmp# perf evlist -g
cpu_atom/mem-loads,ldlat=30/P
cpu_atom/mem-stores/P
{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
cpu_core/mem-stores/P
dummy:u
root@number:/tmp# 

But then, in tools/perf/arch/x86/util/mem-events.c

struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
        E("ldlat-loads",        "%s/mem-loads,ldlat=%u/P",      "mem-loads",    true,   0),
        E("ldlat-stores",       "%s/mem-stores/P",              "mem-stores",   false,  0),
        E(NULL,                 NULL,                           NULL,           false,  0),
};

struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MAX] = {
        E("ldlat-loads",        "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P", "mem-loads",    true,   MEM_LOADS_AUX),
        E("ldlat-stores",       "%s/mem-stores/P",              "mem-stores",   false,  0),
        E(NULL,                 NULL,                           NULL,           false,  0),
};

The group as a whole has the :P modifier, so maybe that is what goes through
the fallback?
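
If a group-level :P is applied to every member of the group, that would
explain why mem-loads-aux also shows precise_ip: 3 even though its name has
no /P.  A rough sketch of that assumption in C follows; toy_event and
toy_apply_group_P() are hypothetical names used only for illustration, not
the parse-events code, and setting precise_ip straight to 3 is a
simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal event record for this sketch. */
struct toy_event {
	const char *name;
	bool precise_max;  /* set by a ":P" style modifier */
	int  precise_ip;   /* level that would be put in the attr */
};

/* Highest level, matching the value seen in 'perf evlist -v' above. */
#define TOY_PRECISE_IP_MAX 3

/* Assumed behaviour: a modifier on the group is copied to each member. */
void toy_apply_group_P(struct toy_event *members, size_t n)
{
	assert(members != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		members[i].precise_max = true;
		members[i].precise_ip = TOY_PRECISE_IP_MAX;
	}
}
```

Applied to a two-member group holding mem-loads-aux and mem-loads, both
members end up with precise_max set and precise_ip at 3, regardless of how
each individual event was spelled.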

I tried:

Breakpoint 1, evsel__precise_ip_fallback (evsel=0xf4a260) at util/evsel.c:1969
1969	{
(gdb) bt
#0  evsel__precise_ip_fallback (evsel=0xf4a260) at util/evsel.c:1969
#1  0x00000000005dfb09 in evsel__open_cpu (evsel=0xf4a260, cpus=0xf53840, threads=0xf5bfa0, start_cpu_map_idx=0, end_cpu_map_idx=16) at util/evsel.c:2577
#2  0x00000000005dfc54 in evsel__open (evsel=0xf4a260, cpus=0xf53840, threads=0xf5bfa0) at util/evsel.c:2603
#3  0x000000000042cbea in record__open (rec=0xec2ce0 <record>) at builtin-record.c:1370
#4  0x00000000004304a1 in __cmd_record (rec=0xec2ce0 <record>, argc=2, argv=0xf58180) at builtin-record.c:2489
#5  0x0000000000434840 in cmd_record (argc=2, argv=0xf58180) at builtin-record.c:4260
#6  0x0000000000469e93 in __cmd_record (argc=3, argv=0x7fffffffde20, mem=0x7fffffffd260, options=0x7fffffffd080) at builtin-mem.c:170
#7  0x000000000046b2f3 in cmd_mem (argc=4, argv=0x7fffffffde20) at builtin-mem.c:538
#8  0x00000000004c0414 in run_builtin (p=0xec6098 <commands+696>, argc=5, argv=0x7fffffffde20) at perf.c:351
#9  0x00000000004c06bb in handle_internal_command (argc=5, argv=0x7fffffffde20) at perf.c:404
#10 0x00000000004c0814 in run_argv (argcp=0x7fffffffdc0c, argv=0x7fffffffdc00) at perf.c:448
#11 0x00000000004c0b5d in main (argc=5, argv=0x7fffffffde20) at perf.c:560
(gdb) print evsel__name(evsel)
$1 = 0xf529b0 "cpu_core/mem-loads,ldlat=30/"
(gdb) p evsel->core.attr.precise_ip 
$2 = 3
(gdb) p evsel->precise_max
$3 = true
(gdb)

And it fell back to precise_ip=2; the previous attempt at opening with 3
resulted in EINVAL.

It should have that precise level reflected in the evsel name :-\

Ran out of time, hope the above helps.

Apart from that, purely as a regression fix, your patch restores the
previous behaviour in this isolated test I made, so:

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

- Arnaldo



* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-12-04 22:21       ` Namhyung Kim
  2024-12-05 15:30         ` Arnaldo Carvalho de Melo
@ 2024-12-06  2:10         ` Oliver Sang
  1 sibling, 0 replies; 9+ messages in thread
From: Oliver Sang @ 2024-12-06  2:10 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: oe-lkp, lkp, linux-kernel, Arnaldo Carvalho de Melo, James Clark,
	Ravi Bangoria, Kan Liang, James Clark, Atish Patra, Mingwei Zhang,
	Kajol Jain, Thomas Richter, Palmer Dabbelt, linux-perf-users,
	oliver.sang

hi, Namhyung Kim,

On Wed, Dec 04, 2024 at 02:21:06PM -0800, Namhyung Kim wrote:
> On Wed, Dec 04, 2024 at 01:44:06PM -0800, Namhyung Kim wrote:
> [SNIP]
> >   perf_event_attr:
> >     type                             4 (cpu)
> >     size                             136
> >     config                           0x1cd (mem-loads)
> >     { sample_period, sample_freq }   4000
> >     sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT
> >     read_format                      ID|LOST
> >     freq                             1
> >     precise_ip                       3
> >     sample_id_all                    1
> >     { bp_addr, config1 }             0x1f
> >   ------------------------------------------------------------
> >   sys_perf_event_open: pid -1  cpu 0  group_fd 5  flags 0x8
> >   sys_perf_event_open failed, error -22
> >   Using PERF_SAMPLE_READ / :S modifier is not compatible with inherit, falling back to no-inherit.
> >   Error:
> >   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/mem-loads,ldlat=30/).
> >   "dmesg | grep -i perf" may provide additional information.
> > 
> > There's an issue with fallback on the inherit bit with the sample read.
> > I'll take a look.
> 
> Hmm, no.  It has neither SAMPLE_READ nor inherit, so the error
> message was misleading.  Maybe it should be printed only when it actually
> clears the bits.
> 
> Anyway, I've tested with the old code and realized that it might be due
> to precise_ip being 3.  I expected it'd return EOPNOTSUPP for that case
> but it seems to return EINVAL sometimes.  So the precise_ip fallback should
> be checked after the missing features, like below.  Can you please test?

Sorry, we have been refining our config these days and broke the test for
this case for now.  We will fix it, but it will delay testing of your patch.

Fortunately, we saw that Arnaldo has tested the patch.  Hope our delay won't
cause too much inconvenience.



* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-12-05 15:30         ` Arnaldo Carvalho de Melo
@ 2024-12-11 17:27           ` Namhyung Kim
  2024-12-12  2:00             ` Oliver Sang
  0 siblings, 1 reply; 9+ messages in thread
From: Namhyung Kim @ 2024-12-11 17:27 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Oliver Sang, oe-lkp, lkp, linux-kernel, James Clark,
	Ravi Bangoria, Kan Liang, James Clark, Atish Patra, Mingwei Zhang,
	Kajol Jain, Thomas Richter, Palmer Dabbelt, linux-perf-users

On Thu, Dec 05, 2024 at 12:30:18PM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Dec 04, 2024 at 02:21:06PM -0800, Namhyung Kim wrote:
> > On Wed, Dec 04, 2024 at 01:44:06PM -0800, Namhyung Kim wrote:
> > [SNIP]
> > >   perf_event_attr:
> > >     type                             4 (cpu)
> > >     size                             136
> > >     config                           0x1cd (mem-loads)
> > >     { sample_period, sample_freq }   4000
> > >     sample_type                      IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT
> > >     read_format                      ID|LOST
> > >     freq                             1
> > >     precise_ip                       3
> > >     sample_id_all                    1
> > >     { bp_addr, config1 }             0x1f
> > >   ------------------------------------------------------------
> > >   sys_perf_event_open: pid -1  cpu 0  group_fd 5  flags 0x8
> > >   sys_perf_event_open failed, error -22
> > >   Using PERF_SAMPLE_READ / :S modifier is not compatible with inherit, falling back to no-inherit.
> > >   Error:
> > >   The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu/mem-loads,ldlat=30/).
> > >   "dmesg | grep -i perf" may provide additional information.
> > > 
> > > There's an issue with fallback on the inherit bit with the sample read.
> > > I'll take a look.
> > 
> > Hmm, no.  It has neither SAMPLE_READ nor inherit, so the error
> > message was misleading.  Maybe it should be printed only when it actually
> > clears the bits.
> > 
> > Anyway, I've tested with the old code and realized that it might be due
> > to precise_ip being 3.  I expected it'd return EOPNOTSUPP for that case
> > but it seems to return EINVAL sometimes.  So the precise_ip fallback should
> > be checked after the missing features, like below.  Can you please test?
> 
> Before:
> 
> root@number:/tmp# perf mem record -a sleep 1s
> Error:
> The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/mem-loads,ldlat=30/).
> "dmesg | grep -i perf" may provide additional information.
> 
> root@number:/tmp# 
> 
> With your patch:
> 
> root@number:/tmp# perf mem record -a sleep 1s
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 11.211 MB perf.data (14616 samples) ]
> root@number:/tmp# perf evlist
> cpu_atom/mem-loads,ldlat=30/P
> cpu_atom/mem-stores/P
> cpu_core/mem-loads-aux/
> cpu_core/mem-loads,ldlat=30/
> cpu_core/mem-stores/P
> dummy:u
> # Tip: use 'perf evlist -g' to show group information
> root@number:/tmp# perf evlist -v
> cpu_atom/mem-loads,ldlat=30/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x1f
> cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
> cpu_core/mem-loads-aux/: type: 4 (cpu_core), size: 136, config: 0x8203 (mem-loads-aux), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
> cpu_core/mem-loads,ldlat=30/: type: 4 (cpu_core), size: 136, config: 0x1cd (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, freq: 1, precise_ip: 2, sample_id_all: 1, { bp_addr, config1 }: 0x1f
> cpu_core/mem-stores/P: type: 4 (cpu_core), size: 136, config: 0x2cd (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 3, sample_id_all: 1
> dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
> # Tip: use 'perf evlist -g' to show group information
> root@number:/tmp#
> 
> But there is something strange, 'cpu_core/mem-loads-aux/' doesn't have
> /P, i.e. shouldn't try to set precise_ip to 3, but according to 'perf
> evlist -v' it is setting it to 3.
> 
> I thought maybe it could be related to groups, but:
> 
> root@number:/tmp# perf evlist -g
> cpu_atom/mem-loads,ldlat=30/P
> cpu_atom/mem-stores/P
> {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}
> cpu_core/mem-stores/P
> dummy:u
> root@number:/tmp# 
> 
> But then, in tools/perf/arch/x86/util/mem-events.c
> 
> struct perf_mem_event perf_mem_events_intel[PERF_MEM_EVENTS__MAX] = {
>         E("ldlat-loads",        "%s/mem-loads,ldlat=%u/P",      "mem-loads",    true,   0),
>         E("ldlat-stores",       "%s/mem-stores/P",              "mem-stores",   false,  0),
>         E(NULL,                 NULL,                           NULL,           false,  0),
> };
> 
> struct perf_mem_event perf_mem_events_intel_aux[PERF_MEM_EVENTS__MAX] = {
>         E("ldlat-loads",        "{%s/mem-loads-aux/,%s/mem-loads,ldlat=%u/}:P", "mem-loads",    true,   MEM_LOADS_AUX),
>         E("ldlat-stores",       "%s/mem-stores/P",              "mem-stores",   false,  0),
>         E(NULL,                 NULL,                           NULL,           false,  0),
> };
> 
> It has the :P for that group, maybe that is going to fall back?
> 
> I tried:
> 
> Breakpoint 1, evsel__precise_ip_fallback (evsel=0xf4a260) at util/evsel.c:1969
> 1969	{
> (gdb) bt
> #0  evsel__precise_ip_fallback (evsel=0xf4a260) at util/evsel.c:1969
> #1  0x00000000005dfb09 in evsel__open_cpu (evsel=0xf4a260, cpus=0xf53840, threads=0xf5bfa0, start_cpu_map_idx=0, end_cpu_map_idx=16) at util/evsel.c:2577
> #2  0x00000000005dfc54 in evsel__open (evsel=0xf4a260, cpus=0xf53840, threads=0xf5bfa0) at util/evsel.c:2603
> #3  0x000000000042cbea in record__open (rec=0xec2ce0 <record>) at builtin-record.c:1370
> #4  0x00000000004304a1 in __cmd_record (rec=0xec2ce0 <record>, argc=2, argv=0xf58180) at builtin-record.c:2489
> #5  0x0000000000434840 in cmd_record (argc=2, argv=0xf58180) at builtin-record.c:4260
> #6  0x0000000000469e93 in __cmd_record (argc=3, argv=0x7fffffffde20, mem=0x7fffffffd260, options=0x7fffffffd080) at builtin-mem.c:170
> #7  0x000000000046b2f3 in cmd_mem (argc=4, argv=0x7fffffffde20) at builtin-mem.c:538
> #8  0x00000000004c0414 in run_builtin (p=0xec6098 <commands+696>, argc=5, argv=0x7fffffffde20) at perf.c:351
> #9  0x00000000004c06bb in handle_internal_command (argc=5, argv=0x7fffffffde20) at perf.c:404
> #10 0x00000000004c0814 in run_argv (argcp=0x7fffffffdc0c, argv=0x7fffffffdc00) at perf.c:448
> #11 0x00000000004c0b5d in main (argc=5, argv=0x7fffffffde20) at perf.c:560
> (gdb) print evsel__name(evsel)
> $1 = 0xf529b0 "cpu_core/mem-loads,ldlat=30/"
> (gdb) p evsel->core.attr.precise_ip 
> $2 = 3
> (gdb) p evsel->precise_max
> $3 = true
> (gdb)
> 
> And it fell back to precise_ip=2; the previous attempt at opening with 3
> resulted in EINVAL.
> 
> It should have that precise level reflected in the evsel name :-\
> 
> Ran out of time, hope the above helps.
> 
> Apart from that, purely as a regression fix, your patch restores the
> previous behaviour in the isolated test I ran, so:
> 
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Applied to perf-tools, thanks!

Best regards,
Namhyung
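
The precise_ip fallback seen in the gdb session quoted above (opening with
precise_ip=3 fails with EINVAL, then the event opens with precise_ip=2) can
be sketched roughly as below. This is only an illustration, not the code in
util/evsel.c: mock_event_open() and open_with_precise_fallback() are
hypothetical names, and the "hardware maximum" parameter is an assumption
standing in for whatever the kernel accepts on a given PMU.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for sys_perf_event_open(): the simulated PMU
 * accepts a request only if precise_ip does not exceed hw_max_precise,
 * and rejects anything higher with -EINVAL.
 */
static int mock_event_open(int precise_ip, int hw_max_precise)
{
	return precise_ip <= hw_max_precise ? 0 : -EINVAL;
}

/*
 * Try the requested precise level first and step down by one on each
 * -EINVAL. Returns the level that opened, or a negative errno if no
 * level worked or a different error was hit.
 */
static int open_with_precise_fallback(int requested, int hw_max_precise)
{
	/* precise_ip is a 2-bit field in perf_event_attr, so 0..3 */
	assert(requested >= 0 && requested <= 3);

	for (int level = requested; level >= 0; level--) {
		int err = mock_event_open(level, hw_max_precise);

		if (err == 0)
			return level;	/* opened at this precise level */
		if (err != -EINVAL)
			return err;	/* unrelated failure: do not retry */
	}

	return -EINVAL;			/* even precise_ip=0 was rejected */
}
```

Calling open_with_precise_fallback(3, 2) models the case above: the first
attempt at level 3 is rejected and the call returns 2. The sketch retries
only on -EINVAL, which is the error the transcript shows; the real tool may
handle other error codes as well.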



* Re: [linus:master] [perf tools]  af954f76ee: perf-sanity-tests.Test_data_symbol.fail
  2024-12-11 17:27           ` Namhyung Kim
@ 2024-12-12  2:00             ` Oliver Sang
  0 siblings, 0 replies; 9+ messages in thread
From: Oliver Sang @ 2024-12-12  2:00 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, oe-lkp, lkp, linux-kernel, James Clark,
	Ravi Bangoria, Kan Liang, James Clark, Atish Patra, Mingwei Zhang,
	Kajol Jain, Thomas Richter, Palmer Dabbelt, linux-perf-users,
	oliver.sang

Hi Namhyung,

On Wed, Dec 11, 2024 at 09:27:12AM -0800, Namhyung Kim wrote:

[...]

> > It should have that precise level reflected in the evsel name :-\
> > 
> > Ran out of time, hope the above helps.
> > 
> > Apart from that, purely as a regression fix, your patch restores the
> > previous behaviour in the isolated test I ran, so:
> > 
> > Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> Applied to perf-tools, thanks!

Sorry for the late reply; just FYI.

We finally finished testing with your patch and confirmed that
perf-sanity-tests.Test_data_symbol now passes.

Tested-by: kernel test robot <oliver.sang@intel.com>

> 
> Best regards,
> Namhyung
> 
> 


Thread overview: 9+ messages
2024-11-30  7:03 [linus:master] [perf tools] af954f76ee: perf-sanity-tests.Test_data_symbol.fail kernel test robot
2024-12-02 20:32 ` Namhyung Kim
2024-12-04 14:04   ` Oliver Sang
2024-12-04 21:44     ` Namhyung Kim
2024-12-04 22:21       ` Namhyung Kim
2024-12-05 15:30         ` Arnaldo Carvalho de Melo
2024-12-11 17:27           ` Namhyung Kim
2024-12-12  2:00             ` Oliver Sang
2024-12-06  2:10         ` Oliver Sang
