linux-perf-users.vger.kernel.org archive mirror
From: Michael Petlan <mpetlan@redhat.com>
To: Arnaldo Carvalho de Melo <arnaldo.melo@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>,
	Arnaldo de Melo <acme@redhat.com>,
	 vmolnaro@redhat.com,
	linux-perf-users <linux-perf-users@vger.kernel.org>
Subject: Re: perf test fail :: "perf stat --bpf-counters --for-each-cgroup test"
Date: Fri, 19 Jul 2024 13:05:02 +0200 (CEST)
Message-ID: <alpine.LRH.2.20.2407191303000.11376@Diego>
In-Reply-To: <CA+JHD90TkDVHPw4jqxMX2guqsg-8xrqD2iiEfZ_akixvVYZKZg@mail.gmail.com>

On Fri, 19 Jul 2024, Arnaldo Carvalho de Melo wrote:
> On Fri, Jul 19, 2024, 6:50 AM Michael Petlan <mpetlan@redhat.com> wrote:
>       Hello Namhyung,
> 
>       we were investigating some test failures of the testcase mentioned
>       in $subj. We have narrowed it down to:
> 
>           # perf stat -C 0,1 --for-each-cgroup system.slice,user.slice -e cycles -- taskset -c 1 perf test -w thloop
> 
>           Performance counter stats for 'CPU(s) 0,1':
>                <not counted>      cycles                           system.slice
>                3,020,401,084      cycles                           user.slice                       
> 
>                1.009787097 seconds time elapsed
> 
>       As seen, the system.slice is not counted properly in our case. It
>       happens even without bpf-counters being involved.
> 
>       There were rumours that it might be caused by too low a system
>       load, but it apparently happens even when the load is replaced by
>       the "thloop" workload from perf-test's workload library. Even if
>       the load were insufficient, we'd see a value of 0 rather than
>       "<not counted>". The "<not counted>" result is printed if the
>       counter wasn't properly enabled and running.
> 
>       Have you encountered this problem? What could cause it?
> 
> 
> What does running with -vvv say? Some inconclusive error coming from the kernel?

Nothing obvious:

# perf stat -vvv -C 0,1 --for-each-cgroup system.slice,user.slice -e cpu-clock taskset -c 0 perf test -w thloop
Using CPUID GenuineIntel-6-6A-6
Control descriptor is not initialized
Opening: cpu-clock 
------------------------------------------------------------
perf_event_attr:   
  type                             1 (PERF_TYPE_SOFTWARE)
  size                             136
  config                           0 (PERF_COUNT_SW_CPU_CLOCK)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 3  cpu 0  group_fd -1  flags 0xc = 5
Opening: cpu-clock 
------------------------------------------------------------
perf_event_attr:   
  type                             1 (PERF_TYPE_SOFTWARE)
  size                             136
  config                           0 (PERF_COUNT_SW_CPU_CLOCK)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 4  cpu 0  group_fd -1  flags 0xc = 6
Opening: cpu-clock
------------------------------------------------------------
perf_event_attr:
  type                             1 (PERF_TYPE_SOFTWARE)
  size                             136
  config                           0 (PERF_COUNT_SW_CPU_CLOCK)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 3  cpu 1  group_fd -1  flags 0xc = 7
Opening: cpu-clock
------------------------------------------------------------
perf_event_attr:   
  type                             1 (PERF_TYPE_SOFTWARE)
  size                             136
  config                           0 (PERF_COUNT_SW_CPU_CLOCK)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 4  cpu 1  group_fd -1  flags 0xc = 9
cpu-clock: 0: 0 0 0
cpu-clock: 0: 1004758163 1004761145 1004761145
cpu-clock: 1: 0 0 0
cpu-clock: 1: 60896 62271 62271
cpu-clock: 0 0 0   
cpu-clock: 1004819059 1004823416 1004823416

 Performance counter stats for 'CPU(s) 0,1':

     <not counted> msec cpu-clock                        system.slice
          1,004.82 msec cpu-clock                        user.slice       #    0.999 CPUs utilized

       1.005824026 seconds time elapsed

Some events weren't counted. Try disabling the NMI watchdog:
        echo 0 > /proc/sys/kernel/nmi_watchdog
        perf stat ...
        echo 1 > /proc/sys/kernel/nmi_watchdog

....
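
A side note on the sys_perf_event_open lines above: flags 0xc should decode
to PERF_FLAG_PID_CGROUP | PERF_FLAG_FD_CLOEXEC, and with PERF_FLAG_PID_CGROUP
set the "pid" argument is a file descriptor for the cgroup directory rather
than a process ID. So "pid 3" and "pid 4" are presumably the two cgroup fds.
A minimal Python sketch of that decoding, with the bit values as I understand
them from include/uapi/linux/perf_event.h; the helper name decode_perf_flags
is made up for illustration:

```python
# Flag bits for the perf_event_open() "flags" argument
# (values as defined in include/uapi/linux/perf_event.h).
PERF_FLAG_FD_NO_GROUP = 1 << 0
PERF_FLAG_FD_OUTPUT = 1 << 1
PERF_FLAG_PID_CGROUP = 1 << 2   # "pid" is a cgroup directory fd
PERF_FLAG_FD_CLOEXEC = 1 << 3

FLAG_NAMES = {
    PERF_FLAG_FD_NO_GROUP: "PERF_FLAG_FD_NO_GROUP",
    PERF_FLAG_FD_OUTPUT: "PERF_FLAG_FD_OUTPUT",
    PERF_FLAG_PID_CGROUP: "PERF_FLAG_PID_CGROUP",
    PERF_FLAG_FD_CLOEXEC: "PERF_FLAG_FD_CLOEXEC",
}


def decode_perf_flags(flags):
    """Hypothetical helper: return the names of the bits set in flags."""
    names = [name for bit, name in FLAG_NAMES.items() if flags & bit]
    unknown = flags & ~sum(FLAG_NAMES)  # bits not covered by the table
    if unknown:
        names.append(hex(unknown))
    return names


if __name__ == "__main__":
    # 0xc is the value shown in the verbose output above.
    print(" | ".join(decode_perf_flags(0xC)))
```

If that reading is right, all four events were opened as cgroup events
without an error from the syscall, which fits "nothing obvious" in the log.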

The nmi_watchdog message is irrelevant; system.slice stays <not counted> no matter what is set there.
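
For reading the raw per-CPU lines in the verbose output: each
"cpu-clock: <cpu>: a b c" line looks like the count followed by time enabled
and time running, which matches the read_format shown above. The sketch below
assumes that column order, and assumes an event is reported as <not counted>
when its enabled or running time is zero. The regex and the classify() helper
are illustrative only, not perf code:

```python
import re

# Per-CPU lines copied from the -vvv output above.
LOG = """\
cpu-clock: 0: 0 0 0
cpu-clock: 0: 1004758163 1004761145 1004761145
cpu-clock: 1: 0 0 0
cpu-clock: 1: 60896 62271 62271
"""

# Assumed layout: <event>: <cpu>: <value> <time_enabled> <time_running>
LINE_RE = re.compile(
    r"^(?P<event>[\w-]+): (?P<cpu>\d+): "
    r"(?P<val>\d+) (?P<ena>\d+) (?P<run>\d+)\s*$"
)


def classify(log):
    """Hypothetical helper: tag each per-CPU line as counted or not."""
    rows = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip anything that is not a per-CPU count line
        val, ena, run = int(m["val"]), int(m["ena"]), int(m["run"])
        # Assumption: zero enabled/running time means the event never ran.
        status = "not counted" if ena == 0 or run == 0 else "counted"
        rows.append((m["event"], int(m["cpu"]), val, status))
    return rows


if __name__ == "__main__":
    for event, cpu, val, status in classify(LOG):
        print(f"{event} cpu{cpu}: {val} ({status})")
```

Under those assumptions, the all-zero rows are events that were opened but
never enabled or scheduled on either CPU, as opposed to events that ran and
counted nothing.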

> Maybe retsnoop can narrow it down? 

Will try. Thanks.
> 
> https://github.com/anakryiko/retsnoop
> 
> - Arnaldo 

Michael
> 
> 
> 
>       Thanks.
>       Michael
> 
> 
> 
