From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>,
linux-perf-users@vger.kernel.org, namhyung@kernel.org,
Ingo Molnar <mingo@redhat.com>,
Alexander Shishkin <alexander.shishkin@linux.intel.com>,
Peter Zijlstra <peterz@infradead.org>
Subject: Re: measuring system wide CPU usage ignoring idle process
Date: Thu, 23 Nov 2017 11:42:20 -0300 [thread overview]
Message-ID: <20171123144220.GB8789@kernel.org> (raw)
In-Reply-To: <20171123142100.GA7066@krava>
Em Thu, Nov 23, 2017 at 03:21:00PM +0100, Jiri Olsa escreveu:
> On Thu, Nov 23, 2017 at 03:09:31PM +0100, Jiri Olsa wrote:
> > On Thu, Nov 23, 2017 at 02:40:36PM +0100, Milian Wolff wrote:
> > > On Tuesday, November 21, 2017 12:44:38 AM CET Jiri Olsa wrote:
> > > > On Mon, Nov 20, 2017 at 09:24:42PM +0100, Milian Wolff wrote:
> > > > > On Montag, 20. November 2017 15:29:08 CET Jiri Olsa wrote:
> > > > > > On Mon, Nov 20, 2017 at 03:00:46PM +0100, Milian Wolff wrote:
> > > > > > > Hey all,
> > > > > > >
> > > > > > > colleagues of mine just brought this inconvenient perf stat behavior
> > > > > > > to my
> > > > > > > attention:
> > > > > > >
> > > > > > > $ perf stat -a -e cpu-clock,task-clock,cycles,instructions sleep 1
> > > > > > >
> > > > > > > Performance counter stats for 'system wide':
> > > > > > > 4004.501439 cpu-clock (msec) # 4.000 CPUs
> > > > > > > utilized
> > > > > > > 4004.526474 task-clock (msec) # 4.000 CPUs
> > > > > > > utilized
> > > > > > > 945,906,029 cycles # 0.236 GHz
> > > > > > > 461,861,241 instructions # 0.49 insn per
> > > > > > > cycle
> > > > > > >
> > > > > > > 1.001247082 seconds time elapsed
> > > > > > >
> > > > > > > This shows that cpu-clock and task-clock are incremented also for the
> > > > > > > idle
> > > > > > > processes. Is there some trick to exclude that time, such that the CPU
> > > > > > > utilization drops below 100% when doing `perf stat -a`?
> > > > > >
> > > > > > I dont think it's the idle process you see, I think it's the managing
> > > > > > overhead before the 'sleep 1' task goes actualy to sleep
> > > > > >
> > > > > > there's some user space code before it gets into the sleep syscall,
> > > > > > and there's some possible kernel scheduling/syscall/irq code with
> > > > > > events already enabled and counting
> > > > >
> > > > > Sorry for being unclear: I was talking about the task-clock and cpu-clock
> > > > > values which you omitted from your measurements below. My example also
> > > > > shows that the counts for cycles and instructions are fine. But the
> > > > > cpu-clock and task-clock are useless as they always sum up to essentially
> > > > > `$nproc*$runtime`. What I'm hoping for are fractional values for the "N
> > > > > CPUs utilized".
> > > > ugh my bad.. anyway by using -a you create cpu counters
> > > > which never unschedule, so those times will be same
> > > > as the 'sleep 1' run length
Humm, what role perf_event_attr.exclude_idle has here?
> > > >
> > > > but not sure now how to get the real utilization.. will check
> > >
> > > did you have a chance to check the above? I'd be really interested in knowing
> > > whether there is an existing workaround. If not, would it be feasible to patch
> > > perf to get the desired behavior? I'd be willing to look into this. This would
> > > probably require changes on the kernel side though, or how could this be
> > > fixed?
> >
> > hi,
> > I haven't found any good way yet.. I ended up with following
> > patch to allow attach counters to idle process, which got
> > me the count/behaviour you need (with few tools changes in
> > my perf/idle branch)
> >
> > but I'm not sure it's the best idea ;-) there might
> > be better way.. CC-ing Ingo, Peter and Alexander
>
> also I was thinking we might add 'idle' line into perf top ;-)
> shouldn't be that hard once we have the counter
Humm...
What is wrong with perf_event_attr.exclude_idle? :-)
From include/uapi/linux/perf_event.h:
exclude_idle : 1, /* don't count when idle */
But it is not being set:
[root@jouet ~]# perf stat -vv -a -e cpu-clock,task-clock,cycles,instructions sleep 1
Using CPUID GenuineIntel-6-3D
intel_pt default config: tsc,pt,branch
------------------------------------------------------------
perf_event_attr:
type 1
size 112
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
------------------------------------------------------------
perf_event_attr:
type 1
size 112
config 0x1
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 11
------------------------------------------------------------
perf_event_attr:
size 112
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 15
------------------------------------------------------------
perf_event_attr:
size 112
config 0x1
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 19
cpu-clock: 0: 1001547771 1001547617 1001547617
cpu-clock: 1: 1001552938 1001552742 1001552742
cpu-clock: 2: 1001555120 1001554407 1001554407
cpu-clock: 3: 1001563889 1001563570 1001563570
cpu-clock: 4006219718 4006218336 4006218336
task-clock: 0: 1001603894 1001603894 1001603894
task-clock: 1: 1001616140 1001616140 1001616140
task-clock: 2: 1001617338 1001617338 1001617338
task-clock: 3: 1001621998 1001621998 1001621998
task-clock: 4006459370 4006459370 4006459370
cycles: 0: 71757776 1001642926 1001642926
cycles: 1: 23188411 1001651335 1001651335
cycles: 2: 24665622 1001654878 1001654878
cycles: 3: 79907293 1001659590 1001659590
cycles: 199519102 4006608729 4006608729
instructions: 0: 40314068 1001677791 1001677791
instructions: 1: 13525409 1001682314 1001682314
instructions: 2: 14247277 1001682655 1001682655
instructions: 3: 23286057 1001685112 1001685112
instructions: 91372811 4006727872 4006727872
Performance counter stats for 'system wide':
4006.219718 cpu-clock (msec) # 3.999 CPUs utilized
4006.459370 task-clock (msec) # 3.999 CPUs utilized
199,519,102 cycles # 0.050 GHz
91,372,811 instructions # 0.46 insn per cycle
1.001749823 seconds time elapsed
[root@jouet ~]#
So the I tried the patch at the end of this messagem, but it doesn't
seem to affect software counters such as cpu-clock and task-clock:
[root@jouet ~]# perf stat --no-idle -a -e cpu-clock,task-clock,cycles,instructions sleep 1m
Performance counter stats for 'system wide':
240005.027025 cpu-clock (msec) # 4.000 CPUs utilized
240005.150119 task-clock (msec) # 4.000 CPUs utilized
2,658,680,286 cycles # 0.011 GHz
1,109,111,339 instructions # 0.42 insn per cycle
60.001361214 seconds time elapsed
[root@jouet ~]# perf stat --idle -a -e cpu-clock,task-clock,cycles,instructions sleep 1m
Performance counter stats for 'system wide':
240006.825047 cpu-clock (msec) # 4.000 CPUs utilized
240006.964995 task-clock (msec) # 4.000 CPUs utilized
2,784,702,480 cycles # 0.012 GHz
1,210,285,863 instructions # 0.43 insn per cycle
60.001806963 seconds time elapsed
[root@jouet ~]#
[root@jouet ~]# perf stat -vv --no-idle -a -e cpu-clock,task-clock,cycles,instructions sleep 1 |& grep exclude_idle
exclude_idle 1
exclude_idle 1
exclude_idle 1
exclude_idle 1
[root@jouet ~]# perf stat -vv -a -e cpu-clock,task-clock,cycles,instructions sleep 1 |& grep exclude_idle
[root@jouet ~]# perf stat --idle -vv -a -e cpu-clock,task-clock,cycles,instructions sleep 1 |& grep exclude_idle
[root@jouet ~]#
Time to look at the kernel...
- Arnaldo
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 59af5a8419e2..32860537e114 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -144,6 +144,7 @@ typedef int (*aggr_get_id_t)(struct cpu_map *m, int cpu);
static int run_count = 1;
static bool no_inherit = false;
+static bool idle = true;
static volatile pid_t child_pid = -1;
static bool null_run = false;
static int detailed_run = 0;
@@ -237,6 +238,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
attr->read_format |= PERF_FORMAT_ID|PERF_FORMAT_GROUP;
attr->inherit = !no_inherit;
+ attr->exclude_idle = !idle;
/*
* Some events get initialized with sample_(period/type) set,
@@ -1890,6 +1892,7 @@ static const struct option stat_options[] = {
OPT_CALLBACK('M', "metrics", &evsel_list, "metric/metric group list",
"monitor specified metrics or metric groups (separated by ,)",
parse_metric_groups),
+ OPT_BOOLEAN(0, "idle", &idle, "Measure when idle"),
OPT_END()
};
next prev parent reply other threads:[~2017-11-23 14:42 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-20 14:00 measuring system wide CPU usage ignoring idle process Milian Wolff
2017-11-20 14:29 ` Jiri Olsa
2017-11-20 20:24 ` Milian Wolff
2017-11-20 23:44 ` Jiri Olsa
2017-11-23 13:40 ` Milian Wolff
2017-11-23 14:09 ` Jiri Olsa
2017-11-23 14:21 ` Jiri Olsa
2017-11-23 14:42 ` Arnaldo Carvalho de Melo [this message]
2017-11-23 15:12 ` Jiri Olsa
2017-11-23 18:59 ` Arnaldo Carvalho de Melo
2017-11-24 8:14 ` Jiri Olsa
2017-11-23 15:15 ` Peter Zijlstra
2018-04-17 13:41 ` Arnaldo Carvalho de Melo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171123144220.GB8789@kernel.org \
--to=acme@kernel.org \
--cc=alexander.shishkin@linux.intel.com \
--cc=jolsa@redhat.com \
--cc=linux-perf-users@vger.kernel.org \
--cc=milian.wolff@kdab.com \
--cc=mingo@redhat.com \
--cc=namhyung@kernel.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).