* [BUG] perf_events: ctx_flexible_sched_in()
@ 2010-02-01 12:20 Stephane Eranian
From: Stephane Eranian @ 2010-02-01 12:20 UTC
To: Peter Zijlstra
Cc: eranian, linux-kernel, mingo, paulus, davem, fweisbec,
perfmon2-devel
Hi,
I believe there is something wrong with ctx_flexible_sched_in().
The function does not allow maximizing PMU usage because of
the way can_add_hw is managed. Basically, as soon as one group
fails to be scheduled in, no other group can be. I believe this
is not optimal. You need to skip the group that fails and keep
scanning the list. There may be other groups which can be
scheduled.
Here is an example to illustrate the issue:
$ task -ebaclears,div,instructions_retired,fp_assist noploop 5
noploop for 5 seconds
908 baclears (scaled from 74.97% of time)
0 div (scaled from 50.01% of time)
11328128990 instructions_retired (scaled from 74.99% of time)
0 fp_assist (scaled from 50.00% of time)
Here div and fp_assist can only go on counter 1. There is no explicit
grouping. On Intel Core, you have 2 generic and 3 fixed counters.
instructions_retired can go on a fixed counter. Thus, I was
expecting baclears and instructions_retired to always be scheduled.
The other two would alternate at 50% each. While you do get the latter
behavior, baclears and instructions_retired are not fully utilized:
each is scheduled only about 75% of the time.
Once I modify ctx_flexible_sched_in():
$ ./task -ebaclears,div,instructions_retired,fp_assist noploop 5
noploop for 5 seconds
658 baclears
0 div (scaled from 50.01% of time)
11726844342 instructions_retired
0 fp_assist (scaled from 50.00% of time)
I get the right result. Thus, I think, we need to drop can_add_hw
from ctx_flexible_sched_in().
Am I missing something in the role of can_add_hw?
If not, then I will provide a patch to get the optimal behavior.
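
The percentages above can be reproduced with a small model. What follows is only a sketch under stated assumptions, not the kernel code: the toy PMU has two generic counters and a single fixed counter, every event is its own group, the list is rotated by one position per tick, and an event takes the lowest free counter it is allowed to use. All names (toy_event, toy_sched_in, toy_simulate) are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

#define NR_EVENTS   4
#define NR_COUNTERS 3   /* toy PMU: generic 0, generic 1, one fixed counter */

/* Hypothetical event description: bitmask of counters the event may use. */
struct toy_event {
    const char *name;
    unsigned int allowed;   /* bit n set => event may run on counter n */
};

static const struct toy_event events[NR_EVENTS] = {
    { "baclears",             0x3 },  /* either generic counter */
    { "div",                  0x2 },  /* generic counter 1 only */
    { "instructions_retired", 0x4 },  /* the fixed counter */
    { "fp_assist",            0x2 },  /* generic counter 1 only */
};

/* Place one event on the lowest free counter it is allowed to use. */
static bool toy_sched_in(const struct toy_event *ev, unsigned int *used)
{
    for (int c = 0; c < NR_COUNTERS; c++) {
        unsigned int bit = 1u << c;

        if ((ev->allowed & bit) && !(*used & bit)) {
            *used |= bit;
            return true;
        }
    }
    return false;
}

/*
 * Walk the event list once per tick, starting one position further each
 * time to mimic list rotation. scheduled[i] counts the ticks on which
 * event i got a counter. With stop_on_first_failure set, the first
 * failure clears can_add_hw and ends the scan for that tick; otherwise
 * the failing event is skipped and the scan continues.
 */
static void toy_simulate(bool stop_on_first_failure, int ticks,
                         int scheduled[NR_EVENTS])
{
    memset(scheduled, 0, NR_EVENTS * sizeof(scheduled[0]));

    for (int t = 0; t < ticks; t++) {
        unsigned int used = 0;
        bool can_add_hw = true;

        for (int k = 0; k < NR_EVENTS; k++) {
            int i = (t + k) % NR_EVENTS;

            if (!can_add_hw)
                break;
            if (toy_sched_in(&events[i], &used))
                scheduled[i]++;
            else if (stop_on_first_failure)
                can_add_hw = false;
        }
    }
}
```

Over 4 ticks, toy_simulate(true, 4, s) yields 3, 2, 3, 2 for baclears, div, instructions_retired and fp_assist, i.e. 75%, 50%, 75%, 50%, which is the shape of the first run. With toy_simulate(false, 4, s) the counts become 4, 2, 4, 2, which is the shape of the second run.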
* Re: [BUG] perf_events: ctx_flexible_sched_in()
From: Peter Zijlstra @ 2010-02-01 12:27 UTC
To: Stephane Eranian
Cc: eranian, linux-kernel, mingo, paulus, davem, fweisbec,
perfmon2-devel
On Mon, 2010-02-01 at 13:20 +0100, Stephane Eranian wrote:
> The function does not allow maximizing PMU usage because of
> the way can_add_hw is managed. Basically, as soon as one group
> fails to be scheduled in, no other group can be. I believe this
> is not optimal. You need to skip the group that fails and keep
> scanning the list. There may be other groups which can be
> scheduled.
Yeah, I saw that too. We need a new hw_ callback for that, or more
structured error values out of ->enable that distinguish between
"this event won't fit" and "PMU full".
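
To make the second option concrete, here is a rough sketch of how such error values could drive the scan. It is an illustration only: the enum, toy_pmu, toy_enable and toy_flexible_sched_in are hypothetical names, not existing kernel interfaces, and the counter model (a bitmask of fewer than 32 counters) is an assumption.

```c
#include <stddef.h>

/* Hypothetical result codes; not values the kernel's ->enable returns. */
enum toy_enable_result {
    TOY_ENABLE_OK,
    TOY_ENABLE_NO_FIT,    /* this event cannot be placed right now */
    TOY_ENABLE_PMU_FULL,  /* no counter is free at all */
};

/* Hypothetical PMU state: nr_counters must be below 32 in this sketch. */
struct toy_pmu {
    unsigned int nr_counters;
    unsigned int used;      /* bit n set => counter n is busy */
};

/* Try to place an event whose usable counters are given by 'allowed'. */
static enum toy_enable_result toy_enable(struct toy_pmu *pmu,
                                         unsigned int allowed)
{
    unsigned int all = (1u << pmu->nr_counters) - 1;

    if ((pmu->used & all) == all)
        return TOY_ENABLE_PMU_FULL;

    for (unsigned int c = 0; c < pmu->nr_counters; c++) {
        unsigned int bit = 1u << c;

        if ((allowed & bit) && !(pmu->used & bit)) {
            pmu->used |= bit;
            return TOY_ENABLE_OK;
        }
    }
    return TOY_ENABLE_NO_FIT;
}

/*
 * Scan the list: skip events that do not fit, stop only once the PMU
 * reports that it is full. Returns the number of events scheduled.
 */
static size_t toy_flexible_sched_in(struct toy_pmu *pmu,
                                    const unsigned int *allowed, size_t n)
{
    size_t scheduled = 0;

    for (size_t i = 0; i < n; i++) {
        enum toy_enable_result r = toy_enable(pmu, allowed[i]);

        if (r == TOY_ENABLE_PMU_FULL)
            break;          /* nothing further can be added */
        if (r == TOY_ENABLE_OK)
            scheduled++;
        /* TOY_ENABLE_NO_FIT: skip this event and keep scanning */
    }
    return scheduled;
}
```

The point of the split is that stopping early stays cheap when the PMU really is full, while a single constrained event no longer blocks the rest of the list.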