Using perf with cgroups and containers

linux-perf-users.vger.kernel.org archive mirror

* Using perf with cgroups and containers
@ 2014-11-26 16:59 William Cohen
  2014-11-26 20:52 ` Andi Kleen
From: William Cohen @ 2014-11-26 16:59 UTC
  To: linux-perf-users

Hi,

I have been looking at how perf supports cgroups and containers.  The
"-G" option allows limiting the data collected to a particular cgroup.
Thus, one can use the option to collect some information about a
particular cgroup with something like:

$ sudo perf stat -a -e cycles  -G machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -e instructions -G machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -- sleep 1

 Performance counter stats for 'system wide':

         9,668,237      cycles                    machine.slice/machine-qemu\x2drhel7\x2dx86_64.scope [82.28%]
         4,685,886      instructions              machine.slice/machine-qemu\x2drhel7\x2dx86_64.scope #    0.48  insns per cycle         [82.28%]

       1.001359839 seconds time elapsed

However, this approach is awkward.  It requires specifying the cgroup
for each event.  It also requires both the system-wide option ("-a"),
to get information for all the tasks in the cgroup, and superuser
privileges.  Thus, even if all the tasks are owned by the user running
the perf command, the command still needs superuser privileges.

Another limitation is that from within a container there doesn't seem
to be a way of doing the equivalent of a "perf record -a ..." to
collect data related to that container.  Running perf within a
container gives something like the following:

# perf stat -a ls
Error:
You may not have permission to collect system-wide stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid:
 -1 - Not paranoid at all
  0 - Disallow raw tracepoint access for unpriv
  1 - Disallow cpu events for unpriv
  2 - Disallow kernel profiling for unpriv
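
To see which of these levels is in effect, the sysctl file named in the
error can be read directly.  The following is only a minimal sketch:
the helper name describe_paranoid is made up for illustration, and the
descriptions are copied from the message above, so other perf or kernel
versions may word or define the levels differently.

```shell
#!/bin/sh
# Map a perf_event_paranoid level to the description printed in the
# error message above. The wording is copied from that message; it is
# not taken from any other perf or kernel version.
describe_paranoid() {
    case "$1" in
        -1) echo "Not paranoid at all" ;;
        0)  echo "Disallow raw tracepoint access for unpriv" ;;
        1)  echo "Disallow cpu events for unpriv" ;;
        2)  echo "Disallow kernel profiling for unpriv" ;;
        *)  echo "level not listed in the message above" ;;
    esac
}

# Report the current setting if the file is readable (it may not be
# visible inside every container).
paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    level=$(cat "$paranoid_file")
    echo "perf_event_paranoid=$level: $(describe_paranoid "$level")"
fi
```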

There is a middle ground between monitoring a process/set of processes
and monitoring the entire machine that perf could do a better job of.
There are three broad categories for scoping the data collection: pid,
cgroup, and container.

For pid, perf can monitor itself or a set of processes.  perf has
checks to make sure that the monitoring process has permission to
monitor the other processes.

For cgroup monitoring, the perf implementation only works with
system-wide monitoring ("-a").  Why should someone need to specify "-a"
when they specify the cgroup to monitor?  It seems like this should
operate much more like the selection of processes to monitor.
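
As a rough illustration of what "like the selection of processes" could
mean today, the PIDs currently in a cgroup can be turned into a "-p"
list.  This is only a sketch under stated assumptions: the helper name
pids_csv is made up, and the location of the cgroup.procs file depends
on how the cgroup filesystem is mounted on a given system.  It is also
just a snapshot, so tasks that join the cgroup later are not covered,
which makes it weaker than real cgroup scoping.

```shell
#!/bin/sh
# Join the PIDs listed one per line in a cgroup.procs-style file into
# the comma-separated form that "perf stat -p" accepts.
pids_csv() {
    # paste -s joins all input lines; -d, uses a comma as the separator.
    paste -s -d, "$1"
}

# Example use (not run here; the path is an assumption about the local
# cgroup mount layout, and <cgroup> is a placeholder):
#   perf stat -e cycles,instructions \
#       -p "$(pids_csv /sys/fs/cgroup/perf_event/<cgroup>/cgroup.procs)" \
#       -- sleep 1
```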

For containers one might want to either monitor a container from
outside for system health or from the inside while doing development.
Currently perf can monitor from outside but isn't able to monitor from
inside.

Can something be done to improve the usability of perf for cgroups and
containers?

-Will


* Re: Using perf with cgroups and containers
  2014-11-26 16:59 Using perf with cgroups and containers William Cohen
@ 2014-11-26 20:52 ` Andi Kleen
  2014-11-26 21:29   ` William Cohen
From: Andi Kleen @ 2014-11-26 20:52 UTC
  To: William Cohen; +Cc: linux-perf-users

William Cohen <wcohen@redhat.com> writes:

> Hi,
>
> I have been looking at how perf supports cgroups and containers.  The
> "-G" option allows limiting the data collected to a particular cgroup.
> Thus, one can use the option to collect some information about a
> particular cgroup with something like:
>
> $ sudo perf stat -a -e cycles  -G
> machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -e instructions
> -G machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -- sleep 1

You can specify multiple events with -e. Typically you should anyways,
to define appropriate groups with {}

perf record -a -e cycles,instructions -G cgroup  ...

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only


* Re: Using perf with cgroups and containers
  2014-11-26 20:52 ` Andi Kleen
@ 2014-11-26 21:29   ` William Cohen
  2014-11-28 18:00     ` Andi Kleen
From: William Cohen @ 2014-11-26 21:29 UTC
  To: Andi Kleen; +Cc: linux-perf-users

On 11/26/2014 03:52 PM, Andi Kleen wrote:
> William Cohen <wcohen@redhat.com> writes:
> 
>> Hi,
>>
>> I have been looking at how perf supports cgroups and containers.  The
>> "-G" option allows limiting the data collected to a particular cgroup.
>> Thus, one can use the option to collect some information about a
>> particular cgroup with something like:
>>
>> $ sudo perf stat -a -e cycles  -G
>> machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -e instructions
>> -G machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -- sleep 1
> 
> You can specify multiple events with -e. Typically you should anyways,
> to define appropriate groups with {}
> 
> perf record -a -e cycles,instructions -G cgroup  ...
> 
> -Andi
> 

Hi Andi,

Is there somewhere that explains the use of "{}" for event grouping?  The various perf man pages I have looked at (perf-record, perf-stat, and perf) don't seem to mention it.  When reading the following from "man perf-record" it sounded like the comma-separated event list wouldn't work:

      -G name,..., --cgroup name,...
           monitor only in the container (cgroup) called "name". This option
           is available only in per-cpu mode. The cgroup filesystem must be
           mounted. All threads belonging to container "name" are monitored
           when they run on the monitored CPUs. Multiple cgroups can be
           provided. Each cgroup is applied to the corresponding event, i.e.,
           first cgroup to first event, second cgroup to second event and so
           on. It is possible to provide an empty cgroup (monitor all the
           time) using, e.g., -G foo,,bar. Cgroups must have corresponding
           events, i.e., they always refer to events defined earlier on the
           command line.

The results look pretty questionable on my machine with the version of perf and kernel I am using:

$ uname -a
Linux santana 3.17.3-200.fc20.x86_64 #1 SMP Fri Nov 14 19:45:42 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ rpm -q perf
perf-3.17.3-200.fc20.x86_64
$ sudo perf stat -a -e cycles,instructions -G machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -- sleep .2

 Performance counter stats for 'system wide':

         2,983,776      cycles                    machine.slice/machine-qemu\x2drhel7\x2dx86_64.scope [74.96%]
        87,972,874      instructions              #   29.48  insns per cycle         [100.00%]

       0.201151985 seconds time elapsed

$ sudo perf stat -a -e "{cycles,instructions}" -G machine.slice/machine-qemu\\x2drhel7\\x2dx86_64.scope -- sleep .2

 Performance counter stats for 'system wide':

         2,512,934      cycles                    machine.slice/machine-qemu\x2drhel7\x2dx86_64.scope [82.53%]
       813,334,082      instructions              #   323.66  insns per cycle         [ 0.09%]

       0.201360285 seconds time elapsed
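
Going by the man page excerpt above (first cgroup to first event, second
cgroup to second event), one way to cover every event is to repeat the
cgroup name once per event in the -G list.  The helper below is only a
sketch: the name cgroup_list is made up, and it assumes plain event
names, since it counts events by counting commas and would miscount an
event specification that contains commas of its own.

```shell
#!/bin/sh
# Repeat one cgroup name once per event so that every event in a
# comma-separated -e list gets a matching entry in the -G list.
cgroup_list() {
    events=$1   # e.g. "cycles,instructions"
    cgroup=$2   # cgroup name as it would be passed to -G
    list=$cgroup
    # Each comma in the event list marks one more event after the first.
    commas=$(printf '%s' "$events" | tr -cd ',')
    n=${#commas}
    while [ "$n" -gt 0 ]; do
        list="$list,$cgroup"
        n=$((n - 1))
    done
    printf '%s\n' "$list"
}

# Example use (not run here). Single quotes keep \x2d literal, which is
# what the doubled backslashes achieve in the unquoted commands above:
#   events=cycles,instructions
#   cg='machine.slice/machine-qemu\x2drhel7\x2dx86_64.scope'
#   sudo perf stat -a -e "$events" -G "$(cgroup_list "$events" "$cg")" -- sleep 1
```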


-Will


* Re: Using perf with cgroups and containers
  2014-11-26 21:29   ` William Cohen
@ 2014-11-28 18:00     ` Andi Kleen
From: Andi Kleen @ 2014-11-28 18:00 UTC
  To: William Cohen; +Cc: Andi Kleen, linux-perf-users

> Is there somewhere that explains the use of "{}" for event grouping?

The only good description I know of is in the ucevent documentation.

https://github.com/andikleen/pmu-tools/tree/master/ucevent#grouping-event-scheduling-and-measurement-inaccuracy

Yes, the documentation probably needs to be improved (as in many other
ways).
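
For reference, a group is written by wrapping a comma-separated event
list in braces.  The sketch below builds such a string; the helper name
group_spec is made up for illustration.  The result should be quoted
when passed to perf, because shells with brace expansion (bash, for
example) would otherwise split {cycles,instructions} into two separate
words before perf sees it.

```shell
#!/bin/sh
# Build a perf event group specification such as {cycles,instructions}
# from a list of event names given as separate arguments.
group_spec() {
    spec=""
    for ev in "$@"; do
        # Prefix a comma separator for every event after the first.
        spec="${spec:+$spec,}$ev"
    done
    printf '{%s}\n' "$spec"
}

# Example use (not run here):
#   perf stat -e "$(group_spec cycles instructions)" -- sleep 1
```

As the perf 3.17 output earlier in the thread shows, grouping alone does
not make a single -G entry apply to every event in the group.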

> 
>       -G name,..., --cgroup name,...
>            monitor only in the container (cgroup) called "name". This option
>            is available only in per-cpu mode. The cgroup filesystem must be
>            mounted. All threads belonging to container "name" are monitored
>            when they run on the monitored CPUs. Multiple cgroups can be
>            provided. Each cgroup is applied to the corresponding event, i.e.,
>            first cgroup to first event, second cgroup to second event and so
>            on. It is possible to provide an empty cgroup (monitor all the
>            time) using, e.g., -G foo,,bar. Cgroups must have corresponding
>            events, i.e., they always refer to events defined earlier on the
>            command line.
> 
> The results look pretty questionable on my machine with the version of perf and kernel I am using:

You're right, it doesn't work as I described. Should probably fix it,
my way would make a lot more sense :-)

I guess the current interface was more aimed at scripts.


-Andi
-- 
ak@linux.intel.com -- Speaking for myself only.
