From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756902Ab0IUJoQ (ORCPT ); Tue, 21 Sep 2010 05:44:16 -0400 Received: from bombadil.infradead.org ([18.85.46.34]:55426 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755497Ab0IUJoP convert rfc822-to-8bit (ORCPT ); Tue, 21 Sep 2010 05:44:15 -0400 Subject: Re: [RFC PATCH 0/2] perf_events: add support for per-cpu per-cgroup monitoring (v3) From: Peter Zijlstra To: eranian@google.com Cc: linux-kernel@vger.kernel.org, mingo@elte.hu, paulus@samba.org, davem@davemloft.net, fweisbec@gmail.com, perfmon2-devel@lists.sf.net, eranian@gmail.com, robert.richter@amd.com, acme@redhat.com, Paul Menage , Li Zefan , Balbir Singh In-Reply-To: <1285061899.2275.824.camel@laptop> References: <4c88dc9c.991ce30a.3d91.3e0e@mx.google.com> <1285061899.2275.824.camel@laptop> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8BIT Date: Tue, 21 Sep 2010 11:43:48 +0200 Message-ID: <1285062228.2275.826.camel@laptop> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2010-09-21 at 11:38 +0200, Peter Zijlstra wrote: > On Thu, 2010-09-09 at 15:05 +0200, Stephane Eranian wrote: > > The cgroup to monitor is designated by passing a file descriptor opened > > on a new per-cgroup file in the cgroup filesystem (perf_event.perf). The > > option must be activated by setting perf_event_attr.cgroup=1 and passing > > a valid file descriptor in perf_event_attr.cgroup_fd. Those are the only > > two ABI extensions. > > > +++ b/include/linux/perf_event.h > > @@ -215,8 +215,9 @@ struct perf_event_attr { > > */ > > precise_ip : 2, /* skid constraint */ > > mmap_data : 1, /* non-exec mmap data */ > > + cgroup : 1, /* cgroup aggregation */ > > > > - __reserved_1 : 46; > > + __reserved_1 : 45; > > > > union { > > __u32 wakeup_events; /* wakeup every n events */ > > @@ -226,6 +227,8 @@ struct perf_event_attr { > > __u32 bp_type; > > __u64 bp_addr; > > __u64 bp_len; > > + > > + int cgroup_fd; > > }; > > > > /* > > I'm not sure I like this much.. so we attach to {pid,cpu}, for nodes we > can use cpu_to_node(cpu), which would suggest to use > cgroup_of_task(pid), except that a task can be part of multiple cgroups, > so its not unique. > > One thing we could do is pass this cgroup identifier in the pid field > and use PERF_FLAG_CGROUP or something. Currently the syscall signature > uses pid_t, but I think we can safely change that to int. > > You create a special new file in the cgroup stuff, I'm not sure about > that either, but its not something I feel too strongly about, why > wouldn't a fd of any file or even directory of that cgroup work? Do the > cgroup people have an opinion? Ahh, I just read more of the patch, and you create a full perf cgroup, in which case cgroup_of_task(pid) will work, simply pick the perf cgroup's tasks. No need to actually create that file, open it and pass fds around, just pick a task from that cgroup and attach to the cgroup through that.