linux-perf-users.vger.kernel.org archive mirror
From: William Cohen <wcohen@redhat.com>
To: Andi Kleen <andi@firstfloor.org>
Cc: Elazar Leibovich <elazar.leibovich@ravellosystems.com>,
	linux-perf-users@vger.kernel.org,
	Stephane Eranian <eranian@google.com>
Subject: Re: Why the need to do a perf_event_open syscall for each cpu on the system?
Date: Tue, 17 Mar 2015 11:30:48 -0400	[thread overview]
Message-ID: <550848A8.10003@redhat.com> (raw)
In-Reply-To: <87twxjvg92.fsf@tassilo.jf.intel.com>

On 03/17/2015 10:40 AM, Andi Kleen wrote:
> William Cohen <wcohen@redhat.com> writes:
>>
>> Making user-space set up performance events for each cpu certainly
>> simplifies the kernel code for system-wide monitoring. The cgroup
>> support is essentially like system-wide monitoring with additional
>> filtering on the cgroup and things get more complicated using the perf
>> cgroup support when the cgroups are not pinned to a particular
>> processor, O(cgroups*cpus) opens and reads.  If the cgroups is scaled
>> up at the same rate as cpus, this would be O(cpus^2).  I am wondering
> 
> Using O() notation here is misleading because a perf event 
> is not an algorithmic step. It's just a data structure in memory,
> associated with a file descriptor.  But the number of active
> events at a time is always limited by the number of counters
> in the CPU (ignoring software events here) and is comparably
> small.
> 
> The memory usage is not a significant problem, it is dwarfed by other
> data structures per CPU.  Usually the main problem people run into is
> running out of file descriptors because most systems still run with a
> ulimit -n default of 1024, which is easy to reach with even a small
> number of event groups on a system with a moderate number of CPUs.
> 
> However ulimit -n can be easily fixed: just increase it. Arguably
> the distribution defaults should probably be increased.
> 
> -Andi
> 

Hi Andi,

O() notation can describe both time and space; reading the perf counters is O(cpus^2) in time. As mentioned, the number of file descriptors required grows quickly as the number of cpus/cgroups increases: 32 cpus and 32 cgroups would be 1024 file descriptors; 80 cpus and 80 cgroups would be 6400 file descriptors.  There are machines with more than 80 processors.  Does it make sense to need many thousands of file descriptors for performance monitoring?

Making user space responsible for opening and reading the counters on each processor simplifies the kernel code.  However, is having user space do this a better solution than doing the system-wide setup and aggregation in the kernel?  How much overhead do all the user-space/kernel-space transitions add when reading hundreds of values out of the kernel, versus doing the aggregation in the kernel?

-Will


Thread overview: 7+ messages
2015-03-13 18:49 Why the need to do a perf_event_open syscall for each cpu on the system? William Cohen
2015-03-13 21:14 ` Vince Weaver
2015-03-15  5:15 ` Elazar Leibovich
2015-03-16 14:47   ` William Cohen
2015-03-17  0:51     ` Stephane Eranian
2015-03-17 14:40     ` Andi Kleen
2015-03-17 15:30       ` William Cohen [this message]
