linux-perf-users.vger.kernel.org archive mirror
* perf for analyzing userspace contention
@ 2011-10-14  6:36 Roland Dreier
  2011-10-14  6:58 ` Maucci, Cyrille
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Roland Dreier @ 2011-10-14  6:36 UTC (permalink / raw)
  To: linux-perf-users

Hi everyone,

Hope this question from a kernel hacker about profiling userspace
isn't too dumb...

Anyway, suppose I have a multithreaded userspace app that uses a bunch of
pthread_mutexes, and I want to figure out which locks are hot and/or heavily
contended.  What's the best way to do that?  Is perf the right tool, or is there
something better?  (This seems like such an obvious thing to want that there
must be some good way to get this data, I hope)

Thanks!
  Roland

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: perf for analyzing userspace contention
  2011-10-14  6:36 perf for analyzing userspace contention Roland Dreier
@ 2011-10-14  6:58 ` Maucci, Cyrille
  2011-10-14  7:21   ` Roland Dreier
  2011-10-14  9:42 ` Stefan Hajnoczi
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Maucci, Cyrille @ 2011-10-14  6:58 UTC (permalink / raw)
  To: Roland Dreier, linux-perf-users@vger.kernel.org

Are those mutexes acquired and released from different call sites, or from a single generic function of yours (like LOCK/UNLOCK) that takes the mutex as a parameter?

If they come from different call paths, I'd say perf is fine for troubleshooting this.
If they all go through a single generic call path, I don't know whether perf already has the capability that HP Caliper has on HP-UX to provide a breakdown.

++Cyrille

--
To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in the body of a message to majordomo@vger.kernel.org More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf for analyzing userspace contention
  2011-10-14  6:58 ` Maucci, Cyrille
@ 2011-10-14  7:21   ` Roland Dreier
  2011-10-14  7:33     ` Maucci, Cyrille
  0 siblings, 1 reply; 10+ messages in thread
From: Roland Dreier @ 2011-10-14  7:21 UTC (permalink / raw)
  To: Maucci, Cyrille; +Cc: linux-perf-users@vger.kernel.org

On Thu, Oct 13, 2011 at 11:58 PM, Maucci, Cyrille <cyrille.maucci@hp.com> wrote:
> Are those mutexes acquired and released from different call sites, or from a single generic function of yours (like LOCK/UNLOCK) that takes the mutex as a parameter?

I have lots of code paths taking lots of different classes of locks.

My problem is I don't know what to ask perf for to see if I have a mutex that
is heavily contended, and so some threads are spending a lot of time asleep
waiting to acquire it.  As I said, this is probably pretty basic stuff, but
I'm not usually much of a userspace guy...

Thanks,
  Roland

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: perf for analyzing userspace contention
  2011-10-14  7:21   ` Roland Dreier
@ 2011-10-14  7:33     ` Maucci, Cyrille
  0 siblings, 0 replies; 10+ messages in thread
From: Maucci, Cyrille @ 2011-10-14  7:33 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-perf-users@vger.kernel.org

I don't know the POSIX mutex implementation on Linux, but on HP-UX mutexes usually spin first, then yield, then spin again, and so on, until they finally sleep.
So a tool like perf, which reports CPU consumption, would reveal the problem by showing the mutex code paths in the profile.
On HP-UX, the spin/yield/sleep behavior of POSIX mutexes can be tuned through environment variables.
If the same is true on Linux, you could temporarily set the number of spins to a very high value to emphasize the hotness of the contention through CPU usage, which is what perf has been meant to measure since day one.

++Cyrille
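
To make the spin/yield/sleep policy described above concrete, here is a minimal sketch in Python. It only illustrates the policy; it is not the HP-UX or glibc implementation, and the function name and the `spins_per_round` and `rounds` parameters are made up for this example.

```python
import threading
import time


def acquire_spin_yield_block(lock, spins_per_round=100, rounds=10):
    """Acquire `lock`; return how many non-blocking attempts failed first.

    A return value of 0 means the lock was free on the first try.
    """
    failed = 0
    for _ in range(rounds):
        for _ in range(spins_per_round):
            if lock.acquire(blocking=False):
                return failed
            failed += 1
        time.sleep(0)  # yield the CPU before the next round of spinning
    lock.acquire()  # spinning did not help: block until the lock is free
    return failed
```

Raising `spins_per_round` in a scheme like this moves waiting time from sleeping to burning CPU, which is why a heavily contended lock would then stand out in a CPU profile.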


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf for analyzing userspace contention
  2011-10-14  6:36 perf for analyzing userspace contention Roland Dreier
  2011-10-14  6:58 ` Maucci, Cyrille
@ 2011-10-14  9:42 ` Stefan Hajnoczi
  2011-10-14 16:58   ` Roland Dreier
  2011-10-14 21:25 ` David Ahern
  2011-10-17 23:43 ` Arun Sharma
  3 siblings, 1 reply; 10+ messages in thread
From: Stefan Hajnoczi @ 2011-10-14  9:42 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-perf-users

On Fri, Oct 14, 2011 at 7:36 AM, Roland Dreier <roland@kernel.org> wrote:
> Anyway, suppose I have a multithreaded userspace app that uses a bunch of
> pthread_mutexes, and I want to figure out which locks are hot and/or heavily
> contended.  What's the best way to do that?  Is perf the right tool, or is there
> something better?  (This seems like such an obvious thing to want that there
> must be some good way to get this data, I hope)

Hi Roland,
I haven't tried this tool myself yet but have been wanting to for some time:
http://git.0pointer.de/?p=mutrace.git

Here's the full overview of what it does:
http://0pointer.de/blog/projects/mutrace.html

Stefan
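
For a sense of the kind of numbers a mutex profiler can collect, here is a minimal sketch of a lock wrapper that counts total and contended acquisitions by trying a non-blocking acquire first. This is an assumption about the general approach, not mutrace's actual code; `CountingLock` and its attributes are hypothetical names.

```python
import threading


class CountingLock:
    """A lock that records how often it was taken and how often it was busy."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0  # total successful acquisitions
        self.contended = 0     # acquisitions that had to wait

    def acquire(self):
        # A failed non-blocking attempt means another holder has the lock.
        had_to_wait = not self._lock.acquire(blocking=False)
        if had_to_wait:
            self._lock.acquire()
        # The counters are only touched while the lock is held, so they
        # need no extra synchronization.
        self.acquisitions += 1
        if had_to_wait:
            self.contended += 1

    def release(self):
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
```

A lock with a high ratio of `contended` to `acquisitions` is a candidate for closer inspection.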

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf for analyzing userspace contention
  2011-10-14  9:42 ` Stefan Hajnoczi
@ 2011-10-14 16:58   ` Roland Dreier
  0 siblings, 0 replies; 10+ messages in thread
From: Roland Dreier @ 2011-10-14 16:58 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: linux-perf-users

On Fri, Oct 14, 2011 at 2:42 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> I haven't tried this tool myself yet but have been wanting to for some time:
> http://git.0pointer.de/?p=mutrace.git

Thanks a lot for the reminder!  Now that you mention it, I remember Lennart's
announcement, but I had completely forgotten about it.  Giving it a shot now.

 - R.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf for analyzing userspace contention
  2011-10-14  6:36 perf for analyzing userspace contention Roland Dreier
  2011-10-14  6:58 ` Maucci, Cyrille
  2011-10-14  9:42 ` Stefan Hajnoczi
@ 2011-10-14 21:25 ` David Ahern
  2011-10-14 22:43   ` Arnaldo Carvalho de Melo
  2011-10-17 23:43 ` Arun Sharma
  3 siblings, 1 reply; 10+ messages in thread
From: David Ahern @ 2011-10-14 21:25 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-perf-users



On 10/14/2011 12:36 AM, Roland Dreier wrote:
> Anyway, suppose I have a multithreaded userspace app that uses a bunch of
> pthread_mutexes, and I want to figure out which locks are hot and/or heavily
> contended.  What's the best way to do that?  Is perf the right tool, or is there
> something better?  (This seems like such an obvious thing to want that there
> must be some good way to get this data, I hope)

You can capture futex entry and exit with:

perf record -e syscalls:sys_enter_futex -e syscalls:sys_exit_futex

Add -g to get the callchains. From there, perf script will dump the events.

David
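
A small sketch of how the two steps could be driven from a script, assuming the recording is done with `perf record` and the dump with `perf script`. The helper names, the default `perf.data` path and the choice to attach to one process with `-p` are assumptions made for this example.

```python
FUTEX_EVENTS = ("syscalls:sys_enter_futex", "syscalls:sys_exit_futex")


def record_cmd(pid, output="perf.data", callchains=True):
    """Build the argv for recording futex entry/exit events of one process."""
    cmd = ["perf", "record", "-o", output]
    for event in FUTEX_EVENTS:
        cmd += ["-e", event]
    if callchains:
        cmd.append("-g")  # also record callchains
    cmd += ["-p", str(pid)]
    return cmd


def script_cmd(input_file="perf.data"):
    """Build the argv for dumping the recorded events as text."""
    return ["perf", "script", "-i", input_file]
```

Each list can be passed to `subprocess.run()`. Recording tracepoints usually needs elevated privileges, and useful userspace callchains depend on how the application was compiled.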

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf for analyzing userspace contention
  2011-10-14 21:25 ` David Ahern
@ 2011-10-14 22:43   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-10-14 22:43 UTC (permalink / raw)
  To: David Ahern; +Cc: Roland Dreier, linux-perf-users

Em Fri, Oct 14, 2011 at 03:25:49PM -0600, David Ahern escreveu:
> 
> 
> On 10/14/2011 12:36 AM, Roland Dreier wrote:
> > Anyway, suppose I have a multithreaded userspace app that uses a bunch of
> > pthread_mutexes, and I want to figure out which locks are hot and/or heavily
> > contended.  What's the best way to do that?  Is perf the right tool, or is there
> > something better?  (This seems like such an obvious thing to want that there
> > must be some good way to get this data, I hope)
> 
> You can capture futex entry and exit with:
> 
> perf record -e syscalls:sys_enter_futex -e syscalls:sys_exit_futex
> 
> add -g to get the callchains. From there perf-script will dump the events.

Right, if userspace was compiled with -fno-omit-frame-pointer, instant
karma :-)

With what we have now in tip/master:

perf top -G

And enjoy it live :-)

- Arnaldo

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf for analyzing userspace contention
  2011-10-14  6:36 perf for analyzing userspace contention Roland Dreier
                   ` (2 preceding siblings ...)
  2011-10-14 21:25 ` David Ahern
@ 2011-10-17 23:43 ` Arun Sharma
  2011-10-18  0:12   ` David Ahern
  3 siblings, 1 reply; 10+ messages in thread
From: Arun Sharma @ 2011-10-17 23:43 UTC (permalink / raw)
  To: Roland Dreier; +Cc: linux-perf-users

On Thu, Oct 13, 2011 at 11:36:23PM -0700, Roland Dreier wrote:
> Hi everyone,
> 
> Hope this question from a kernel hacker about profiling userspace
> isn't too dumb...
> 
> Anyway, suppose I have a multithreaded userspace app that uses a bunch of
> pthread_mutexes, and I want to figure out which locks are hot and/or heavily
> contended.  What's the best way to do that?  Is perf the right tool, or is there
> something better?  (This seems like such an obvious thing to want that there
> must be some good way to get this data, I hope)

Sampling on sys_futex entry/exit is a good start. But it doesn't tell
you if the thread spent 1us or 1ms waiting on the futex. We really need
to add weights to the profile based on sleep times (or other criteria).

You might want to follow further discussion in the thread with the
subject "Profiling sleep times?".

perf record -ag -e cs -- sleep 1

is a more general version of this. It helps in understanding what's
causing the thread to give up CPU voluntarily (you might have to filter
the output a bit).

 -Arun

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: perf for analyzing userspace contention
  2011-10-17 23:43 ` Arun Sharma
@ 2011-10-18  0:12   ` David Ahern
  0 siblings, 0 replies; 10+ messages in thread
From: David Ahern @ 2011-10-18  0:12 UTC (permalink / raw)
  To: Arun Sharma; +Cc: Roland Dreier, linux-perf-users



On 10/17/2011 05:43 PM, Arun Sharma wrote:
> On Thu, Oct 13, 2011 at 11:36:23PM -0700, Roland Dreier wrote:
>> Hi everyone,
>>
>> Hope this question from a kernel hacker about profiling userspace
>> isn't too dumb...
>>
>> Anyway, suppose I have a multithreaded userspace app that uses a bunch of
>> pthread_mutexes, and I want to figure out which locks are hot and/or heavily
>> contended.  What's the best way to do that?  Is perf the right tool, or is there
>> something better?  (This seems like such an obvious thing to want that there
>> must be some good way to get this data, I hope)
> 
> Sampling on sys_futex entry/exit is a good start. But it doesn't tell
> you if the thread spent 1us or 1ms waiting on the futex. We really need
> to add weights to the profile based on sleep times (or other criteria).

Doesn't the delta between the entry and exit timestamps tell you how long it
spent waiting?

> 
> You might want to follow further discussion in the thread with the
> subject "Profiling sleep times?".
> 
> perf record -ag -e cs -- sleep 1

And augmenting with cs tells you whether it was scheduled out. (Isn't -c 1
required? I've always added it for software events.)

David
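
A minimal sketch of the delta idea, assuming the futex events have already been extracted into time-ordered tuples of (tid, timestamp in ns, kind, uaddr). That tuple layout is an assumption for this example; turning perf script output into such tuples is not shown. The futex address is taken from the entry event and used as the key, so the result is total time spent inside futex calls per lock word.

```python
from collections import defaultdict


def futex_time_by_addr(events):
    """Sum the time between each futex entry and its matching exit.

    `events` is a time-ordered iterable of (tid, ts_ns, kind, uaddr) with
    kind "enter" or "exit"; uaddr is only read from "enter" events.
    Returns {uaddr: total_ns}.
    """
    pending = {}  # tid -> (entry timestamp, uaddr) of the call in flight
    totals = defaultdict(int)
    for tid, ts_ns, kind, uaddr in events:
        if kind == "enter":
            pending[tid] = (ts_ns, uaddr)
        elif kind == "exit" and tid in pending:
            start_ns, addr = pending.pop(tid)
            totals[addr] += ts_ns - start_ns
    return dict(totals)
```

Weighting each callchain by these deltas instead of counting events is what separates a 1us wait from a 1ms wait. Note that not every futex call is a wait, so very short deltas will include wake-ups as well.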


^ permalink raw reply	[flat|nested] 10+ messages in thread

Thread overview: 10+ messages
2011-10-14  6:36 perf for analyzing userspace contention Roland Dreier
2011-10-14  6:58 ` Maucci, Cyrille
2011-10-14  7:21   ` Roland Dreier
2011-10-14  7:33     ` Maucci, Cyrille
2011-10-14  9:42 ` Stefan Hajnoczi
2011-10-14 16:58   ` Roland Dreier
2011-10-14 21:25 ` David Ahern
2011-10-14 22:43   ` Arnaldo Carvalho de Melo
2011-10-17 23:43 ` Arun Sharma
2011-10-18  0:12   ` David Ahern
