From: Frederic Weisbecker <fweisbec@gmail.com>
To: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>,
linux-kernel@vger.kernel.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Paul Mackerras <paulus@samba.org>,
Tom Zanussi <tzanussi@gmail.com>,
Steven Rostedt <srostedt@redhat.com>
Subject: Re: [PATCH 2/2] perf lock: New subcommand "lock" to perf for analyzing lock statistics
Date: Mon, 7 Dec 2009 05:41:26 +0100 [thread overview]
Message-ID: <20091207044125.GB5262@nowhere> (raw)
In-Reply-To: <1260156884-8474-2-git-send-email-mitake@dcl.info.waseda.ac.jp>
On Mon, Dec 07, 2009 at 12:34:44PM +0900, Hitoshi Mitake wrote:
> This patch adds new subcommand "lock" to perf for analyzing lock usage statistics.
> Current perf lock is very primitive. This cannot provide the minimum functions.
> Of course I continue to working on this.
> But too big patch is not good thing for you, so I post this.
Oh great!
Yeah, the work can be done incrementally.
>
> And I found some important problem, so I'd like to ask your opinion.
> For another issue, this patch depends on the previous one.
> The previous one is very dirty and temporary, I cannot sign on it, so I cannot sign on this too...
The previous one looks rather good actually.
> First, it seems that current locks (spinlock, rwlock, mutex) has no numeric ID.
> So we can't treat rq->lock on CPU 0 and rq->lock on CPU 1 as different things.
> Symbol name of locks cannot be complete ID.
> This is the result of current ugly data structure for lock_stat
> (data structure for stat per lock in builtin-lock.c).
> Hash table will solve the problem of speed,
> but it is not a radical solution.
> I understand it is hard to implement numeric IDs for locks,
> but it is required seriously, do you have some ideas?
Indeed. I think every lock instance has its own lockdep_map.
And this lockdep_map is passed in every lock event but is
only used to retrieve the name of the lock.
Why not adding the address of the lockdep_map in the event?
> Second, there's a lot of lack of information from trace events.
> For example, current lock event subsystem cannot provide the time between
> lock_acquired and lock_release.
> But this time is already measured in lockdep, and we can obtain it
> from /proc/lock_stat.
> But /proc/lock_stat provides information from boot time only.
> So I have to modify wide area of kernel including lockdep, may I do this?
I think this is more something to compute in a state machine:
lock_release - lock_acquired event.
This is what we do with sched events in perf sched latency
Also I think we should remove the field that gives the time waited
between lock_acquire and lock_acquired. This is more something that
should be done in userspace instead of calculating in from the kernel.
This brings overhead in the wrong place.
>
> Third, siginificant overhead :-(
>
> % perf bench sched messaging # Without perf lock rec
>
> Total time: 0.436 [sec]
>
> % sudo ./perf lock rec perf bench sched messaging # With perf lock rec
>
> Total time: 4.677 [sec]
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 106.065 MB perf.data (~4634063 samples) ]
>
> Over 10 times! No one can ignore this...
I think that the lock events are much more sensible than the sched events,
and that by nature: these are very high frequency events class, probably the
highest among every event classes we have (the worst beeing function tracing :)
But still, you're right, there are certainly various things we need to
optimize in this area.
More than 8 times slower is high.
>
> This is example of using perf lock prof:
> % sudo ./perf lock prof # Outputs in pager
> ------------------------------------------------------------------------------------------
> Lock | Acquired | Max wait ns | Min wait ns | Total wait ns |
> --------------------------------------------------------------------------------------------
> &q->lock 30 0 0 0
> &ctx->lock 3912 0 0 0
> event_mutex 2 0 0 0
> &newf->file_lock 1008 0 0 0
> dcache_lock 444 0 0 0
> &dentry->d_lock 1164 0 0 0
> &ctx->mutex 2 0 0 0
> &child->perf_event_mutex 2 0 0 0
> &event->child_mutex 18 0 0 0
> &f->f_lock 2 0 0 0
> &event->mmap_mutex 2 0 0 0
> &sb->s_type->i_mutex_key 259 0 0 0
> &sem->wait_lock 27205 0 0 0
> &(&ip->i_iolock)->mr_lock 130 0 0 0
> &(&ip->i_lock)->mr_lock 6376 0 0 0
> &parent->list_lock 9149 7367 146 527013
> &inode->i_data.tree_lock 12175 0 0 0
> &inode->i_data.private_lock 6097 0 0 0
Very nice and promising!
I can't wait to try it.
next prev parent reply other threads:[~2009-12-07 4:41 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-11-12 6:43 [PATCH][RFC] Measuring term of acquiring spinlock Hitoshi Mitake
2009-11-12 7:39 ` Ingo Molnar
2009-11-13 4:21 ` Hitoshi Mitake
2009-11-13 8:17 ` Ingo Molnar
2009-11-13 8:24 ` Peter Zijlstra
2009-11-13 8:40 ` Frederic Weisbecker
2009-11-13 8:51 ` Ingo Molnar
2009-11-13 9:06 ` [PATCH] tracing: Rename lockdep event subsystem into lock Frederic Weisbecker
2009-11-13 9:10 ` Peter Zijlstra
2009-11-13 9:26 ` Ingo Molnar
2009-11-13 9:36 ` Frederic Weisbecker
2009-11-13 9:31 ` [tip:perf/core] tracing: Rename 'lockdep' event subsystem into 'lock' tip-bot for Frederic Weisbecker
2009-11-13 9:57 ` tip-bot for Frederic Weisbecker
2009-11-13 10:51 ` [PATCH][RFC] Measuring term of acquiring spinlock Frederic Weisbecker
2009-11-15 1:20 ` Hitoshi Mitake
2009-11-15 2:21 ` Frederic Weisbecker
2009-11-15 8:38 ` Hitoshi Mitake
2009-12-07 3:34 ` [PATCH 1/2] Does raw_field_ptr() supports __data_loc? Hitoshi Mitake
2009-12-07 4:02 ` Frederic Weisbecker
2009-12-07 16:09 ` Steven Rostedt
2009-12-07 3:34 ` [PATCH 2/2] perf lock: New subcommand "lock" to perf for analyzing lock statistics Hitoshi Mitake
2009-12-07 4:41 ` Frederic Weisbecker [this message]
2009-12-07 7:27 ` Ingo Molnar
2009-12-07 8:38 ` Xiao Guangrong
2009-12-07 15:00 ` Hitoshi Mitake
2009-12-07 16:38 ` Steven Rostedt
2009-12-07 19:48 ` Frederic Weisbecker
2009-12-07 19:57 ` Frederic Weisbecker
2009-12-08 1:31 ` Xiao Guangrong
2009-12-07 14:57 ` Hitoshi Mitake
2009-12-07 14:51 ` Hitoshi Mitake
2009-12-07 20:16 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20091207044125.GB5262@nowhere \
--to=fweisbec@gmail.com \
--cc=a.p.zijlstra@chello.nl \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=mitake@dcl.info.waseda.ac.jp \
--cc=paulus@samba.org \
--cc=srostedt@redhat.com \
--cc=tzanussi@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.