From: Frederic Weisbecker <fweisbec@gmail.com>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Paul Mackerras <paulus@samba.org>,
	Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>,
	Li Zefan <lizf@cn.fujitsu.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Masami Hiramatsu <mhiramat@redhat.com>
Subject: Re: [RFC GIT PULL] perf/trace/lock optimization/scalability improvements
Date: Wed, 3 Feb 2010 21:50:12 +0100
Message-ID: <20100203205009.GB5068@nowhere>
In-Reply-To: <20100203102540.GQ5733@kernel.dk>

On Wed, Feb 03, 2010 at 11:25:41AM +0100, Jens Axboe wrote:
> On Wed, Feb 03 2010, Frederic Weisbecker wrote:
> > Hi,
> >
> > There are many things that happen in this patchset, treating
> > different problems:
> >
> > - remove most of the string copy overhead in fast path
> > - open the way for lock class oriented profiling (as
> >   opposed to lock instance profiling. Both can be useful
> >   in different ways).
> > - remove the buffers multiplexing (less contention)
> > - event injection support
> > - remove violent lock events recursion (only 2 among 3, the remaining
> >   one is detailed below).
> >
> > Some differences, by running:
> > perf lock record perf sched pipe -l 100000
> >
> > Before the patchset:
> >
> > Total time: 91.015 [sec]
> >
> > 910.157300 usecs/op
> > 1098 ops/sec
> >
> > After this patchset applied:
> >
> > Total time: 43.706 [sec]
> >
> > 437.062080 usecs/op
> > 2288 ops/sec
>
> This does a lot better here, even if it isn't exactly stellar
> performance. It generates a LOT of data:
>
> root@nehalem:/dev/shm # time perf lock rec -fg ls
> perf.data perf.data.old
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 137.224 MB perf.data (~5995421
> samples) ]

Doh, 137 MB for a single ls :)

That said, we don't yet have support for callchains in perf lock,
and callchains can fill the buffer quickly, especially on lock
events. You can drop the -g option for now.

>
> real 0m3.320s
> user 0m0.000s
> sys 0m3.220s
>
> Without -g, it has 1.688s real and 1.590s sys time.
Ok.
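
For reference, the ratios behind the timings quoted in this thread can be
checked with a few lines of C. This is only an arithmetic sketch: the helper
names `time_ratio` and `ratio_self_check` are made up for illustration, and
the inputs are simply the numbers quoted above (total time of the sched pipe
run before and after the patchset, and the real time of the ls run with and
without -g).

```c
#include <assert.h>

/* Hypothetical helper: how many times larger `before` is than `after`. */
static inline double time_ratio(double before, double after)
{
	return before / after;
}

/* Checks the ratios against the figures quoted in this thread. */
static inline void ratio_self_check(void)
{
	/* perf sched pipe total time: 91.015 s before, 43.706 s after. */
	double patchset = time_ratio(91.015, 43.706);
	/* perf lock rec on ls: 3.320 s real with -g, 1.688 s without. */
	double callchains = time_ratio(3.320, 1.688);

	/* Both come out close to a factor of two. */
	assert(patchset > 2.08 && patchset < 2.09);
	assert(callchains > 1.96 && callchains < 1.97);
}
```

So both the patchset and dropping -g are worth roughly a factor of two on
these particular runs, which says nothing about other workloads.
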
> So while this is orders of magnitude better than the previous patchset,
> it's still not anywhere near lean. But I expect you know that, just
> consider this a 'I tested it and this is what happened' report :-)

Ok, thanks a lot. The fact that you can test on a 64-thread box is
extremely helpful.

I also wonder what happens with this patch applied:

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 98fd360..254b3d4 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -3094,7 +3094,8 @@ static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
 	if (event->parent)
 		event = event->parent;
-	return task_pid_nr_ns(p, event->ns);
+	return p->pid;
 }

On my box, it makes things about 2x faster on top of this patchset.
I wonder if the tool becomes usable for you with that.

Otherwise, it means we have other things to fix, and
the result of:

	perf record -g -f perf lock record sleep 6
	perf report

would be very nice to have.

Thanks!
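
A note on what the test patch above trades away: `p->pid` is the task's pid
in the initial (global) pid namespace, while `task_pid_nr_ns(p, event->ns)`
reports the pid as seen from the event's namespace. The sketch below is a
simplified user-space model of that difference, not kernel code. All the
`*_model` types and functions are hypothetical names introduced for
illustration, and the lookup is only assumed to resemble what the kernel
helper has to do.

```c
#include <assert.h>

/* Hypothetical user-space model, NOT the kernel's data structures. */
#define MODEL_MAX_LEVEL 2

struct ns_model {
	int level;			/* 0 = initial (global) namespace */
};

struct upid_model {
	int nr;				/* pid number as seen from ns */
	const struct ns_model *ns;
};

struct task_model {
	int pid;			/* global pid, plays the role of p->pid */
	int level;			/* deepest namespace level of the task */
	struct upid_model numbers[MODEL_MAX_LEVEL];
};

/* Namespace-aware lookup: pid as seen from ns, 0 if not visible there. */
static inline int model_pid_nr_ns(const struct task_model *t,
				  const struct ns_model *ns)
{
	if (ns->level > t->level)
		return 0;
	if (t->numbers[ns->level].ns != ns)
		return 0;
	return t->numbers[ns->level].nr;
}

/* The shortcut taken by the test patch: always the global pid. */
static inline int model_pid_global(const struct task_model *t)
{
	return t->pid;
}

/* Made-up example: a task that is pid 4242 globally and pid 1 in a child ns. */
static inline void model_self_check(void)
{
	struct ns_model init_ns = { 0 };
	struct ns_model child_ns = { 1 };
	struct task_model t = { 4242, 1, { { 4242, &init_ns }, { 1, &child_ns } } };

	/* Same answer when the observer sits in the initial namespace... */
	assert(model_pid_nr_ns(&t, &init_ns) == model_pid_global(&t));
	/* ...but a different one when it sits in the child namespace. */
	assert(model_pid_nr_ns(&t, &child_ns) == 1);
	assert(model_pid_global(&t) == 4242);
}
```

In this model the two lookups agree only when the observer is in the initial
namespace, which is why the one-line change is best read as a measurement
aid rather than as a drop-in replacement.
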