From: Frederic Weisbecker <fweisbec@gmail.com>
To: Jens Axboe <jens.axboe@oracle.com>
Cc: Ingo Molnar <mingo@elte.hu>, LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Arnaldo Carvalho de Melo <acme@redhat.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Paul Mackerras <paulus@samba.org>,
	Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>,
	Li Zefan <lizf@cn.fujitsu.com>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	Masami Hiramatsu <mhiramat@redhat.com>
Subject: Re: [RFC GIT PULL] perf/trace/lock optimization/scalability improvements
Date: Wed, 3 Feb 2010 21:50:12 +0100
Message-ID: <20100203205009.GB5068@nowhere>
In-Reply-To: <20100203102540.GQ5733@kernel.dk>

On Wed, Feb 03, 2010 at 11:25:41AM +0100, Jens Axboe wrote:
> On Wed, Feb 03 2010, Frederic Weisbecker wrote:
> > Hi,
> > 
> > There are many things that happen in this patchset, treating
> > different problems:
> > 
> > - remove most of the string copy overhead in fast path
> > - open the way for lock class oriented profiling (as
> >   opposed to lock instance profiling. Both can be useful
> >   in different ways).
> > - remove the buffers multiplexing (less contention)
> > - event injection support
> > - remove violent lock events recursion (only 2 among 3, the remaining
> >   one is detailed below).
> > 
> > Some differences, by running:
> > 	perf lock record perf sched pipe -l 100000
> > 
> > Before the patchset:
> > 
> > 	Total time: 91.015 [sec]
> > 
> > 	     910.157300 usecs/op
> > 		   1098 ops/sec
> > 
> > After this patchset applied:
> > 
> > 	Total time: 43.706 [sec]
> > 
> > 	     437.062080 usecs/op
> > 		   2288 ops/sec
> 
> This does a lot better here, even if it isn't exactly stellar
> performance. It generates a LOT of data:
> 
> root@nehalem:/dev/shm # time perf lock rec -fg ls
> perf.data  perf.data.old
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 137.224 MB perf.data (~5995421
> samples) ]



Doh, 137 MB for a single ls :)

That said, we don't yet have support for callchains in perf lock,
and callchains can fill the buffer quickly, especially on lock
events. You can drop the -g option for now.


> 
> real    0m3.320s
> user    0m0.000s
> sys     0m3.220s
> 
> Without -g, it has 1.688s real and 1.590s sys time.


Ok.


> So while this is orders of magnitude better than the previous patchset,
> it's still not anywhere near lean. But I expect you know that, just
> consider this a 'I tested it and this is what happened' report :-)


Ok, thanks a lot. The fact that you can test on a 64-thread box is
extremely helpful.

I also wonder what happens with this patch applied:

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 98fd360..254b3d4 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -3094,7 +3094,8 @@ static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
        if (event->parent)
                event = event->parent;
 
-       return task_pid_nr_ns(p, event->ns);
+       return p->pid;
 }

On my box it gave a further 2x speedup on top of this patchset.

I wonder if the tool becomes usable for you with that.
Otherwise, it means we have other things to fix, and
the result of:

	perf record -g -f perf lock record sleep 6
	perf report

would be very nice to have.

Thanks!
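
For reference, the figures quoted in this thread can be cross-checked with
a few lines of arithmetic. This is only a rough sketch: the helper name
per_op_stats is made up for illustration, the inputs are the rounded totals
reported above, and the 100000 operation count is taken from the -l argument
of perf sched pipe.

```python
def per_op_stats(total_sec, ops):
    """Return (usecs per op, ops per second) for a run of `ops` operations."""
    usecs_per_op = total_sec * 1e6 / ops
    ops_per_sec = ops / total_sec
    return usecs_per_op, ops_per_sec

OPS = 100_000  # from: perf sched pipe -l 100000

# Totals reported in the cover letter (rounded to milliseconds there).
before_usecs, before_ops = per_op_stats(91.015, OPS)
after_usecs, after_ops = per_op_stats(43.706, OPS)

# Speedup of the patchset on the cover letter's machine.
patchset_speedup = 91.015 / 43.706

# Cost of callchains (-g) in the "perf lock rec ls" run: real time
# with -g divided by real time without it.
callchain_ratio = 3.320 / 1.688

print(f"before: {before_usecs:.2f} usecs/op, {int(before_ops)} ops/sec")
print(f"after:  {after_usecs:.2f} usecs/op, {int(after_ops)} ops/sec")
print(f"patchset speedup: {patchset_speedup:.2f}x")
print(f"-g vs no -g:      {callchain_ratio:.2f}x")
```

So the patchset roughly halves the per-operation cost in the pipe
benchmark, and dropping -g roughly halves the wall time of the ls run.
Neither number says anything about the effect of the perf_event_tid()
change above, which was not measured in these runs.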



