* Q: perf log mode?
@ 2011-01-12 14:06 Kirill Smelkov
2011-01-12 14:08 ` Peter Zijlstra
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Kirill Smelkov @ 2011-01-12 14:06 UTC (permalink / raw)
To: linux-kernel
Cc: Arnaldo Carvalho de Melo, Frederic Weisbecker, Ingo Molnar,
Mike Galbraith, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
Tom Zanussi
Hello up there,
I'm trying to use perf together with e.g. kprobes as a tool to show what
is happening with my system in "live-log" mode. The problem is, for
seldom events, actual info output is largely delayed because perf reads
sample data in whole pages. Could something be done with it or am I'm
missing something? Here is detailed description:
Say I'm interested in ICMP packet processing, so I (1) install
kprobe in icmp_rcv:
$ perf probe -L icmp_rcv | grep -U8 '^ *43\>'
goto error;
}
38 if (!pskb_pull(skb, sizeof(*icmph)))
goto error;
icmph = icmp_hdr(skb);
43 ICMPMSGIN_INC_STATS_BH(net, icmph->type);
/*
* 18 is the highest 'known' ICMP type. Anything else is a mystery
*
* RFC 1122: 3.2.2 Unknown ICMP messages types MUST be silently
* discarded.
*/
50 if (icmph->type > NR_ICMP_TYPES)
goto error;
$ perf probe icmp_rcv:43 'type=icmph->type'
(2) Write/generate processing script for it:
$ cat trace-icmp.py
[...]
def trace_begin():
print "in trace_begin"
def trace_end():
print "in trace_end"
def probe__icmp_rcv(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
__probe_ip, type):
print_header(event_name, common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
print "__probe_ip=%u, type=%u\n" % \
(__probe_ip, type),
[...]
then (3) try to see events log (this is the best variant I could come up
with up to now):
$ perf record -a -R -c 1 -m 1 -e probe:icmp_rcv -o - | \
perf script -i - -s trace-icmp.py
As I wrote in the introduction, most of the time there is no output, and
seldom a bunch of events comes in one go, e.g.
probe__icmp_rcv 0 16443.348252154 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16450.348146741 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16455.348151997 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16456.348148274 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16458.348144398 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16461.348152021 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16464.348140741 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16470.348143846 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16472.348144669 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16475.348151825 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16478.348140674 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16479.348150712 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16481.348147982 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16485.348147250 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16486.348150900 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16491.348154918 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 0 16497.348155844 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 1 16446.348146828 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 1 16447.348153243 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 1 16449.348151942 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 1 16454.348148286 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 1 16462.348159458 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 1 16465.348151748 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16442.349363280 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16448.348142037 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16451.348153582 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16453.348147189 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16460.348146440 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16466.348144503 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16473.348156748 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16474.348151359 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16487.348151515 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16488.348151928 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16489.348151897 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16492.348148233 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 2 16498.348153051 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16444.348149546 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16445.348146280 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16452.348138865 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16457.348151554 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16459.348149756 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16463.348150932 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16467.348146132 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16468.348140904 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16469.348143768 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16471.348148436 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16476.348145084 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16477.348146054 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16480.348144285 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16482.348144761 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16483.348148015 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16484.348147443 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16490.348141855 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16493.348152237 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16494.348140068 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16495.348158109 0 swapper __probe_ip=3240938970, type=0
probe__icmp_rcv 3 16496.348127986 0 swapper __probe_ip=3240938970, type=0
Judging from timestamps, the events were coming with 1-2 seconds period
(ping <host> running) and for e.g. first event we got 96-43=53 seconds
delay (which is happening already at `perf record` stage -- I've
monitored it's raw output).
Is it somehow possible to display events instantly as they are coming?
Thanks,
Kirill
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Q: perf log mode?
2011-01-12 14:06 Q: perf log mode? Kirill Smelkov
@ 2011-01-12 14:08 ` Peter Zijlstra
2011-01-12 14:42 ` Frederic Weisbecker
2011-01-12 15:21 ` Arnaldo Carvalho de Melo
2011-01-14 7:25 ` [tip:perf/urgent] perf record: Add "nodelay" mode, disabled by default tip-bot for Kirill Smelkov
2 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2011-01-12 14:08 UTC (permalink / raw)
To: Kirill Smelkov
Cc: linux-kernel, Arnaldo Carvalho de Melo, Frederic Weisbecker,
Ingo Molnar, Mike Galbraith, Paul Mackerras, Stephane Eranian,
Tom Zanussi
On Wed, 2011-01-12 at 17:06 +0300, Kirill Smelkov wrote:
> I'm trying to use perf together with e.g. kprobes as a tool to show what
> is happening with my system in "live-log" mode. The problem is, for
> seldom events, actual info output is largely delayed because perf reads
> sample data in whole pages. Could something be done with it or am I'm
> missing something? Here is detailed description:
perf_event_attr = {
.watermark = 0,
.wakeup_events = 1,
};
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Q: perf log mode?
2011-01-12 14:08 ` Peter Zijlstra
@ 2011-01-12 14:42 ` Frederic Weisbecker
2011-01-12 14:53 ` Peter Zijlstra
2011-01-12 14:59 ` Kirill Smelkov
0 siblings, 2 replies; 13+ messages in thread
From: Frederic Weisbecker @ 2011-01-12 14:42 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Kirill Smelkov, linux-kernel, Arnaldo Carvalho de Melo,
Ingo Molnar, Mike Galbraith, Paul Mackerras, Stephane Eranian,
Tom Zanussi
On Wed, Jan 12, 2011 at 03:08:08PM +0100, Peter Zijlstra wrote:
> On Wed, 2011-01-12 at 17:06 +0300, Kirill Smelkov wrote:
> > I'm trying to use perf together with e.g. kprobes as a tool to show what
> > is happening with my system in "live-log" mode. The problem is, for
> > seldom events, actual info output is largely delayed because perf reads
> > sample data in whole pages. Could something be done with it or am I'm
> > missing something? Here is detailed description:
>
> perf_event_attr = {
> .watermark = 0,
> .wakeup_events = 1,
> };
Which is perhaps something we want as a default when perf record -c 1
and the output is the pipe mode.
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Q: perf log mode?
2011-01-12 14:42 ` Frederic Weisbecker
@ 2011-01-12 14:53 ` Peter Zijlstra
2011-01-12 15:10 ` Frederic Weisbecker
2011-01-12 14:59 ` Kirill Smelkov
1 sibling, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2011-01-12 14:53 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Kirill Smelkov, linux-kernel, Arnaldo Carvalho de Melo,
Ingo Molnar, Mike Galbraith, Paul Mackerras, Stephane Eranian,
Tom Zanussi
On Wed, 2011-01-12 at 15:42 +0100, Frederic Weisbecker wrote:
> On Wed, Jan 12, 2011 at 03:08:08PM +0100, Peter Zijlstra wrote:
> > On Wed, 2011-01-12 at 17:06 +0300, Kirill Smelkov wrote:
> > > I'm trying to use perf together with e.g. kprobes as a tool to show what
> > > is happening with my system in "live-log" mode. The problem is, for
> > > seldom events, actual info output is largely delayed because perf reads
> > > sample data in whole pages. Could something be done with it or am I'm
> > > missing something? Here is detailed description:
> >
> > perf_event_attr = {
> > .watermark = 0,
> > .wakeup_events = 1,
> > };
>
>
> Which is perhaps something we want as a default when perf record -c 1
> and the output is the pipe mode.
No, definitely not, esp for -c1 you want large buffers because the event
can come at very high freq.
Nor does pipe mode have anything to do with it, the whole script set-up
plain stinks and should not be using pipe mode, pipe mode should only be
used to pipe data over the network and other remote profiling like
things.
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Q: perf log mode?
2011-01-12 14:53 ` Peter Zijlstra
@ 2011-01-12 15:10 ` Frederic Weisbecker
0 siblings, 0 replies; 13+ messages in thread
From: Frederic Weisbecker @ 2011-01-12 15:10 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Kirill Smelkov, linux-kernel, Arnaldo Carvalho de Melo,
Ingo Molnar, Mike Galbraith, Paul Mackerras, Stephane Eranian,
Tom Zanussi
On Wed, Jan 12, 2011 at 03:53:41PM +0100, Peter Zijlstra wrote:
> On Wed, 2011-01-12 at 15:42 +0100, Frederic Weisbecker wrote:
> > On Wed, Jan 12, 2011 at 03:08:08PM +0100, Peter Zijlstra wrote:
> > > On Wed, 2011-01-12 at 17:06 +0300, Kirill Smelkov wrote:
> > > > I'm trying to use perf together with e.g. kprobes as a tool to show what
> > > > is happening with my system in "live-log" mode. The problem is, for
> > > > seldom events, actual info output is largely delayed because perf reads
> > > > sample data in whole pages. Could something be done with it or am I'm
> > > > missing something? Here is detailed description:
> > >
> > > perf_event_attr = {
> > > .watermark = 0,
> > > .wakeup_events = 1,
> > > };
> >
> >
> > Which is perhaps something we want as a default when perf record -c 1
> > and the output is the pipe mode.
>
> No, definitely not, esp for -c1 you want large buffers because the event
> can come at very high freq.
Right this should not be a default actually if we have high freq events
we don't want wake up at each of them.
> Nor does pipe mode have anything to do with it, the whole script set-up
> plain stinks
Oh? What's wrong with the scripts infrastructure?
> and should not be using pipe mode, pipe mode should only be
> used to pipe data over the network and other remote profiling like
> things.
But the scripts are just an endpoint over many other possibilities,
regardless of what is in the middle: perf.data or pipe streaming.
Where is a limitation on using scripts here?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Q: perf log mode?
2011-01-12 14:42 ` Frederic Weisbecker
2011-01-12 14:53 ` Peter Zijlstra
@ 2011-01-12 14:59 ` Kirill Smelkov
2011-01-12 15:02 ` Peter Zijlstra
1 sibling, 1 reply; 13+ messages in thread
From: Kirill Smelkov @ 2011-01-12 14:59 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Peter Zijlstra, linux-kernel, Arnaldo Carvalho de Melo,
Ingo Molnar, Mike Galbraith, Paul Mackerras, Stephane Eranian,
Tom Zanussi
On Wed, Jan 12, 2011 at 03:42:43PM +0100, Frederic Weisbecker wrote:
> On Wed, Jan 12, 2011 at 03:08:08PM +0100, Peter Zijlstra wrote:
> > On Wed, 2011-01-12 at 17:06 +0300, Kirill Smelkov wrote:
> > > I'm trying to use perf together with e.g. kprobes as a tool to show what
> > > is happening with my system in "live-log" mode. The problem is, for
> > > seldom events, actual info output is largely delayed because perf reads
> > > sample data in whole pages. Could something be done with it or am I'm
> > > missing something? Here is detailed description:
> >
> > perf_event_attr = {
> > .watermark = 0,
> > .wakeup_events = 1,
> > };
Ah, thanks for pointing it out!
> Which is perhaps something we want as a default when perf record -c 1
> and the output is the pipe mode.
Maybe not always, because as I understand it, tracepoints are handled
with -c 1, but there could be lots of them, and in such cases buffering
helps.
Anyway, maybe something like this could be useful?
---- 8< ----
From: Kirill Smelkov <kirr@mns.spb.ru>
Date: Wed, 12 Jan 2011 17:43:16 +0300
Subject: [PATCH] perf record: Add support for "nodelay" mode
Sometimes there is a need to use perf in "live-log" mode. The problem
is, for seldom events, actual info output is largely delayed because
perf-record reads sample data in whole pages.
So for such scenarious, add flag for perf-record to go in "nodelay"
mode. To track e.g. what's going on in icmp_rcv while ping is running
Use it with something like this:
(1) $ perf probe -L icmp_rcv | grep -U8 '^ *43\>'
goto error;
}
38 if (!pskb_pull(skb, sizeof(*icmph)))
goto error;
icmph = icmp_hdr(skb);
43 ICMPMSGIN_INC_STATS_BH(net, icmph->type);
/*
* 18 is the highest 'known' ICMP type. Anything else is a mystery
*
* RFC 1122: 3.2.2 Unknown ICMP messages types MUST be silently
* discarded.
*/
50 if (icmph->type > NR_ICMP_TYPES)
goto error;
$ perf probe icmp_rcv:43 'type=icmph->type'
(2) $ cat trace-icmp.py
[...]
def trace_begin():
print "in trace_begin"
def trace_end():
print "in trace_end"
def probe__icmp_rcv(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
__probe_ip, type):
print_header(event_name, common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
print "__probe_ip=%u, type=%u\n" % \
(__probe_ip, type),
[...]
(3) $ perf record -a -D -e probe:icmp_rcv -o - | \
perf script -i - -s trace-icmp.py
Thanks to Peter Zijlstra for pointing how to do it.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20110112140613.GA11698@tugrik.mns.mnsspb.ru>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
---
tools/perf/Documentation/perf-record.txt | 3 +++
tools/perf/builtin-record.c | 8 ++++++++
2 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 52462ae..e032716 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -61,6 +61,9 @@ OPTIONS
-r::
--realtime=::
Collect data with this RT SCHED_FIFO priority.
+-D::
+--no-delay::
+ Collect data without buffering.
-A::
--append::
Append to the output file to do incremental profiling.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1210e64..df6064a 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -49,6 +49,7 @@ static int pipe_output = 0;
static const char *output_name = "perf.data";
static int group = 0;
static int realtime_prio = 0;
+static bool nodelay = false;
static bool raw_samples = false;
static bool sample_id_all_avail = true;
static bool system_wide = false;
@@ -307,6 +308,11 @@ static void create_counter(struct perf_evsel *evsel, int cpu)
attr->sample_type |= PERF_SAMPLE_CPU;
}
+ if (nodelay) {
+ attr->watermark = 0;
+ attr->wakeup_events = 1;
+ }
+
attr->mmap = track;
attr->comm = track;
attr->inherit = !no_inherit;
@@ -843,6 +849,8 @@ const struct option record_options[] = {
"record events on existing thread id"),
OPT_INTEGER('r', "realtime", &realtime_prio,
"collect data with this RT SCHED_FIFO priority"),
+ OPT_BOOLEAN('D', "no-delay", &nodelay,
+ "collect data without buffering"),
OPT_BOOLEAN('R', "raw-samples", &raw_samples,
"collect raw sample records from all opened counters"),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
--
1.7.4.rc1.7.g2cf08
^ permalink raw reply related [flat|nested] 13+ messages in thread* Re: Q: perf log mode?
2011-01-12 14:59 ` Kirill Smelkov
@ 2011-01-12 15:02 ` Peter Zijlstra
2011-01-12 16:30 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2011-01-12 15:02 UTC (permalink / raw)
To: Kirill Smelkov
Cc: Frederic Weisbecker, linux-kernel, Arnaldo Carvalho de Melo,
Ingo Molnar, Mike Galbraith, Paul Mackerras, Stephane Eranian,
Tom Zanussi
On Wed, 2011-01-12 at 17:59 +0300, Kirill Smelkov wrote:
>
> Anyway, maybe something like this could be useful?
>
I guess that would be ok, but I'll leave that to Arnaldo.
Also note that Arnaldo is revamping the whole script interface to make
it possible to open and use the perf fds from the script language itself
(the only sane way), completely removing the dependency on perf-record.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Q: perf log mode?
2011-01-12 15:02 ` Peter Zijlstra
@ 2011-01-12 16:30 ` Arnaldo Carvalho de Melo
2011-01-13 8:26 ` Kirill Smelkov
0 siblings, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-01-12 16:30 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Kirill Smelkov, Frederic Weisbecker, linux-kernel, Ingo Molnar,
Mike Galbraith, Paul Mackerras, Stephane Eranian, Tom Zanussi
Em Wed, Jan 12, 2011 at 04:02:33PM +0100, Peter Zijlstra escreveu:
> On Wed, 2011-01-12 at 17:59 +0300, Kirill Smelkov wrote:
> >
> > Anyway, maybe something like this could be useful?
> >
> I guess that would be ok, but I'll leave that to Arnaldo.
It is OK, I'll apply it to my perf/core, for the next merge window.
> Also note that Arnaldo is revamping the whole script interface to make
> it possible to open and use the perf fds from the script language itself
> (the only sane way), completely removing the dependency on perf-record.
Yep, I'm librarising the ad-hoc counter creation routines we have in
stat, top and record, factoring the common parts, etc, and creating
'perf test' regression test routines for these new library functions.
After I finish the mmap bits, which should be soon, I'll start work on
having a python binding for these routines and experiment plugging it to
some other projects I worked on.
I'll also work on converting a few of the scripts we have to provide
examples of use of this python binding.
Other bindings and a C library can come later, after we settle on an API
using the python binding as the guinea pig.
Parts of this got into this merge window, for instance, perf stat uses
the perf_evsel api to open and read counters, and perf test has new
routines to stress them, new stuff I'm stashing at the perf/test branch
in:
http://git.kernel.org/?p=linux/kernel/git/acme/linux-2.6.git;a=shortlog;h=refs/heads/perf/test
- Arnaldo
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Q: perf log mode?
2011-01-12 16:30 ` Arnaldo Carvalho de Melo
@ 2011-01-13 8:26 ` Kirill Smelkov
0 siblings, 0 replies; 13+ messages in thread
From: Kirill Smelkov @ 2011-01-13 8:26 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Peter Zijlstra, Frederic Weisbecker, linux-kernel, Ingo Molnar,
Mike Galbraith, Paul Mackerras, Stephane Eranian, Tom Zanussi
On Wed, Jan 12, 2011 at 02:30:22PM -0200, Arnaldo Carvalho de Melo wrote:
> Em Wed, Jan 12, 2011 at 04:02:33PM +0100, Peter Zijlstra escreveu:
> > On Wed, 2011-01-12 at 17:59 +0300, Kirill Smelkov wrote:
> > >
> > > Anyway, maybe something like this could be useful?
> > >
> > I guess that would be ok, but I'll leave that to Arnaldo.
>
> It is OK, I'll apply it to my perf/core, for the next merge window.
Thanks
> > Also note that Arnaldo is revamping the whole script interface to make
> > it possible to open and use the perf fds from the script language itself
> > (the only sane way), completely removing the dependency on perf-record.
>
> Yep, I'm librarising the ad-hoc counter creation routines we have in
> stat, top and record, factoring the common parts, etc, and creating
> 'perf test' regression test routines for these new library functions.
>
> After I finish the mmap bits, which should be soon, I'll start work on
> having a python binding for these routines and experiment plugging it to
> some other projects I worked on.
>
> I'll also work on converting a few of the scripts we have to provide
> examples of use of this python binding.
>
> Other bindings and a C library can come later, after we settle on an API
> using the python binding as the guinea pig.
>
> Parts of this got into this merge window, for instance, perf stat uses
> the perf_evsel api to open and read counters, and perf test has new
> routines to stress them, new stuff I'm stashing at the perf/test branch
> in:
>
> http://git.kernel.org/?p=linux/kernel/git/acme/linux-2.6.git;a=shortlog;h=refs/heads/perf/test
Thanks for heads up and good luck with libification,
Kirill
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Q: perf log mode?
2011-01-12 14:06 Q: perf log mode? Kirill Smelkov
2011-01-12 14:08 ` Peter Zijlstra
@ 2011-01-12 15:21 ` Arnaldo Carvalho de Melo
2011-01-12 16:31 ` Kirill Smelkov
2011-01-14 7:25 ` [tip:perf/urgent] perf record: Add "nodelay" mode, disabled by default tip-bot for Kirill Smelkov
2 siblings, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-01-12 15:21 UTC (permalink / raw)
To: Kirill Smelkov
Cc: linux-kernel, Frederic Weisbecker, Ingo Molnar, Mike Galbraith,
Paul Mackerras, Peter Zijlstra, Stephane Eranian, Tom Zanussi
Em Wed, Jan 12, 2011 at 05:06:13PM +0300, Kirill Smelkov escreveu:
> I'm trying to use perf together with e.g. kprobes as a tool to show what
> is happening with my system in "live-log" mode. The problem is, for
> seldom events, actual info output is largely delayed because perf reads
> sample data in whole pages. Could something be done with it or am I'm
> missing something? Here is detailed description:
<SNIP>
>
> Judging from timestamps, the events were coming with 1-2 seconds period
> (ping <host> running) and for e.g. first event we got 96-43=53 seconds
> delay (which is happening already at `perf record` stage -- I've
> monitored it's raw output).
>
> Is it somehow possible to display events instantly as they are coming?
Look at how perf top works:
while (1) {
int hits = samples;
perf_session__mmap_read(session);
if (hits == samples)
ret = poll(evsel_list->pollfd, evsel_list->nr_fds, 100);
}
It will only call poll if no events were found so there should be no
delays.
perf script instead uses perf_session__process_events() and that, as
requested by 'perf script' queues events to make sure they are ordered,
so probably the problem is related to this latency, probably we'll need
to use some timer to flush this queue more often.
Try tweaking:
.ordering_requires_timestamps = true,
.ordered_samples = true,
in builtin-script.c, if you disable those there should be no queueing
and we'll test this theory.
Also try checking if there isn't any buffering in action because it uses
pipes to connect the record with the python script by printing the event
as it comes in the record part.
- Arnaldo
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Q: perf log mode?
2011-01-12 15:21 ` Arnaldo Carvalho de Melo
@ 2011-01-12 16:31 ` Kirill Smelkov
2011-01-12 17:07 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 13+ messages in thread
From: Kirill Smelkov @ 2011-01-12 16:31 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: linux-kernel, Frederic Weisbecker, Ingo Molnar, Mike Galbraith,
Paul Mackerras, Peter Zijlstra, Stephane Eranian, Tom Zanussi
On Wed, Jan 12, 2011 at 01:21:12PM -0200, Arnaldo Carvalho de Melo wrote:
> Em Wed, Jan 12, 2011 at 05:06:13PM +0300, Kirill Smelkov escreveu:
> > I'm trying to use perf together with e.g. kprobes as a tool to show what
> > is happening with my system in "live-log" mode. The problem is, for
> > seldom events, actual info output is largely delayed because perf reads
> > sample data in whole pages. Could something be done with it or am I'm
> > missing something? Here is detailed description:
>
> <SNIP>
> >
> > Judging from timestamps, the events were coming with 1-2 seconds period
> > (ping <host> running) and for e.g. first event we got 96-43=53 seconds
> > delay (which is happening already at `perf record` stage -- I've
> > monitored it's raw output).
> >
> > Is it somehow possible to display events instantly as they are coming?
>
> Look at how perf top works:
>
> while (1) {
> int hits = samples;
>
> perf_session__mmap_read(session);
>
> if (hits == samples)
> ret = poll(evsel_list->pollfd, evsel_list->nr_fds, 100);
> }
>
> It will only call poll if no events were found so there
> should be no delays.
^^^^^^^^^^^^^^^^^^^
Hmm, if I understand it correctly, initially right after start,
perf-record sees no events, goes to poll, and then we are sleeping there
until the kernel will wake up us. As Peter pointed it, there are
parameters which control wakeup rule, and for "wakeup immediately" we
have to tune them.
I'm maybe wrong here, since I have only brief picture about how things
work, but anyway...
> perf script instead uses perf_session__process_events() and that, as
> requested by 'perf script' queues events to make sure they are ordered,
> so probably the problem is related to this latency, probably we'll need
> to use some timer to flush this queue more often.
>
> Try tweaking:
>
> .ordering_requires_timestamps = true,
> .ordered_samples = true,
>
> in builtin-script.c, if you disable those there should be no queueing
> and we'll test this theory.
>
> Also try checking if there isn't any buffering in action because it uses
> pipes to connect the record with the python script by printing the event
> as it comes in the record part.
As I wrote in my first mail, maybe a bit not explicitly (sorry then), the
latency is introduced in perf-record. I'm telling this because I've
visually monitored perf-record output to console (yes, binary output,
but it does not matter) to see whether there is output progress at all.
After initial startup there was no progress for long time and then bunch
of data after 30-40 seconds.
And with the patch posted inline within reply-to-Peter mail, the delay
goes away, so could you please consider applying it?
Thanks,
Kirill
^ permalink raw reply [flat|nested] 13+ messages in thread* Re: Q: perf log mode?
2011-01-12 16:31 ` Kirill Smelkov
@ 2011-01-12 17:07 ` Arnaldo Carvalho de Melo
0 siblings, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2011-01-12 17:07 UTC (permalink / raw)
To: Kirill Smelkov
Cc: linux-kernel, Frederic Weisbecker, Ingo Molnar, Mike Galbraith,
Paul Mackerras, Peter Zijlstra, Stephane Eranian, Tom Zanussi
Em Wed, Jan 12, 2011 at 07:31:32PM +0300, Kirill Smelkov escreveu:
> As I wrote in my first mail, maybe a bit not explicitly (sorry then), the
> latency is introduced in perf-record. I'm telling this because I've
Ok, all is well and we'll tune the kernel to do it as we need using your
patch.
> visually monitored perf-record output to console (yes, binary output,
> but it does not matter) to see whether there is output progress at all.
>
> After initial startup there was no progress for long time and then bunch
> of data after 30-40 seconds.
>
>
> And with the patch posted inline within reply-to-Peter mail, the delay
> goes away, so could you please consider applying it?
I will.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [tip:perf/urgent] perf record: Add "nodelay" mode, disabled by default
2011-01-12 14:06 Q: perf log mode? Kirill Smelkov
2011-01-12 14:08 ` Peter Zijlstra
2011-01-12 15:21 ` Arnaldo Carvalho de Melo
@ 2011-01-14 7:25 ` tip-bot for Kirill Smelkov
2 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Kirill Smelkov @ 2011-01-14 7:25 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, eranian, paulus, acme, kirr, hpa, mingo, tzanussi,
peterz, efault, fweisbec, tglx
Commit-ID: acac03fa15a8684bb60489ed87b5aae5258c0838
Gitweb: http://git.kernel.org/tip/acac03fa15a8684bb60489ed87b5aae5258c0838
Author: Kirill Smelkov <kirr@mns.spb.ru>
AuthorDate: Wed, 12 Jan 2011 17:59:36 +0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 13 Jan 2011 11:38:44 -0200
perf record: Add "nodelay" mode, disabled by default
Sometimes there is a need to use perf in "live-log" mode. The problem
is, for seldom events, actual info output is largely delayed because
perf-record reads sample data in whole pages.
So for such scenarious, add flag for perf-record to go in "nodelay"
mode. To track e.g. what's going on in icmp_rcv while ping is running
Use it with something like this:
(1) $ perf probe -L icmp_rcv | grep -U8 '^ *43\>'
goto error;
}
38 if (!pskb_pull(skb, sizeof(*icmph)))
goto error;
icmph = icmp_hdr(skb);
43 ICMPMSGIN_INC_STATS_BH(net, icmph->type);
/*
* 18 is the highest 'known' ICMP type. Anything else is a mystery
*
* RFC 1122: 3.2.2 Unknown ICMP messages types MUST be silently
* discarded.
*/
50 if (icmph->type > NR_ICMP_TYPES)
goto error;
$ perf probe icmp_rcv:43 'type=icmph->type'
(2) $ cat trace-icmp.py
[...]
def trace_begin():
print "in trace_begin"
def trace_end():
print "in trace_end"
def probe__icmp_rcv(event_name, context, common_cpu,
common_secs, common_nsecs, common_pid, common_comm,
__probe_ip, type):
print_header(event_name, common_cpu, common_secs, common_nsecs,
common_pid, common_comm)
print "__probe_ip=%u, type=%u\n" % \
(__probe_ip, type),
[...]
(3) $ perf record -a -D -e probe:icmp_rcv -o - | \
perf script -i - -s trace-icmp.py
Thanks to Peter Zijlstra for pointing how to do it.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>, Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20110112140613.GA11698@tugrik.mns.mnsspb.ru>
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/perf-record.txt | 3 +++
tools/perf/builtin-record.c | 8 ++++++++
2 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 52462ae..e032716 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -61,6 +61,9 @@ OPTIONS
-r::
--realtime=::
Collect data with this RT SCHED_FIFO priority.
+-D::
+--no-delay::
+ Collect data without buffering.
-A::
--append::
Append to the output file to do incremental profiling.
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 1210e64..df6064a 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -49,6 +49,7 @@ static int pipe_output = 0;
static const char *output_name = "perf.data";
static int group = 0;
static int realtime_prio = 0;
+static bool nodelay = false;
static bool raw_samples = false;
static bool sample_id_all_avail = true;
static bool system_wide = false;
@@ -307,6 +308,11 @@ static void create_counter(struct perf_evsel *evsel, int cpu)
attr->sample_type |= PERF_SAMPLE_CPU;
}
+ if (nodelay) {
+ attr->watermark = 0;
+ attr->wakeup_events = 1;
+ }
+
attr->mmap = track;
attr->comm = track;
attr->inherit = !no_inherit;
@@ -843,6 +849,8 @@ const struct option record_options[] = {
"record events on existing thread id"),
OPT_INTEGER('r', "realtime", &realtime_prio,
"collect data with this RT SCHED_FIFO priority"),
+ OPT_BOOLEAN('D', "no-delay", &nodelay,
+ "collect data without buffering"),
OPT_BOOLEAN('R', "raw-samples", &raw_samples,
"collect raw sample records from all opened counters"),
OPT_BOOLEAN('a', "all-cpus", &system_wide,
^ permalink raw reply related [flat|nested] 13+ messages in thread
end of thread, other threads:[~2011-01-14 7:26 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-01-12 14:06 Q: perf log mode? Kirill Smelkov
2011-01-12 14:08 ` Peter Zijlstra
2011-01-12 14:42 ` Frederic Weisbecker
2011-01-12 14:53 ` Peter Zijlstra
2011-01-12 15:10 ` Frederic Weisbecker
2011-01-12 14:59 ` Kirill Smelkov
2011-01-12 15:02 ` Peter Zijlstra
2011-01-12 16:30 ` Arnaldo Carvalho de Melo
2011-01-13 8:26 ` Kirill Smelkov
2011-01-12 15:21 ` Arnaldo Carvalho de Melo
2011-01-12 16:31 ` Kirill Smelkov
2011-01-12 17:07 ` Arnaldo Carvalho de Melo
2011-01-14 7:25 ` [tip:perf/urgent] perf record: Add "nodelay" mode, disabled by default tip-bot for Kirill Smelkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox