From: Li Zefan <lizf@cn.fujitsu.com>
To: Ian Munsie <imunsie@au1.ibm.com>
Cc: linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Ingo Molnar <mingo@redhat.com>,
Masami Hiramatsu <mhiramat@redhat.com>,
Jiri Olsa <jolsa@redhat.com>, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v2] ftrace: record command lines at more appropriate moment
Date: Thu, 29 Jul 2010 11:01:53 +0800 [thread overview]
Message-ID: <4C50EF21.1020202@cn.fujitsu.com> (raw)
In-Reply-To: <1280319410-18120-1-git-send-email-imunsie@au1.ibm.com>
Ian Munsie wrote:
> From: Ian Munsie <imunsie@au1.ibm.com>
>
> Previously, when tracing was activated through debugfs, regardless of
> which tracing plugin (if any) were activated, the probe_sched_switch and
> probe_sched_wakeup probes from the sched_switch plugin would be
> activated. This appears to have been a hack to use them to record the
> command lines of active processes as they were scheduled.
>
> That approach would suffer if many processes were being scheduled that
> were not generating events as they would consume entries in the
> saved_cmdlines buffer that could otherwise have been used by other
> processes that were actually generating events.
>
> It also had the problem that events could be mis-attributed - in the
> common situation of a process forking then execing a new process, the
> change of the process command would not be noticed for some time after
> the exec until the process was next scheduled.
>
> If the trace was read after the fact this would generally go unnoticed
> because at some point the process would be scheduled and the entry in
> the saved_cmdlines buffer would be updated so that the new command would
> be reported when the trace was eventually read. However, if the events
> were being read live (e.g. through trace_pipe), the events just after
> the exec and before the process was next scheduled would show the
> incorrect command (though the PID would be correct).
>
> This patch removes the sched_switch hack altogether and instead records
> the commands at a more appropriate moment - when a new trace event is
> committed onto the ftrace ring buffer. This means that the recorded
> command line is much more likely to be correct when the trace is read,
> either live or after the fact, so long as the command line still resides
> in the saved_cmdlines buffer.
>
> It is still not guaranteed to be correct in all situations. For instance
> if the trace is read after the fact rather than live (consider events
> generated by a process before an exec - in the below example they would
> be attributed to sleep rather than stealpid since the entry in
> saved_cmdlines would have changed before the event was read), but this
> is no different to the current situation and the alternative would be to
> store the command line with each and every event.
>
> terminal 1: grep '\-12345' /sys/kernel/debug/tracing/trace_pipe
> terminal 2: ./stealpid 12345 `which sleep` 0.1
>
> Before:
> stealpid-12345 [003] 86.001826: sys_clone -> 0x0
> stealpid-12345 [003] 86.002013: compat_sys_execve(ufilename: ffaaabef, argv: ffaaa7ec, envp: ffaaa7f8)
> stealpid-12345 [002] 86.002292: sys_restart_syscall -> 0x0
> stealpid-12345 [002] 86.002336: sys_brk(brk: 0)
> stealpid-12345 [002] 86.002338: sys_brk -> 0x1007a000
> ...
> stealpid-12345 [002] 86.002582: sys_mmap(addr: 0, len: 1000, prot: 3, flags: 22, fd: ffffffff, offset: 0)
> stealpid-12345 [002] 86.002586: sys_mmap -> 0xf7c21000
> sleep-12345 [002] 86.002771: sys_mprotect(start: ffe8000, len: 4000, prot: 1)
> sleep-12345 [002] 86.002780: sys_mprotect -> 0x0
> ...
>
> After:
> stealpid-12345 [003] 1368.823626: sys_clone -> 0x0
> stealpid-12345 [003] 1368.823820: compat_sys_execve(ufilename: fffa6bef, argv: fffa5afc, envp: fffa5b08)
> sleep-12345 [002] 1368.824125: sys_restart_syscall -> 0x0
> sleep-12345 [002] 1368.824173: sys_brk(brk: 0)
> sleep-12345 [002] 1368.824175: sys_brk -> 0x104ae000
> ...
>
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
I've tested your patch using lmbench(ctx):
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
(trace-off)
Linux 2.6.35- 2.1300 2.2100 2.0800 2.5900 2.1400 2.59000 2.19000
Linux 2.6.35- 2.1400 2.2000 2.0800 2.6000 2.0900 2.56000 2.15000
(all events on)
Linux 2.6.35- 2.8000 2.9600 2.7200 3.2500 2.8200 3.24000 2.98000
Linux 2.6.35- 2.7100 2.6900 2.7300 3.2200 2.8500 3.25000 2.79000
(all events on without cmdline-recording)
Linux 2.6.35- 2.6100 2.6900 2.5800 3.0300 2.5800 3.04000 2.67000
Linux 2.6.35- 2.5800 2.5900 2.5600 3.0300 2.6600 3.04000 2.61000
(your patch applied)
Linux 2.6.35- 2.7100 2.8000 2.7200 3.2100 2.8400 3.24000 2.82000
Linux 2.6.35- 2.6600 2.8400 2.6900 3.1900 2.7600 3.27000 2.78000
So with your patch applied, the performance is still worse than just disabling
cmdline-recording.
The performance may be worse if I choose some other benchmarks.
I'd suggest another approch, that we add a tracepiont in set_task_comm()
to record cmdname, and that's how perf does.
> ---
>
> Changes since v1 addressing feedback from Li Zefan:
> * Inline trace_save_cmdline and tracing_record_cmdline for a marginal speed
> gain when recording command lines.
> * Move call to tracing_record_cmdline from tracing_generic_entry_update to
> __trace_buffer_unlock_commit to avoid the overhead when using perf or if the
> event was filtered out.
>
> kernel/trace/trace.c | 7 +++----
> kernel/trace/trace_events.c | 11 -----------
> kernel/trace/trace_functions.c | 2 --
> kernel/trace/trace_functions_graph.c | 2 --
> kernel/trace/trace_sched_switch.c | 10 ----------
> 5 files changed, 3 insertions(+), 29 deletions(-)
next prev parent reply other threads:[~2010-07-29 2:57 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-28 2:29 [PATCH 1/2] ftrace: record command lines at more appropriate moment Ian Munsie
2010-07-28 2:29 ` [PATCH 2/2] Revert "tracing: Allow to disable cmdline recording" Ian Munsie
2010-07-28 2:55 ` [PATCH 1/2] ftrace: record command lines at more appropriate moment Li Zefan
2010-07-28 11:44 ` Ian Munsie
2010-07-28 12:16 ` [PATCH v2] " Ian Munsie
2010-07-29 2:50 ` Frederic Weisbecker
2010-07-29 10:26 ` Ian Munsie
2010-07-29 3:01 ` Li Zefan [this message]
2010-07-29 3:04 ` Frederic Weisbecker
2010-07-29 1:58 ` [PATCH 1/2] " Frederic Weisbecker
2010-07-29 10:30 ` Ian Munsie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C50EF21.1020202@cn.fujitsu.com \
--to=lizf@cn.fujitsu.com \
--cc=fweisbec@gmail.com \
--cc=imunsie@au1.ibm.com \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mhiramat@redhat.com \
--cc=mingo@redhat.com \
--cc=rostedt@goodmis.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.