From: Li Zefan <lizf@cn.fujitsu.com>
To: Ian Munsie <imunsie@au1.ibm.com>
Cc: linux-kernel@vger.kernel.org,
Steven Rostedt <rostedt@goodmis.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Ingo Molnar <mingo@redhat.com>,
Masami Hiramatsu <mhiramat@redhat.com>,
Jiri Olsa <jolsa@redhat.com>, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH v2] ftrace: record command lines at more appropriate moment
Date: Thu, 29 Jul 2010 11:01:53 +0800 [thread overview]
Message-ID: <4C50EF21.1020202@cn.fujitsu.com> (raw)
In-Reply-To: <1280319410-18120-1-git-send-email-imunsie@au1.ibm.com>
Ian Munsie wrote:
> From: Ian Munsie <imunsie@au1.ibm.com>
>
> Previously, when tracing was activated through debugfs, regardless of
> which tracing plugin (if any) were activated, the probe_sched_switch and
> probe_sched_wakeup probes from the sched_switch plugin would be
> activated. This appears to have been a hack to use them to record the
> command lines of active processes as they were scheduled.
>
> That approach would suffer if many processes were being scheduled that
> were not generating events as they would consume entries in the
> saved_cmdlines buffer that could otherwise have been used by other
> processes that were actually generating events.
>
> It also had the problem that events could be mis-attributed - in the
> common situation of a process forking then execing a new process, the
> change of the process command would not be noticed for some time after
> the exec until the process was next scheduled.
>
> If the trace was read after the fact this would generally go unnoticed
> because at some point the process would be scheduled and the entry in
> the saved_cmdlines buffer would be updated so that the new command would
> be reported when the trace was eventually read. However, if the events
> were being read live (e.g. through trace_pipe), the events just after
> the exec and before the process was next scheduled would show the
> incorrect command (though the PID would be correct).
>
> This patch removes the sched_switch hack altogether and instead records
> the commands at a more appropriate moment - when a new trace event is
> committed onto the ftrace ring buffer. This means that the recorded
> command line is much more likely to be correct when the trace is read,
> either live or after the fact, so long as the command line still resides
> in the saved_cmdlines buffer.
>
> It is still not guaranteed to be correct in all situations. For instance
> if the trace is read after the fact rather than live (consider events
> generated by a process before an exec - in the below example they would
> be attributed to sleep rather than stealpid since the entry in
> saved_cmdlines would have changed before the event was read), but this
> is no different to the current situation and the alternative would be to
> store the command line with each and every event.
>
> terminal 1: grep '\-12345' /sys/kernel/debug/tracing/trace_pipe
> terminal 2: ./stealpid 12345 `which sleep` 0.1
>
> Before:
> stealpid-12345 [003] 86.001826: sys_clone -> 0x0
> stealpid-12345 [003] 86.002013: compat_sys_execve(ufilename: ffaaabef, argv: ffaaa7ec, envp: ffaaa7f8)
> stealpid-12345 [002] 86.002292: sys_restart_syscall -> 0x0
> stealpid-12345 [002] 86.002336: sys_brk(brk: 0)
> stealpid-12345 [002] 86.002338: sys_brk -> 0x1007a000
> ...
> stealpid-12345 [002] 86.002582: sys_mmap(addr: 0, len: 1000, prot: 3, flags: 22, fd: ffffffff, offset: 0)
> stealpid-12345 [002] 86.002586: sys_mmap -> 0xf7c21000
> sleep-12345 [002] 86.002771: sys_mprotect(start: ffe8000, len: 4000, prot: 1)
> sleep-12345 [002] 86.002780: sys_mprotect -> 0x0
> ...
>
> After:
> stealpid-12345 [003] 1368.823626: sys_clone -> 0x0
> stealpid-12345 [003] 1368.823820: compat_sys_execve(ufilename: fffa6bef, argv: fffa5afc, envp: fffa5b08)
> sleep-12345 [002] 1368.824125: sys_restart_syscall -> 0x0
> sleep-12345 [002] 1368.824173: sys_brk(brk: 0)
> sleep-12345 [002] 1368.824175: sys_brk -> 0x104ae000
> ...
>
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
I've tested your patch using lmbench(ctx):
Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host OS 2p/0K 2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
(trace-off)
Linux 2.6.35- 2.1300 2.2100 2.0800 2.5900 2.1400 2.59000 2.19000
Linux 2.6.35- 2.1400 2.2000 2.0800 2.6000 2.0900 2.56000 2.15000
(all events on)
Linux 2.6.35- 2.8000 2.9600 2.7200 3.2500 2.8200 3.24000 2.98000
Linux 2.6.35- 2.7100 2.6900 2.7300 3.2200 2.8500 3.25000 2.79000
(all events on without cmdline-recording)
Linux 2.6.35- 2.6100 2.6900 2.5800 3.0300 2.5800 3.04000 2.67000
Linux 2.6.35- 2.5800 2.5900 2.5600 3.0300 2.6600 3.04000 2.61000
(your patch applied)
Linux 2.6.35- 2.7100 2.8000 2.7200 3.2100 2.8400 3.24000 2.82000
Linux 2.6.35- 2.6600 2.8400 2.6900 3.1900 2.7600 3.27000 2.78000
So with your patch applied, the performance is still worse than just disabling
cmdline-recording.
The performance may be worse if I choose some other benchmarks.
I'd suggest another approch, that we add a tracepiont in set_task_comm()
to record cmdname, and that's how perf does.
> ---
>
> Changes since v1 addressing feedback from Li Zefan:
> * Inline trace_save_cmdline and tracing_record_cmdline for a marginal speed
> gain when recording command lines.
> * Move call to tracing_record_cmdline from tracing_generic_entry_update to
> __trace_buffer_unlock_commit to avoid the overhead when using perf or if the
> event was filtered out.
>
> kernel/trace/trace.c | 7 +++----
> kernel/trace/trace_events.c | 11 -----------
> kernel/trace/trace_functions.c | 2 --
> kernel/trace/trace_functions_graph.c | 2 --
> kernel/trace/trace_sched_switch.c | 10 ----------
> 5 files changed, 3 insertions(+), 29 deletions(-)
next prev parent reply other threads:[~2010-07-29 2:57 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-07-28 2:29 [PATCH 1/2] ftrace: record command lines at more appropriate moment Ian Munsie
2010-07-28 2:29 ` [PATCH 2/2] Revert "tracing: Allow to disable cmdline recording" Ian Munsie
2010-07-28 2:55 ` [PATCH 1/2] ftrace: record command lines at more appropriate moment Li Zefan
2010-07-28 11:44 ` Ian Munsie
2010-07-28 12:16 ` [PATCH v2] " Ian Munsie
2010-07-29 2:50 ` Frederic Weisbecker
2010-07-29 10:26 ` Ian Munsie
2010-07-29 3:01 ` Li Zefan [this message]
2010-07-29 3:04 ` Frederic Weisbecker
2010-07-29 1:58 ` [PATCH 1/2] " Frederic Weisbecker
2010-07-29 10:30 ` Ian Munsie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C50EF21.1020202@cn.fujitsu.com \
--to=lizf@cn.fujitsu.com \
--cc=fweisbec@gmail.com \
--cc=imunsie@au1.ibm.com \
--cc=jolsa@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mhiramat@redhat.com \
--cc=mingo@redhat.com \
--cc=rostedt@goodmis.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox