From: Ingo Molnar <mingo@elte.hu>
To: Li Zefan <lizf@cn.fujitsu.com>
Cc: Jens Axboe <jens.axboe@oracle.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2] blktrace: fix output of BLK_TC_PC events
Date: Wed, 1 Apr 2009 13:00:16 +0200 [thread overview]
Message-ID: <20090401110016.GA15442@elte.hu> (raw)
In-Reply-To: <49D32387.9040106@cn.fujitsu.com>
* Li Zefan <lizf@cn.fujitsu.com> wrote:
> BLK_TC_PC events should be treated differently with BLK_TC_FS events.
>
> Before this patch:
>
> # echo 1 > /sys/block/sda/sda1/trace/enable
> # echo pc > /sys/block/sda/sda1/trace/act_mask
> # echo blk > /debugfs/tracing/current_tracer
> # (generate some BLK_TC_PC events)
> # cat trace
> bash-2184 [000] 1774.275413: 8,7 I N [bash]
> bash-2184 [000] 1774.275435: 8,7 D N [bash]
> bash-2184 [000] 1774.275540: 8,7 I R [bash]
> bash-2184 [000] 1774.275547: 8,7 D R [bash]
> ksoftirqd/0-4 [000] 1774.275580: 8,7 C N 0 [0]
> bash-2184 [000] 1774.275648: 8,7 I R [bash]
> bash-2184 [000] 1774.275653: 8,7 D R [bash]
> ksoftirqd/0-4 [000] 1774.275682: 8,7 C N 0 [0]
> bash-2184 [000] 1774.275739: 8,7 I R [bash]
> bash-2184 [000] 1774.275744: 8,7 D R [bash]
> ksoftirqd/0-4 [000] 1774.275771: 8,7 C N 0 [0]
> bash-2184 [000] 1774.275804: 8,7 I R [bash]
> bash-2184 [000] 1774.275808: 8,7 D R [bash]
> ksoftirqd/0-4 [000] 1774.275836: 8,7 C N 0 [0]
>
> After this patch:
>
> # cat trace
> bash-2263 [000] 366.782149: 8,7 I N 0 (00 ..) [bash]
> bash-2263 [000] 366.782323: 8,7 D N 0 (00 ..) [bash]
> bash-2263 [000] 366.782557: 8,7 I R 8 (25 00 ..) [bash]
> bash-2263 [000] 366.782560: 8,7 D R 8 (25 00 ..) [bash]
> ksoftirqd/0-4 [000] 366.782582: 8,7 C N (25 00 ..) [0]
> bash-2263 [000] 366.782648: 8,7 I R 8 (5a 00 3f 00) [bash]
> bash-2263 [000] 366.782650: 8,7 D R 8 (5a 00 3f 00) [bash]
> ksoftirqd/0-4 [000] 366.782669: 8,7 C N (5a 00 3f 00) [0]
> bash-2263 [000] 366.782710: 8,7 I R 8 (5a 00 08 00) [bash]
> bash-2263 [000] 366.782713: 8,7 D R 8 (5a 00 08 00) [bash]
> ksoftirqd/0-4 [000] 366.782730: 8,7 C N (5a 00 08 00) [0]
> bash-2263 [000] 366.783375: 8,7 I R 36 (5a 00 08 00) [bash]
> bash-2263 [000] 366.783379: 8,7 D R 36 (5a 00 08 00) [bash]
> ksoftirqd/0-4 [000] 366.783404: 8,7 C N (5a 00 08 00) [0]
>
> This is what we do with PC events in user-space blktrace.
>
> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
> ---
> kernel/trace/blktrace.c | 88 ++++++++++++++++++++++++++++++++++++++++++----
> 1 files changed, 80 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
> index 464a6e5..fd1ff49 100644
> --- a/kernel/trace/blktrace.c
> +++ b/kernel/trace/blktrace.c
> @@ -970,6 +970,16 @@ static inline const void *pdu_start(const struct trace_entry *ent)
> return te_blk_io_trace(ent) + 1;
> }
>
> +static inline u32 t_action(const struct trace_entry *ent)
> +{
> + return te_blk_io_trace(ent)->action;
> +}
> +
> +static inline u32 t_bytes(const struct trace_entry *ent)
> +{
> + return te_blk_io_trace(ent)->bytes;
> +}
> +
> static inline u32 t_sec(const struct trace_entry *ent)
> {
> return te_blk_io_trace(ent)->bytes >> 9;
> @@ -1030,25 +1040,87 @@ static int blk_log_action(struct trace_iterator *iter, const char *act)
> MAJOR(t->device), MINOR(t->device), act, rwbs);
> }
>
> +static int blk_log_dump_pdu(struct trace_seq *s, const struct trace_entry *ent)
> +{
> + const char *pdu_buf;
> + int pdu_len;
> + int i, end, ret;
> +
> + pdu_buf = pdu_start(ent);
> + pdu_len = te_blk_io_trace(ent)->pdu_len;
> +
> + if (!pdu_len)
> + return 1;
> +
> + /* find the last zero that needs to be printed */
> + for (end = pdu_len - 1; end >= 0; end--)
> + if (pdu_buf[end])
> + break;
> + end++;
> +
> + if (!trace_seq_putc(s, '('))
> + return 0;
> +
> + for (i = 0; i < pdu_len; i++) {
> +
> + ret = trace_seq_printf(s, "%s%02x",
> + i == 0 ? "" : " ", pdu_buf[i]);
> + if (!ret)
> + return ret;
> +
> + /*
> + * stop when the rest is just zeroes and indicate so
> + * with a ".." appended
> + */
> + if (i == end && end != pdu_len - 1)
> + return trace_seq_puts(s, " ..) ");
> + }
> +
> + return trace_seq_puts(s, ") ");
> +}
> +
> static int blk_log_generic(struct trace_seq *s, const struct trace_entry *ent)
> {
> char cmd[TASK_COMM_LEN];
>
> trace_find_cmdline(ent->pid, cmd);
>
> - if (t_sec(ent))
> - return trace_seq_printf(s, "%llu + %u [%s]\n",
> - t_sector(ent), t_sec(ent), cmd);
> - return trace_seq_printf(s, "[%s]\n", cmd);
> + if (t_action(ent) & BLK_TC_ACT(BLK_TC_PC)) {
> + int ret;
> +
> + ret = trace_seq_printf(s, "%u ", t_bytes(ent));
> + if (!ret)
> + return 0;
> + ret = blk_log_dump_pdu(s, ent);
> + if (!ret)
> + return 0;
> + return trace_seq_printf(s, "[%s]\n", cmd);
> + } else {
> + if (t_sec(ent))
> + return trace_seq_printf(s, "%llu + %u [%s]\n",
> + t_sector(ent), t_sec(ent), cmd);
> + return trace_seq_printf(s, "[%s]\n", cmd);
> + }
> }
>
> static int blk_log_with_error(struct trace_seq *s,
> const struct trace_entry *ent)
> {
> - if (t_sec(ent))
> - return trace_seq_printf(s, "%llu + %u [%d]\n", t_sector(ent),
> - t_sec(ent), t_error(ent));
> - return trace_seq_printf(s, "%llu [%d]\n", t_sector(ent), t_error(ent));
> + if (t_action(ent) & BLK_TC_ACT(BLK_TC_PC)) {
> + int ret;
> +
> + ret = blk_log_dump_pdu(s, ent);
> + if (ret)
> + return trace_seq_printf(s, "[%d]\n", t_error(ent));
> + return 0;
> + } else {
> + if (t_sec(ent))
> + return trace_seq_printf(s, "%llu + %u [%d]\n",
> + t_sector(ent),
> + t_sec(ent), t_error(ent));
> + return trace_seq_printf(s, "%llu [%d]\n",
> + t_sector(ent), t_error(ent));
> + }
> }
>
> static int blk_log_remap(struct trace_seq *s, const struct trace_entry *ent)
Neat shortcut - packet command requests can be quite large in
special cases, right?
Sidenote. Right now we reserve the full packet length in the trace
buffer, then copy the full packet into the trace buffer:
if (pdu_len)
memcpy((void *) t + sizeof(*t), pdu_data, pdu_len);
... just to scan it a second time during output and discard much of
it later on if it has a lot of trailing zeroes.
It might make sense to pre-compress: by extending the trace record
format with a trailing-zeroes field.
But that would make it more complex and key to tracers is simplicity
of data acquisition.
Dunno, probably not worth it, except if data stream bloat becomes a
real issue (which i'd not expect them to become in this case).
Anyway - Jens, the two patches look good to me, do they look fine to
you too?
Ingo
next prev parent reply other threads:[~2009-04-01 11:00 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-04-01 8:18 [PATCH 1/2] blktrace: fix output of unknown events Li Zefan
2009-04-01 8:19 ` [PATCH 2/2] blktrace: fix output of BLK_TC_PC events Li Zefan
2009-04-01 11:00 ` Ingo Molnar [this message]
2009-04-01 12:52 ` Arnaldo Carvalho de Melo
2009-04-11 8:15 ` Li Zefan
2009-04-12 13:07 ` Ingo Molnar
2009-04-12 13:10 ` Jens Axboe
2009-04-12 13:37 ` Ingo Molnar
2009-04-13 0:43 ` Li Zefan
2009-04-13 0:40 ` Li Zefan
2009-04-13 22:22 ` [tip:tracing/core] " tip-bot for Li Zefan
2009-04-01 12:46 ` [PATCH 1/2] blktrace: fix output of unknown events Arnaldo Carvalho de Melo
2009-04-13 22:22 ` [tip:tracing/core] " tip-bot for Li Zefan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20090401110016.GA15442@elte.hu \
--to=mingo@elte.hu \
--cc=acme@redhat.com \
--cc=fweisbec@gmail.com \
--cc=jens.axboe@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lizf@cn.fujitsu.com \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox