From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: [for-next][PATCH 6/7] tracing: Move trace_handle_return() out of line
Date: Tue, 28 Mar 2017 18:16:51 -0400 [thread overview]
Message-ID: <20170328221729.723524143@goodmis.org> (raw)
In-Reply-To: 20170328221645.326712684@goodmis.org
[-- Attachment #1: 0006-tracing-Move-trace_handle_return-out-of-line.patch --]
[-- Type: text/plain, Size: 6189 bytes --]
From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Currently trace_handle_return() looks like this:
static inline enum print_line_t trace_handle_return(struct trace_seq *s)
{
return trace_seq_has_overflowed(s) ?
TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
}
Where trace_seq_overflowed(s) is:
static inline bool trace_seq_has_overflowed(struct trace_seq *s)
{
return s->full || seq_buf_has_overflowed(&s->seq);
}
And seq_buf_has_overflowed(&s->seq) is:
static inline bool
seq_buf_has_overflowed(struct seq_buf *s)
{
return s->len > s->size;
}
Making trace_handle_return() into:
return (s->full || (s->seq->len > s->seq->size)) ?
TRACE_TYPE_PARTIAL_LINE :
TRACE_TYPE_HANDLED;
One would think this is not an issue to keep as an inline. But because this
is used in the TRACE_EVENT() macro, it is extended for every tracepoint in
the system. Taking a look at a single tracepoint x86_irq_vector (was the
first one I randomly chosen). As trace_handle_return is used in the
TRACE_EVENT() macro of trace_raw_output_##call() we disassemble
trace_raw_output_x86_irq_vector and do a diff:
- is the original
+ is the out-of-line code
I removed identical lines that were different just due to different
addresses.
--- /tmp/irq-vec-orig 2017-03-16 09:12:48.569384851 -0400
+++ /tmp/irq-vec-ool 2017-03-16 09:13:39.378153385 -0400
@@ -6,27 +6,23 @@
53 push %rbx
48 89 fb mov %rdi,%rbx
4c 8b a7 c0 20 00 00 mov 0x20c0(%rdi),%r12
e8 f7 72 13 00 callq ffffffff81155c80 <trace_raw_output_prep>
83 f8 01 cmp $0x1,%eax
74 05 je ffffffff8101e993 <trace_raw_output_x86_irq_vector+0x23>
5b pop %rbx
41 5c pop %r12
5d pop %rbp
c3 retq
41 8b 54 24 08 mov 0x8(%r12),%edx
- 48 8d bb 98 10 00 00 lea 0x1098(%rbx),%rdi
+ 48 81 c3 98 10 00 00 add $0x1098,%rbx
- 48 c7 c6 7b 8a a0 81 mov $0xffffffff81a08a7b,%rsi
+ 48 c7 c6 ab 8a a0 81 mov $0xffffffff81a08aab,%rsi
- e8 c5 85 13 00 callq ffffffff81156f70 <trace_seq_printf>
=== here's the start of the main difference ===
+ 48 89 df mov %rbx,%rdi
+ e8 62 7e 13 00 callq ffffffff81156810 <trace_seq_printf>
- 8b 93 b8 20 00 00 mov 0x20b8(%rbx),%edx
- 31 c0 xor %eax,%eax
- 85 d2 test %edx,%edx
- 75 11 jne ffffffff8101e9c8 <trace_raw_output_x86_irq_vector+0x58>
- 48 8b 83 a8 20 00 00 mov 0x20a8(%rbx),%rax
- 48 39 83 a0 20 00 00 cmp %rax,0x20a0(%rbx)
- 0f 93 c0 setae %al
+ 48 89 df mov %rbx,%rdi
+ e8 4a c5 12 00 callq ffffffff8114af00 <trace_handle_return>
5b pop %rbx
- 0f b6 c0 movzbl %al,%eax
=== end ===
41 5c pop %r12
5d pop %rbp
c3 retq
If you notice, the original has 22 bytes of text more than the out of line
version. As this is for every TRACE_EVENT() defined in the system, this can
become quite large.
text data bss dec hex filename
8690305 5450490 1298432 15439227 eb957b vmlinux-orig
8681725 5450490 1298432 15430647 eb73f7 vmlinux-handle
This change has a total of 8580 bytes in savings.
$ objdump -dr /tmp/vmlinux-orig | grep '^[0-9a-f]* <trace_raw_output' | wc -l
324
That's 324 tracepoints. But this does not include modules (which contain
many more tracepoints). For an allyesconfig build:
$ objdump -dr vmlinux-allyes-orig | grep '^[0-9a-f]* <trace_raw_output' | wc -l
1401
That's 1401 tracepoints giving us:
text data bss dec hex filename
137920629 140221067 53264384 331406080 13c0db00 vmlinux-allyes-orig
137827709 140221067 53264384 331313160 13bf7008 vmlinux-allyes-handle
92920 bytes in savings!!!
Link: http://lkml.kernel.org/r/20170315021431.13107-2-andi@firstfloor.org
Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
include/linux/trace_events.h | 11 +----------
kernel/trace/trace.c | 12 ++++++++++++
2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 0af63c4381b9..a556805eff8a 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -138,16 +138,7 @@ enum print_line_t {
TRACE_TYPE_NO_CONSUME = 3 /* Handled but ask to not consume */
};
-/*
- * Several functions return TRACE_TYPE_PARTIAL_LINE if the trace_seq
- * overflowed, and TRACE_TYPE_HANDLED otherwise. This helper function
- * simplifies those functions and keeps them in sync.
- */
-static inline enum print_line_t trace_handle_return(struct trace_seq *s)
-{
- return trace_seq_has_overflowed(s) ?
- TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
-}
+enum print_line_t trace_handle_return(struct trace_seq *s);
void tracing_generic_entry_update(struct trace_entry *entry,
unsigned long flags,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4fa8e8f3c765..b5d4b80f2d45 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1998,6 +1998,18 @@ void tracing_record_cmdline(struct task_struct *tsk)
__this_cpu_write(trace_cmdline_save, false);
}
+/*
+ * Several functions return TRACE_TYPE_PARTIAL_LINE if the trace_seq
+ * overflowed, and TRACE_TYPE_HANDLED otherwise. This helper function
+ * simplifies those functions and keeps them in sync.
+ */
+enum print_line_t trace_handle_return(struct trace_seq *s)
+{
+ return trace_seq_has_overflowed(s) ?
+ TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
+}
+EXPORT_SYMBOL_GPL(trace_handle_return);
+
void
tracing_generic_entry_update(struct trace_entry *entry, unsigned long flags,
int pc)
--
2.10.2
next prev parent reply other threads:[~2017-03-28 22:17 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-28 22:16 [for-next][PATCH 0/7] tracing: [4.12] Allow function tracing to start earlier in boot up Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 1/7] tracing: Split tracing initialization into two for early initialization Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 2/7] ftrace: Move ftrace_init() to right after memory initialization Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 3/7] tracing: Postpone tracer start-up tests till the system is more robust Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 4/7] ftrace: Have function tracing start in early boot up Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 5/7] ftrace: Allow for function tracing to record init functions on " Steven Rostedt
2017-03-28 22:16 ` Steven Rostedt
2017-03-28 22:16 ` Steven Rostedt [this message]
2017-03-28 22:16 ` [for-next][PATCH 7/7] ftrace/x86: Do no run CPU sync when there is only one CPU online Steven Rostedt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170328221729.723524143@goodmis.org \
--to=rostedt@goodmis.org \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.