All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: [for-next][PATCH 6/7] tracing: Move trace_handle_return() out of line
Date: Tue, 28 Mar 2017 18:16:51 -0400	[thread overview]
Message-ID: <20170328221729.723524143@goodmis.org> (raw)
In-Reply-To: 20170328221645.326712684@goodmis.org

[-- Attachment #1: 0006-tracing-Move-trace_handle_return-out-of-line.patch --]
[-- Type: text/plain, Size: 6189 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Currently trace_handle_return() looks like this:

 static inline enum print_line_t trace_handle_return(struct trace_seq *s)
 {
        return trace_seq_has_overflowed(s) ?
                TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
 }

Where trace_seq_overflowed(s) is:

 static inline bool trace_seq_has_overflowed(struct trace_seq *s)
 {
	return s->full || seq_buf_has_overflowed(&s->seq);
 }

And seq_buf_has_overflowed(&s->seq) is:

 static inline bool
 seq_buf_has_overflowed(struct seq_buf *s)
 {
	return s->len > s->size;
 }

Making trace_handle_return() into:

 return (s->full || (s->seq->len > s->seq->size)) ?
           TRACE_TYPE_PARTIAL_LINE :
           TRACE_TYPE_HANDLED;

One would think this is not an issue to keep as an inline. But because this
is used in the TRACE_EVENT() macro, it is extended for every tracepoint in
the system. Taking a look at a single tracepoint x86_irq_vector (was the
first one I randomly chosen). As trace_handle_return is used in the
TRACE_EVENT() macro of trace_raw_output_##call() we disassemble
trace_raw_output_x86_irq_vector and do a diff:

- is the original
+ is the out-of-line code

I removed identical lines that were different just due to different
addresses.

--- /tmp/irq-vec-orig	2017-03-16 09:12:48.569384851 -0400
+++ /tmp/irq-vec-ool	2017-03-16 09:13:39.378153385 -0400
@@ -6,27 +6,23 @@
        53                      push   %rbx
        48 89 fb                mov    %rdi,%rbx
        4c 8b a7 c0 20 00 00    mov    0x20c0(%rdi),%r12
        e8 f7 72 13 00          callq  ffffffff81155c80 <trace_raw_output_prep>
        83 f8 01                cmp    $0x1,%eax
        74 05                   je     ffffffff8101e993 <trace_raw_output_x86_irq_vector+0x23>
        5b                      pop    %rbx
        41 5c                   pop    %r12
        5d                      pop    %rbp
        c3                      retq
        41 8b 54 24 08          mov    0x8(%r12),%edx
-       48 8d bb 98 10 00 00    lea    0x1098(%rbx),%rdi
+       48 81 c3 98 10 00 00    add    $0x1098,%rbx
-       48 c7 c6 7b 8a a0 81    mov    $0xffffffff81a08a7b,%rsi
+       48 c7 c6 ab 8a a0 81    mov    $0xffffffff81a08aab,%rsi
-       e8 c5 85 13 00          callq  ffffffff81156f70 <trace_seq_printf>

 === here's the start of the main difference ===

+       48 89 df                mov    %rbx,%rdi
+       e8 62 7e 13 00          callq  ffffffff81156810 <trace_seq_printf>
-       8b 93 b8 20 00 00       mov    0x20b8(%rbx),%edx
-       31 c0                   xor    %eax,%eax
-       85 d2                   test   %edx,%edx
-       75 11                   jne    ffffffff8101e9c8 <trace_raw_output_x86_irq_vector+0x58>
-       48 8b 83 a8 20 00 00    mov    0x20a8(%rbx),%rax
-       48 39 83 a0 20 00 00    cmp    %rax,0x20a0(%rbx)
-       0f 93 c0                setae  %al
+       48 89 df                mov    %rbx,%rdi
+       e8 4a c5 12 00          callq  ffffffff8114af00 <trace_handle_return>
        5b                      pop    %rbx
-       0f b6 c0                movzbl %al,%eax

 === end ===

        41 5c                   pop    %r12
        5d                      pop    %rbp
        c3                      retq

If you notice, the original has 22 bytes of text more than the out of line
version. As this is for every TRACE_EVENT() defined in the system, this can
become quite large.

   text	   data	    bss	    dec	    hex	filename
8690305	5450490	1298432	15439227	 eb957b	vmlinux-orig
8681725	5450490	1298432	15430647	 eb73f7	vmlinux-handle

This change has a total of 8580 bytes in savings.

 $ objdump -dr /tmp/vmlinux-orig | grep '^[0-9a-f]* <trace_raw_output' | wc -l
324

That's 324 tracepoints. But this does not include modules (which contain
many more tracepoints). For an allyesconfig build:

 $ objdump -dr vmlinux-allyes-orig | grep '^[0-9a-f]* <trace_raw_output' | wc -l
1401

That's 1401 tracepoints giving us:

   text    data     bss     dec     hex filename
137920629       140221067       53264384        331406080       13c0db00 vmlinux-allyes-orig
137827709       140221067       53264384        331313160       13bf7008 vmlinux-allyes-handle

92920 bytes in savings!!!

Link: http://lkml.kernel.org/r/20170315021431.13107-2-andi@firstfloor.org

Reported-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/trace_events.h | 11 +----------
 kernel/trace/trace.c         | 12 ++++++++++++
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 0af63c4381b9..a556805eff8a 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -138,16 +138,7 @@ enum print_line_t {
 	TRACE_TYPE_NO_CONSUME	= 3	/* Handled but ask to not consume */
 };
 
-/*
- * Several functions return TRACE_TYPE_PARTIAL_LINE if the trace_seq
- * overflowed, and TRACE_TYPE_HANDLED otherwise. This helper function
- * simplifies those functions and keeps them in sync.
- */
-static inline enum print_line_t trace_handle_return(struct trace_seq *s)
-{
-	return trace_seq_has_overflowed(s) ?
-		TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
-}
+enum print_line_t trace_handle_return(struct trace_seq *s);
 
 void tracing_generic_entry_update(struct trace_entry *entry,
 				  unsigned long flags,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 4fa8e8f3c765..b5d4b80f2d45 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -1998,6 +1998,18 @@ void tracing_record_cmdline(struct task_struct *tsk)
 		__this_cpu_write(trace_cmdline_save, false);
 }
 
+/*
+ * Several functions return TRACE_TYPE_PARTIAL_LINE if the trace_seq
+ * overflowed, and TRACE_TYPE_HANDLED otherwise. This helper function
+ * simplifies those functions and keeps them in sync.
+ */
+enum print_line_t trace_handle_return(struct trace_seq *s)
+{
+	return trace_seq_has_overflowed(s) ?
+		TRACE_TYPE_PARTIAL_LINE : TRACE_TYPE_HANDLED;
+}
+EXPORT_SYMBOL_GPL(trace_handle_return);
+
 void
 tracing_generic_entry_update(struct trace_entry *entry, unsigned long flags,
 			     int pc)
-- 
2.10.2

  parent reply	other threads:[~2017-03-28 22:17 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-28 22:16 [for-next][PATCH 0/7] tracing: [4.12] Allow function tracing to start earlier in boot up Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 1/7] tracing: Split tracing initialization into two for early initialization Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 2/7] ftrace: Move ftrace_init() to right after memory initialization Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 3/7] tracing: Postpone tracer start-up tests till the system is more robust Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 4/7] ftrace: Have function tracing start in early boot up Steven Rostedt
2017-03-28 22:16 ` [for-next][PATCH 5/7] ftrace: Allow for function tracing to record init functions on " Steven Rostedt
2017-03-28 22:16   ` Steven Rostedt
2017-03-28 22:16 ` Steven Rostedt [this message]
2017-03-28 22:16 ` [for-next][PATCH 7/7] ftrace/x86: Do no run CPU sync when there is only one CPU online Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170328221729.723524143@goodmis.org \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.