From: Jan Kiszka <jan.kiszka@siemens.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: LKML <linux-kernel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Christoph Hellwig <hch@infradead.org>,
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
Gregory Haskins <ghaskins@novell.com>,
Arnaldo Carvalho de Melo <acme@ghostprotocols.net>,
Thomas Gleixner <tglx@linutronix.de>,
Tim Bird <tim.bird@am.sony.com>, Sam Ravnborg <sam@ravnborg.org>,
"Frank Ch. Eigler" <fche@redhat.com>,
Steven Rostedt <srostedt@redhat.com>,
Philippe Gerum <rpm@xenomai.org>
Subject: Re: [RFC PATCH 01/22 -v2] Add basic support for gcc profiler instrumentation
Date: Thu, 10 Jan 2008 19:19:37 +0100
Message-ID: <478661B9.7050406@siemens.com>
In-Reply-To: <20080109233042.386384204@goodmis.org>
Steven Rostedt wrote:
> Index: linux-compile-i386.git/Makefile
> ===================================================================
> --- linux-compile-i386.git.orig/Makefile 2008-01-09 14:09:36.000000000 -0500
> +++ linux-compile-i386.git/Makefile 2008-01-09 14:10:07.000000000 -0500
> @@ -509,6 +509,10 @@ endif
>
> include $(srctree)/arch/$(SRCARCH)/Makefile
>
> +# MCOUNT expects frame pointer
This comment looks stray.
> +ifdef CONFIG_MCOUNT
> +KBUILD_CFLAGS += -pg
> +endif
> ifdef CONFIG_FRAME_POINTER
> KBUILD_CFLAGS += -fno-omit-frame-pointer -fno-optimize-sibling-calls
> else
> Index: linux-compile-i386.git/arch/x86/Kconfig
> ===================================================================
> --- linux-compile-i386.git.orig/arch/x86/Kconfig 2008-01-09 14:09:36.000000000 -0500
> +++ linux-compile-i386.git/arch/x86/Kconfig 2008-01-09 14:10:07.000000000 -0500
> @@ -28,6 +28,10 @@ config GENERIC_CMOS_UPDATE
> bool
> default y
>
> +config ARCH_HAS_MCOUNT
> + bool
> + default y
> +
> config CLOCKSOURCE_WATCHDOG
> bool
> default y
> Index: linux-compile-i386.git/arch/x86/kernel/Makefile_32
> ===================================================================
> --- linux-compile-i386.git.orig/arch/x86/kernel/Makefile_32 2008-01-09 14:09:36.000000000 -0500
> +++ linux-compile-i386.git/arch/x86/kernel/Makefile_32 2008-01-09 14:10:07.000000000 -0500
> @@ -23,6 +23,7 @@ obj-$(CONFIG_APM) += apm_32.o
> obj-$(CONFIG_X86_SMP) += smp_32.o smpboot_32.o tsc_sync.o
> obj-$(CONFIG_SMP) += smpcommon_32.o
> obj-$(CONFIG_X86_TRAMPOLINE) += trampoline_32.o
> +obj-$(CONFIG_MCOUNT) += mcount-wrapper.o
So far the code organization differs between 32 and 64 bit. I would
suggest either
o moving both trampolines into entry_*.S or
o putting them in something like mcount-wrapper_32/64.S.
> obj-$(CONFIG_X86_MPPARSE) += mpparse_32.o
> obj-$(CONFIG_X86_LOCAL_APIC) += apic_32.o nmi_32.o
> obj-$(CONFIG_X86_IO_APIC) += io_apic_32.o
> Index: linux-compile-i386.git/arch/x86/kernel/mcount-wrapper.S
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-compile-i386.git/arch/x86/kernel/mcount-wrapper.S 2008-01-09 14:10:07.000000000 -0500
> @@ -0,0 +1,25 @@
> +/*
> + * linux/arch/x86/mcount-wrapper.S
> + *
> + * Copyright (C) 2004 Ingo Molnar
> + */
> +
> +.globl mcount
> +mcount:
> + cmpl $0, mcount_enabled
> + jz out
> +
> + push %ebp
> + mov %esp, %ebp
What is the benefit of having a call frame in this trampoline? We used
to carry this in the i386 mcount tracer for Adeos/I-pipe too (it was
derived from the -rt code), but I just successfully tested a patch that
removes it. glibc [1] doesn't set one up either.
> + pushl %eax
> + pushl %ecx
> + pushl %edx
> +
> + call __mcount
I think this indirection should be avoided, just like the 64-bit version
and glibc do.
> +
> + popl %edx
> + popl %ecx
> + popl %eax
> + popl %ebp
> +out:
> + ret
...
> Index: linux-compile-i386.git/lib/tracing/mcount.c
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-compile-i386.git/lib/tracing/mcount.c 2008-01-09 14:10:07.000000000 -0500
> @@ -0,0 +1,77 @@
> +/*
> + * Infrastructure for profiling code inserted by 'gcc -pg'.
> + *
> + * Copyright (C) 2007 Arnaldo Carvalho de Melo <acme@redhat.com>
> + *
> + * Converted to be more generic:
> + * Copyright (C) 2007-2008 Steven Rostedt <srostedt@redhat.com>
> + *
> + * From code in the latency_tracer, that is:
> + *
> + * Copyright (C) 2004-2006 Ingo Molnar
> + * Copyright (C) 2004 William Lee Irwin III
> + */
> +
> +#include <linux/module.h>
> +#include <linux/mcount.h>
> +
> +/*
> + * Since we have nothing protecting between the test of
> + * mcount_trace_function and the call to it, we can't
> + * set it to NULL without risking a race that will have
> + * the kernel call the NULL pointer. Instead, we just
> + * set the function pointer to a dummy function.
> + */
> +notrace void dummy_mcount_tracer(unsigned long ip,
> + unsigned long parent_ip)
> +{
> + /* do nothing */
> +}
> +
> +mcount_func_t mcount_trace_function __read_mostly = dummy_mcount_tracer;
> +int mcount_enabled __read_mostly;
> +
> +/** __mcount - hook for profiling
> + *
> + * This routine is called from the arch specific mcount routine, that in turn is
> + * called from code inserted by gcc -pg.
> + */
> +notrace void __mcount(void)
> +{
> + mcount_trace_function(CALLER_ADDR1, CALLER_ADDR2);
> +}
mcount_trace_function should always be called from the assembly
trampoline, IMO.
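For readers following along, here is a minimal user-space sketch of the
pattern under discussion: the hook pointer always refers to a valid
function (a no-op by default), and the entry point tests the enable flag
and then calls the hook directly, without an intermediate __mcount().
The names mirror the quoted patch, but mcount_hook(), counting_tracer()
and run_demo() are hypothetical helpers added for illustration. The
sketch is single-threaded and leaves out notrace and __read_mostly, so
it shows the structure only, not the race the comment in the patch
describes.

```c
#include <assert.h>

/* User-space stand-ins for the names used in the quoted patch. */
typedef void (*mcount_func_t)(unsigned long ip, unsigned long parent_ip);

static unsigned long calls_seen;

/* No-op default, so the hook pointer never has to be NULL. */
static void dummy_mcount_tracer(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
}

static mcount_func_t mcount_trace_function = dummy_mcount_tracer;
static int mcount_enabled;

/* Hypothetical tracer that only counts how often it is invoked. */
static void counting_tracer(unsigned long ip, unsigned long parent_ip)
{
    (void)ip;
    (void)parent_ip;
    calls_seen++;
}

static int register_mcount_function(mcount_func_t func)
{
    mcount_trace_function = func;
    return 0;
}

static void clear_mcount_function(void)
{
    /* Point back at the dummy instead of storing NULL. */
    mcount_trace_function = dummy_mcount_tracer;
}

/* What the trampoline does, written in C: test the flag, call the hook. */
static void mcount_hook(unsigned long ip, unsigned long parent_ip)
{
    if (!mcount_enabled)
        return;
    mcount_trace_function(ip, parent_ip);
}

/* Hypothetical self-check; returns how many calls the tracer counted. */
static unsigned long run_demo(void)
{
    mcount_hook(1, 2);                  /* disabled: ignored */
    mcount_enabled = 1;
    mcount_hook(1, 2);                  /* enabled, dummy tracer: no count */
    register_mcount_function(counting_tracer);
    mcount_hook(1, 2);                  /* counted */
    mcount_hook(3, 4);                  /* counted */
    clear_mcount_function();
    mcount_hook(5, 6);                  /* dummy again: not counted */
    assert(mcount_trace_function == dummy_mcount_tracer);
    return calls_seen;
}
```

Because the pointer is never NULL, the caller needs no NULL check between
loading the pointer and calling through it.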
> +EXPORT_SYMBOL_GPL(mcount);
> +/*
> + * The above EXPORT_SYMBOL is for the gcc call of mcount and not the
> + * function __mcount that it is underneath. I put the export there
> + * to fool checkpatch.pl. It wants that export to be with the
> + * function, but that function happens to be in assembly.
> + */
> +
> +/**
> + * register_mcount_function - register a function for profiling
> + * @func - the function for profiling.
> + *
> + * Register a function to be called by all functions in the
> + * kernel.
> + *
> + * Note: @func and all the functions it calls must be labeled
> + * with "notrace", otherwise it will go into a
> + * recursive loop.
> + */
> +int register_mcount_function(mcount_func_t func)
> +{
> + mcount_trace_function = func;
> + return 0;
> +}
> +
> +/**
> + * clear_mcount_function - reset the mcount function
> + *
> + * This NULLs the mcount function and in essence stops
> + * tracing. There may be lag
> + */
> +void clear_mcount_function(void)
> +{
> + mcount_trace_function = dummy_mcount_tracer;
> +}
> Index: linux-compile-i386.git/include/linux/mcount.h
> ===================================================================
> --- /dev/null 1970-01-01 00:00:00.000000000 +0000
> +++ linux-compile-i386.git/include/linux/mcount.h 2008-01-09 15:17:20.000000000 -0500
> @@ -0,0 +1,21 @@
> +#ifndef _LINUX_MCOUNT_H
> +#define _LINUX_MCOUNT_H
> +
> +#ifdef CONFIG_MCOUNT
> +extern int mcount_enabled;
> +
> +#include <linux/linkage.h>
> +
> +#define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))
> +#define CALLER_ADDR1 ((unsigned long)__builtin_return_address(1))
> +#define CALLER_ADDR2 ((unsigned long)__builtin_return_address(2))
Are these still used once __mcount is gone?
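As a small illustration of what these macros expand to, the sketch below
uses CALLER_ADDR0 in user space. It assumes GCC or Clang, since
__builtin_return_address() is a compiler builtin; where_was_i_called_from()
and demo_caller_addr() are hypothetical names. Only level 0 is used here:
the GCC documentation warns that non-zero levels can behave unpredictably,
and in practice they depend on an intact frame-pointer chain.

```c
#include <assert.h>

/* Same definition as CALLER_ADDR0 in the quoted header. */
#define CALLER_ADDR0 ((unsigned long)__builtin_return_address(0))

/* noinline, so the function has a return address of its own to report. */
__attribute__((noinline))
static unsigned long where_was_i_called_from(void)
{
    return CALLER_ADDR0;
}

/* Hypothetical self-check: the reported return address is never zero. */
static unsigned long demo_caller_addr(void)
{
    unsigned long addr = where_was_i_called_from();

    assert(addr != 0);
    return addr;
}
```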
> +
> +typedef void (*mcount_func_t)(unsigned long ip, unsigned long parent_ip);
> +
> +extern void mcount(void);
> +
> +int register_mcount_function(mcount_func_t func);
> +void clear_mcount_function(void);
> +
> +#endif /* CONFIG_MCOUNT */
> +#endif /* _LINUX_MCOUNT_H */
> Index: linux-compile-i386.git/arch/x86/kernel/entry_64.S
> ===================================================================
> --- linux-compile-i386.git.orig/arch/x86/kernel/entry_64.S 2008-01-09 14:09:36.000000000 -0500
> +++ linux-compile-i386.git/arch/x86/kernel/entry_64.S 2008-01-09 14:10:07.000000000 -0500
> @@ -53,6 +53,46 @@
>
> .code64
>
> +#ifdef CONFIG_MCOUNT
> +
> +ENTRY(mcount)
> + cmpl $0, mcount_enabled
> + jz out
> +
> + push %rbp
> + mov %rsp,%rbp
Same as for x86_32.
> +
> + push %r11
> + push %r10
glibc [2] doesn't save those two, and we have been fine without them so
far. Or are there nasty corner cases in the kernel?
> + push %r9
> + push %r8
> + push %rdi
> + push %rsi
> + push %rdx
> + push %rcx
> + push %rax
SAVE_ARGS/RESTORE_ARGS and glibc use explicit rsp manipulation + movq
instead of push/pop. I wonder whether that has a small advantage, but
I'm not that deep into this arch.
> +
> + mov 0x0(%rbp),%rax
> + mov 0x8(%rbp),%rdi
> + mov 0x8(%rax),%rsi
See [2] for saving one instruction here. :)
> +
> + call *mcount_trace_function
> +
> + pop %rax
> + pop %rcx
> + pop %rdx
> + pop %rsi
> + pop %rdi
> + pop %r8
> + pop %r9
> + pop %r10
> + pop %r11
> +
> + pop %rbp
> +out:
> + ret
> +#endif
> +
> #ifndef CONFIG_PREEMPT
> #define retint_kernel retint_restore_args
> #endif
This generic approach is much appreciated here as well. It would take
away the burden of maintaining the arch-dependent stubs within I-pipe.
What we could contribute later on is a Blackfin trampoline. There is
still a bug in their toolchain that breaks mcount for modules, but I
could check with the bfin guys again about the progress and underline
the importance of this long-pending issue.
Jan
[1]http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/i386/i386-mcount.S?rev=1.6&content-type=text/x-cvsweb-markup&cvsroot=glibc
[2]http://sources.redhat.com/cgi-bin/cvsweb.cgi/libc/sysdeps/x86_64/_mcount.S?rev=1.5&content-type=text/x-cvsweb-markup&cvsroot=glibc