From: Andi Kleen <andi@firstfloor.org>
To: rostedt@goodmis.org
Cc: Ingo Molnar <mingo@elte.hu>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Richard Guenther <richard.guenther@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	feng.tang@intel.com, Frédéric Weisbecker <fweisbec@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	jakub@redhat.com, gcc@gcc.gnu.org
Subject: [PATCH] gcc mcount-nofp was Re: BUG: GCC-4.4.x changes the function frame on some functions
Date: Fri, 20 Nov 2009 10:57:02 +0100
Message-ID: <87iqd54iz5.fsf_-_@basil.nowhere.org>
In-Reply-To: <1258657614.22249.824.camel@gandalf.stny.rr.com> (Steven Rostedt's message of "Thu, 19 Nov 2009 14:06:54 -0500")

Steven Rostedt <rostedt@goodmis.org> writes:
>
> And frame pointers do add a little overhead as well. Too bad the mcount
> ABI wasn't something like this:
>
>
> 	<function>:
> 		call	mcount
> 		[...]
>
> This way, the function address for mcount would have been (%esp) and the
> parent address would be 4(%esp). Mcount would work without frame
> pointers and this whole mess would also become moot.

I did a patch to do this in x86 gcc some time ago. The motivation
was indeed the frame pointer overhead on Atom with tracing.

Unfortunately it also requires glibc changes (I did those too). For
compatibility, and to catch mistakes, the new function is called
__mcount_nofp.
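
To make the layout Steven describes concrete, here is a rough sketch in
plain C. It is not taken from the gcc or glibc patches. It only models the
two words a hook would find on a 32-bit stack when it is called as the first
instruction of a function. The names hook_args and read_hook_args are made up
for illustration. Note that (%esp) holds the return address into the
instrumented function (just past the call), not its exact start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only: the two 32-bit words a hook called as the
   first instruction of a function would find on the stack.  */
struct hook_args {
  uint32_t self_addr;    /* return address into the instrumented function */
  uint32_t parent_addr;  /* return address into that function's caller */
};

/* SP points at the simulated top of stack at hook entry;
   NWORDS is the number of valid words starting at SP.  */
static struct hook_args
read_hook_args (const uint32_t *sp, size_t nwords)
{
  struct hook_args a;

  assert (sp != NULL && nwords >= 2);
  a.self_addr = sp[0];    /* (%esp)  */
  a.parent_addr = sp[1];  /* 4(%esp) */
  return a;
}
```

Neither word depends on %ebp, which is why no frame pointer is needed in
this scheme.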

I haven't tried it with a current gcc, and last time I missed the
gcc feature merge window with it.

But perhaps you'll find it useful. Of course the kernel would also
need changes to probe for the new option and handle it.
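
As a rough model of how the options are meant to interact in the patch
below (this is a standalone sketch, not the gcc code itself): the
before-prologue mode is only selected when -fomit-frame-pointer, -pg and
-mmcount-nofp are all given, and only then is the new hook name used.
select_profile_mode, struct profile_mode, uses_nofp_hook and DEFAULT_HOOK
are hypothetical names; DEFAULT_HOOK stands in for MCOUNT_NAME, whose real
value depends on the target.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for MCOUNT_NAME; the real value is target-dependent.  */
#define DEFAULT_HOOK "mcount"

struct profile_mode {
  bool before_prologue;  /* models targetm.profile_before_prologue */
  const char *hook;      /* symbol the profiling call would target */
};

/* Mirror the two decisions made in the patch: when the call is emitted
   before the prologue, and which symbol it calls.  */
static struct profile_mode
select_profile_mode (bool omit_fp, bool pg, bool mcount_nofp)
{
  struct profile_mode m;

  m.before_prologue = omit_fp && pg && mcount_nofp;
  m.hook = (m.before_prologue && mcount_nofp) ? "__mcount_nofp"
					       : DEFAULT_HOOK;
  return m;
}

/* True if M selects the frame-pointer-free hook.  */
static bool
uses_nofp_hook (const struct profile_mode *m)
{
  assert (m != NULL && m->hook != NULL);
  return strcmp (m->hook, "__mcount_nofp") == 0;
}
```

With any of the three flags missing, the model falls back to the
traditional hook called after the prologue.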

Here's the old patch. As noted, it is untested against current gcc,
but I think it still applies at least.

If there's interest I can polish it up and submit it formally.

-Andi


Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(revision 149140)
+++ gcc/doc/tm.texi	(working copy)
@@ -1884,6 +1884,12 @@
 of words in each data entry.
 @end defmac
 
+@defmac TARGET_FUNCTION_PROFILE
+Define if the target has a custom function_profiler function.
+The target should not set this macro; it is implicitly set from
+the PROFILE_BEFORE_PROLOGUE macro.
+@end defmac
+
 @node Registers
 @section Register Usage
 @cindex register usage
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 149140)
+++ gcc/doc/invoke.texi	(working copy)
@@ -593,7 +593,7 @@
 -momit-leaf-frame-pointer  -mno-red-zone -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model} -mabi=@var{name} @gol
 -m32  -m64 -mlarge-data-threshold=@var{num} @gol
--mfused-madd -mno-fused-madd -msse2avx}
+-mfused-madd -mno-fused-madd -msse2avx -mmcount-nofp}
 
 @emph{IA-64 Options}
 @gccoptlist{-mbig-endian  -mlittle-endian  -mgnu-as  -mgnu-ld  -mno-pic @gol
@@ -11749,6 +11749,11 @@
 @opindex msse2avx
 Specify that the assembler should encode SSE instructions with VEX
 prefix.  The option @option{-mavx} turns this on by default.
+
+@item -mmcount-nofp
+Don't force the frame pointer with @option{-pg} function profiling.
+Instead call a new @code{__mcount_nofp} function before a stack 
+frame is set up.
 @end table
 
 These @samp{-m} switches are supported in addition to the above
Index: gcc/target.h
===================================================================
--- gcc/target.h	(revision 149140)
+++ gcc/target.h	(working copy)
@@ -1132,6 +1132,9 @@
    */
   bool arm_eabi_unwinder;
 
+  /* True when the function profiler code is output before the prologue. */
+  bool profile_before_prologue;
+
   /* Leave the boolean fields at the end.  */
 };
 
Index: gcc/final.c
===================================================================
--- gcc/final.c	(revision 149140)
+++ gcc/final.c	(working copy)
@@ -1520,10 +1520,8 @@
 
   /* The Sun386i and perhaps other machines don't work right
      if the profiling code comes after the prologue.  */
-#ifdef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (targetm.profile_before_prologue && crtl->profile)
     profile_function (file);
-#endif /* PROFILE_BEFORE_PROLOGUE */
 
 #if defined (DWARF2_UNWIND_INFO) && defined (HAVE_prologue)
   if (dwarf2out_do_frame ())
@@ -1565,10 +1563,8 @@
 static void
 profile_after_prologue (FILE *file ATTRIBUTE_UNUSED)
 {
-#ifndef PROFILE_BEFORE_PROLOGUE
-  if (crtl->profile)
+  if (!targetm.profile_before_prologue && crtl->profile)
     profile_function (file);
-#endif /* not PROFILE_BEFORE_PROLOGUE */
 }
 
 static void
Index: gcc/gcc.c
===================================================================
--- gcc/gcc.c	(revision 149140)
+++ gcc/gcc.c	(working copy)
@@ -797,6 +797,12 @@
 # define SYSROOT_HEADERS_SUFFIX_SPEC ""
 #endif
 
+/* Target can override this to allow -pg/-fomit-frame-pointer together */
+#ifndef TARGET_PG_OPTION_SPEC
+#define TARGET_PG_OPTION_SPEC \
+"%{pg:%{fomit-frame-pointer:%e-pg and -fomit-frame-pointer are incompatible}}"
+#endif
+
 static const char *asm_debug;
 static const char *cpp_spec = CPP_SPEC;
 static const char *cc1_spec = CC1_SPEC;
@@ -866,8 +872,8 @@
 
 /* NB: This is shared amongst all front-ends, except for Ada.  */
 static const char *cc1_options =
-"%{pg:%{fomit-frame-pointer:%e-pg and -fomit-frame-pointer are incompatible}}\
- %1 %{!Q:-quiet} -dumpbase %B %{d*} %{m*} %{a*}\
+ TARGET_PG_OPTION_SPEC
+" %1 %{!Q:-quiet} -dumpbase %B %{d*} %{m*} %{a*}\
  %{fcompare-debug-second:%:compare-debug-auxbase-opt(%b)} \
  %{!fcompare-debug-second:%{c|S:%{o*:-auxbase-strip %*}%{!o*:-auxbase %b}}}%{!c:%{!S:-auxbase %b}} \
  %{g*} %{O*} %{W*&pedantic*} %{w} %{std*&ansi&trigraphs}\
Index: gcc/target-def.h
===================================================================
--- gcc/target-def.h	(revision 149140)
+++ gcc/target-def.h	(working copy)
@@ -841,6 +841,12 @@
     TARGET_OPTION_CAN_INLINE_P,			\
   }
 
+#ifdef PROFILE_BEFORE_PROLOGUE
+#define TARGET_FUNCTION_PROFILE true
+#else
+#define TARGET_FUNCTION_PROFILE false
+#endif
+
 /* The whole shebang.  */
 #define TARGET_INITIALIZER			\
 {						\
@@ -960,7 +966,8 @@
   TARGET_HANDLE_PRAGMA_REDEFINE_EXTNAME,	\
   TARGET_HANDLE_PRAGMA_EXTERN_PREFIX,		\
   TARGET_RELAXED_ORDERING,			\
-  TARGET_ARM_EABI_UNWINDER			\
+  TARGET_ARM_EABI_UNWINDER,			\
+  TARGET_FUNCTION_PROFILE			\
 }
 
 #define TARGET_HANDLE_C_OPTION default_handle_c_option
Index: gcc/config/i386/i386.h
===================================================================
--- gcc/config/i386/i386.h	(revision 149140)
+++ gcc/config/i386/i386.h	(working copy)
@@ -2528,6 +2528,9 @@
 #undef TARG_COND_NOT_TAKEN_BRANCH_COST
 #define TARG_COND_NOT_TAKEN_BRANCH_COST ix86_cost->cond_not_taken_branch_cost
 
+/* Allow -pg with -fomit-frame-pointer */
+#define TARGET_PG_OPTION_SPEC ""
+
 /*
 Local variables:
 version-control: t
Index: gcc/config/i386/i386.opt
===================================================================
--- gcc/config/i386/i386.opt	(revision 149140)
+++ gcc/config/i386/i386.opt	(working copy)
@@ -358,3 +358,7 @@
 msse2avx
 Target Report Var(ix86_sse2avx)
 Encode SSE instructions with VEX prefix
+
+mmcount-nofp
+Target Report Var(ix86_mcount_nofp)
+Support function profiling without frame pointer
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 149140)
+++ gcc/config/i386/i386.c	(working copy)
@@ -3413,6 +3413,9 @@
     target_flags |= MASK_CLD & ~target_flags_explicit;
 #endif
 
+  if (flag_omit_frame_pointer && profile_flag && ix86_mcount_nofp)
+    targetm.profile_before_prologue = true;
+
   /* Save the initial options in case the user does function specific options */
   if (main_args_p)
     target_option_default_node = target_option_current_node
@@ -7442,7 +7445,7 @@
 	  || ix86_current_function_calls_tls_descriptor))
     return true;
 
-  if (crtl->profile)
+  if (crtl->profile && !(targetm.profile_before_prologue && ix86_mcount_nofp))
     return true;
 
   return false;
@@ -27364,6 +27367,11 @@
 void
 x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
 {
+  const char *name = MCOUNT_NAME;
+
+  if (targetm.profile_before_prologue && ix86_mcount_nofp)
+    name = "__mcount_nofp";
+
   if (TARGET_64BIT)
     {
 #ifndef NO_PROFILE_COUNTERS
@@ -27371,9 +27379,9 @@
 #endif
 
       if (DEFAULT_ABI == SYSV_ABI && flag_pic)
-	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", MCOUNT_NAME);
+	fprintf (file, "\tcall\t*%s@GOTPCREL(%%rip)\n", name);
       else
-	fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
+	fprintf (file, "\tcall\t%s\n", name);
     }
   else if (flag_pic)
     {
@@ -27381,7 +27389,7 @@
       fprintf (file, "\tleal\t%sP%d@GOTOFF(%%ebx),%%%s\n",
 	       LPREFIX, labelno, PROFILE_COUNT_REGISTER);
 #endif
-      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", MCOUNT_NAME);
+      fprintf (file, "\tcall\t*%s@GOT(%%ebx)\n", name);
     }
   else
     {
@@ -27389,7 +27397,7 @@
       fprintf (file, "\tmovl\t$%sP%d,%%%s\n", LPREFIX, labelno,
 	       PROFILE_COUNT_REGISTER);
 #endif
-      fprintf (file, "\tcall\t%s\n", MCOUNT_NAME);
+      fprintf (file, "\tcall\t%s\n", name);
     }
 }
 

-- 
ak@linux.intel.com -- Speaking for myself only.