public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* Re: BUG: GCC-4.4.x changes the function frame on some functions
@ 2009-11-19 21:14 H. Peter Anvin
  2009-11-19 21:25 ` Jeff Law
  0 siblings, 1 reply; 68+ messages in thread
From: H. Peter Anvin @ 2009-11-19 21:14 UTC (permalink / raw)
  To: Jeff Law
  Cc: rostedt, David Daney, Linus Torvalds, Andrew Haley,
	Richard Guenther, Thomas Gleixner, Ingo Molnar, LKML,
	Andrew Morton, Heiko Carstens, feng.tang, Fr??d??ric Weisbecker,
	Peter Zijlstra, jakub, gcc

[-- Attachment #1: Type: text/plain, Size: 728 bytes --]

Hence a new unconstrained option...

"Jeff Law" <law@redhat.com> wrote:

>On 11/19/09 12:50, H. Peter Anvin wrote:
>>
>> Calling the profiler immediately at the entry point is clearly the more
>> sane option.  It means the ABI is well-defined, stable, and independent
>> of what the actual function contents are.  It means that ABI isn't the
>> normal C ABI (the __fentry__ function would have to preserve all
>> registers), but that's fine...
>>    
>Note there are targets (even some old x86 variants) that required the 
>profiling calls to occur after the prologue.  Unfortunately, nobody 
>documented *why* that  was the case.   Sigh.
>
>Jeff

--
Sent from my mobile phone. Please excuse any lack of formatting.

^ permalink raw reply	[flat|nested] 68+ messages in thread
* Re: BUG: GCC-4.4.x changes the function frame on some functions
@ 2009-11-19 20:48 H. Peter Anvin
  0 siblings, 0 replies; 68+ messages in thread
From: H. Peter Anvin @ 2009-11-19 20:48 UTC (permalink / raw)
  To: Frederic Weisbecker, Steven Rostedt
  Cc: David Daney, Linus Torvalds, Andrew Haley, Richard Guenther,
	Thomas Gleixner, Ingo Molnar, LKML, Andrew Morton, Heiko Carstens,
	feng.tang, Peter Zijlstra, jakub, gcc

[-- Attachment #1: Type: text/plain, Size: 2814 bytes --]

On i386, if we call __fentry__ immediately on entry the return address will be in 4(%esp), so I fail to see how you could not reliably have the return address.  Other arches would have different constraints, of course.

"Frederic Weisbecker" <fweisbec@gmail.com> wrote:

>On Thu, Nov 19, 2009 at 03:05:41PM -0500, Steven Rostedt wrote:
>> On Thu, 2009-11-19 at 20:46 +0100, Frederic Weisbecker wrote:
>> > On Thu, Nov 19, 2009 at 02:28:06PM -0500, Steven Rostedt wrote:
>> 
>> > > 	<function>:
>> > > 		call __fentry__
>> > > 		[...]
>> > > 
>> > > 	
>> > > -- Steve
>> > 
>> > 
>> > I would really like this. So that we can forget about other possible
>> > further suprises due to sophisticated function prologues beeing before
>> > the mcount call.
>> > 
>> > And I guess that would fix it in every archs.
>> 
>> Well, other archs use a register to store the return address. But it
>> would also be easy to do (pseudo arch assembly):
>> 
>> 	<function>:
>> 		mov lr, (%sp)
>> 		add 8, %sp
>> 		blr __fentry__
>> 		sub 8, %sp
>> 		mov (%sp), lr
>> 
>> 
>> That way the lr would have the current function, and the parent would
>> still be at 8(%sp)
>> 
>
>
>Yeah right, we need at least such very tiny prologue for
>archs that store return addresses in a reg.
>
>	
>> > 
>> > That said, Linus had a good point about the fact there might other uses
>> > of mcount even more tricky than what does the function graph tracer,
>> > outside the kernel, and those may depend on the strict ABI assumption
>> > that 4(ebp) is always the _real_ return address, and that through all
>> > the previous stack call. This is even a concern that extrapolates the
>> > single mcount case.
>> 
>> As I am proposing a new call. This means that mcount stay as is for
>> legacy reasons. Yes I know there exists the -finstrument-functions but
>> that adds way too much bloat to the code. One single call to the
>> profiler is all I want.
>
>
>Sure, the purpose is not to change the existing -mcount thing.
>What I meant is that we could have -mcount and -real-ra-before-fp
>at the same time to guarantee fp + 4 is really what we want while
>using -mcount.
>
>The __fentry__ idea is more neat, but the guarantee of a real pointer
>to the return address is still something that lacks.
>
>
>> > 
>> > So I wonder that actually the real problem is the lack of something that
>> > could provide this guarantee. We may need a -real-ra-before-fp (yeah
>> > I suck in naming).
>> 
>> Don't worry, so do the C compiler folks, I mean, come on "mcount"?
>
>
>I guess it has been first created for the single purpose of counting
>specific functions but then it has been used for wider, unpredicted uses :)
>

--
Sent from my mobile phone. Please excuse any lack of formatting.

^ permalink raw reply	[flat|nested] 68+ messages in thread
* Re: [patch for 2.6.32? 1/3] hrtimers: remove the "timer_stats_active" check when setting the start info
@ 2009-11-18 19:30 Thomas Gleixner
  2009-11-18 20:24 ` [tip:timers/urgent] hrtimer: Fix /proc/timer_list regression tip-bot for Feng Tang
  0 siblings, 1 reply; 68+ messages in thread
From: Thomas Gleixner @ 2009-11-18 19:30 UTC (permalink / raw)
  To: akpm
  Cc: Ingo Molnar, John Stultz, feng.tang, Peter Zijlstra,
	Heiko Carstens, LKML, Arjan van de Ven

On Tue, 17 Nov 2009, akpm@linux-foundation.org wrote:

Back to LKML with some more cc's.

> From: Feng Tang <feng.tang@intel.com>
> 
> Recent hrtimer code will set the start info to a hrtimer only when that
> flag is set, then the start info of all hrtimers will always be
> uninitialised before a "echo 1 > /proc/timer_stats", thus the
> /proc/timer_lists will have something like:
> 
> active timers:
>  #0: <c27d46b0>, tick_sched_timer, S:01, <(null)>, /-1
>  # expires at 91062000000-91062000000 nsecs [in 156071 to 156071 nsecs]
>  #1: <efb81b6c>, hrtimer_wakeup, S:01, <(null)>, /-1
>  # expires at 91062300331-91062350331 nsecs [in 456402 to 506402 nsecs]
>  #2: <efac9b6c>, hrtimer_wakeup, S:01, <(null)>, /-1
>  # expires at 91068699811-91068749811 nsecs [in 6855882 to 6905882 nsecs]
>  #3: <efacdb6c>, hrtimer_wakeup, S:01, <(null)>, /-1
>  # expires at 91068755511-91068805511 nsecs [in 6911582 to 6961582 nsecs]
>  #4: <efa95b6c>, hrtimer_wakeup, S:01, <(null)>, /-1
>  # expires at 91068806066-91068856066 nsecs [in 6962137 to 7012137 nsecs]
>  .....
> 
> This patch fixes it.

This patch does more than that and it does not mention it in the
change log. The change log is also missing the context which
introduced this regression.

Just to get the context straight:

commit 507e1231 (timer stats: Optimize by adding quick check to avoid
function calls) commit 507e1231 added the timer_stats_active check in
timer_stats_hrtimer_set_start_info() to reduce the overhead when timer
stats are inactive.

As an unintentional side effect it introduced the above regression in
/proc/timer_list.

>  static inline void timer_stats_hrtimer_set_start_info(struct hrtimer *timer)
>  {
> -	if (likely(!timer_stats_active))
> -		return;

This is fine. It reverts the hrtimer part of commit 507e1231 and fixes
the /proc/timer_list regression for the price of the same overhead
which we had before that commit.

That's the immediate fix which needs to go to Linus and stable.

>  	__timer_stats_hrtimer_set_start_info(timer, __builtin_return_address(0));
>  }
>  
> diff -puN kernel/hrtimer.c~hrtimers-remove-the-timer_stats_active-check-when-setting-the-start-info kernel/hrtimer.c
> --- a/kernel/hrtimer.c~hrtimers-remove-the-timer_stats_active-check-when-setting-the-start-info
> +++ a/kernel/hrtimer.c
> @@ -750,7 +750,7 @@ static inline void hrtimer_init_timer_hr
>  #ifdef CONFIG_TIMER_STATS
>  void __timer_stats_hrtimer_set_start_info(struct hrtimer *timer, void *addr)
>  {
> -	if (timer->start_site)
> +	if (timer->start_site == addr && timer->start_pid == current->pid)

This part is unrelated to the above regression and is wrong for the
following reasons:

  1) it runs the memcpy when the start_site changes even when the pid
     is the same, which is extremly bad for tick->sched_timer as this
     timer is re(armed) frequently from different places when NOHZ=y.

  2) it runs the memcpy when the pid changes which is even worse for
     tick->sched_timer as this timer is re(armed) frequently from
     hrtimer_interrupt which hits random pids and would therefor
     report completely wrong pid/comm info.

This needs more thought and is definitely neither -rc7 nor stable
material.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 68+ messages in thread

end of thread, other threads:[~2009-11-23  9:54 UTC | newest]

Thread overview: 68+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-11-19 21:14 BUG: GCC-4.4.x changes the function frame on some functions H. Peter Anvin
2009-11-19 21:25 ` Jeff Law
2009-11-19 22:43   ` Steven Rostedt
2009-11-19 23:58     ` Jeff Law
2009-11-20  0:36       ` Thomas Gleixner
2009-11-20  0:59         ` Linus Torvalds
2009-11-20  1:27           ` Thomas Gleixner
2009-11-20  2:14             ` Thomas Gleixner
2009-11-20 13:09             ` [tip:x86/urgent] x86: Prevent GCC 4.4.x (pentium-mmx et al) function prologue wreckage tip-bot for Thomas Gleixner
2009-11-20  1:29           ` BUG: GCC-4.4.x changes the function frame on some functions H. Peter Anvin
2009-11-20  5:36           ` Ingo Molnar
2009-11-20 12:04         ` Andrew Haley
2009-11-20 12:22           ` Andrew Haley
  -- strict thread matches above, loose matches on Subject: below --
2009-11-19 20:48 H. Peter Anvin
2009-11-18 19:30 [patch for 2.6.32? 1/3] hrtimers: remove the "timer_stats_active" check when setting the start info Thomas Gleixner
2009-11-18 20:24 ` [tip:timers/urgent] hrtimer: Fix /proc/timer_list regression tip-bot for Feng Tang
2009-11-19  7:20   ` Ingo Molnar
2009-11-19 10:05     ` Thomas Gleixner
2009-11-19 14:30       ` BUG: function graph tracer function frame assumptions Thomas Gleixner
2009-11-19 15:37         ` BUG: GCC-4.4.x changes the function frame on some functions Thomas Gleixner
2009-11-19 15:44           ` Andrew Haley
2009-11-19 15:54             ` H. Peter Anvin
2009-11-19 15:57               ` Richard Guenther
2009-11-19 16:02             ` Steven Rostedt
2009-11-19 16:11               ` H. Peter Anvin
2009-11-19 16:19               ` Frederic Weisbecker
2009-11-19 16:06             ` Thomas Gleixner
2009-11-19 16:17               ` Andrew Haley
2009-11-19 16:43                 ` Thomas Gleixner
2009-11-19 16:12             ` Steven Rostedt
2009-11-19 15:45           ` H. Peter Anvin
2009-11-19 15:49             ` Richard Guenther
2009-11-19 15:52               ` Richard Guenther
2009-11-19 17:37               ` Andi Kleen
2009-11-19 17:39           ` Linus Torvalds
2009-11-19 17:51             ` Thomas Gleixner
2009-11-19 17:59             ` Steven Rostedt
2009-11-19 18:03               ` Richard Guenther
2009-11-19 18:22                 ` Andrew Haley
2009-11-19 18:41                   ` Linus Torvalds
2009-11-19 18:43                     ` Linus Torvalds
2009-11-19 18:54                       ` Linus Torvalds
2009-11-19 19:01                         ` Thomas Gleixner
2009-11-23  9:16                           ` Jakub Jelinek
2009-11-23  9:51                             ` Thomas Gleixner
2009-11-19 19:10                         ` David Daney
2009-11-19 19:28                           ` Steven Rostedt
2009-11-19 19:46                             ` Frederic Weisbecker
2009-11-19 19:54                               ` Kai Tietz
2009-11-19 20:05                                 ` Frederic Weisbecker
2009-11-19 20:05                               ` Steven Rostedt
2009-11-19 20:17                                 ` Steven Rostedt
2009-11-19 20:28                                   ` Frederic Weisbecker
2009-11-19 20:25                                 ` Frederic Weisbecker
2009-11-19 20:36                                   ` Linus Torvalds
2009-11-19 20:44                                     ` Steven Rostedt
2009-11-19 19:50                             ` H. Peter Anvin
2009-11-19 20:06                               ` Linus Torvalds
2009-11-19 21:12                                 ` Jeff Law
2009-11-19 20:10                               ` Steven Rostedt
2009-11-19 21:05                               ` Jeff Law
2009-11-19 18:31                 ` Thomas Gleixner
2009-11-19 18:38                 ` Linus Torvalds
2009-11-19 18:47                   ` Ingo Molnar
2009-11-19 19:06                     ` Steven Rostedt
2009-11-19 19:50                       ` Ingo Molnar
2009-11-19 20:36                   ` Thomas Gleixner
2009-11-19 18:20           ` Andrew Haley
2009-11-19 18:33             ` Steven Rostedt
2009-11-19 18:36               ` Andrew Pinski
2009-11-19 18:36               ` Andrew Haley
2009-11-19 18:37               ` H. Peter Anvin
2009-11-19 18:39             ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox