public inbox for linux-kernel@vger.kernel.org
* Re: [PATCH] x86: do not allow to optimize flag_is_changeable_p()
@ 2008-09-30  8:27 krzysztof.h1
  2008-09-30 15:23 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 7+ messages in thread
From: krzysztof.h1 @ 2008-09-30  8:27 UTC
  To: Jeremy Fitzhardinge, Yinghai Lu
  Cc: Krzysztof Helt, linux-kernel@vger.kernel.org, Ingo Molnar,
	Thomas Gleixner, H. Peter Anvin

> Yinghai Lu wrote:
> > On Mon, Sep 29, 2008 at 11:14 PM, Jeremy Fitzhardinge <jeremy@goop.org>
> wrote:
> >   
> >> Krzysztof Helt wrote:
> >>     
> >>> From: Krzysztof Helt <krzysztof.h1@wp.pl>
> >>>
> >>> flag_is_changeable_p() is used by
> >>> have_cpuid_p(), which can return different results
> >>> at the two call sites in the code sequence below:
> >>>
> >>>  if (!have_cpuid_p())
> >>>       identify_cpu_without_cpuid(c);
> >>>
> >>>   /* cyrix could have cpuid enabled via c_identify()*/
> >>>   if (!have_cpuid_p())
> >>>       return;
> >>>
> >>> Without "asm volatile", gcc 3.4.6 merges these two calls
> >>> into one, which breaks the code. Cyrix cpus have the
> >>> CPUID instruction enabled by then, but this is not
> >>> detected because of the gcc optimization.
> >>> Thus the ARR registers (mtrr-like) are not detected
> >>> on such a cpu.
> >>>
> >>>       
> >> If "asm volatile" changes the code and fixes the bug, it seems like
> >> you're making use of an undocumented - or at least non-portable -
> behaviour.
> >>

Why do you call it undocumented? This information can be found with "info gcc" in the Extended Asm section:

If your assembler instructions access memory in an unpredictable
fashion, add `memory' to the list of clobbered registers.  This will
cause GCC to not keep memory values cached in registers across the
assembler instruction and not optimize stores or loads to that memory.
You will also want to add the `volatile' keyword if the memory affected
is not listed in the inputs or outputs of the `asm', as the `memory'
clobber does not count as a side-effect of the `asm'.  If you know how
large the accessed memory is, you can add it as input or output but if
this is not known, you should add `memory'.

> >> Does adding a "memory" clobber also fix the problem?  That would have
> >> better defined characteristics.
> >>

A changeable flag bit is hardly a memory side effect. IMO, the volatile attribute is better, as it says that each evaluation may give a different result even though the inputs and outputs are the same.
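
To make the two spellings under discussion concrete, here is a minimal sketch. It is not the kernel code: the asm template is left empty so that it builds on any GCC or Clang target, and the names probe_volatile(), probe_memory_clobber() and probe_demo() are made up for illustration. It only shows where "volatile" and the "memory" clobber go. It does not reproduce the miscompile, which depends on the compiler version and optimization level.

```c
#include <assert.h>

/*
 * Hypothetical helper: "asm volatile" form. The volatile keyword asks the
 * compiler not to delete this statement or combine it with an identical one.
 */
unsigned int probe_volatile(unsigned int in)
{
	unsigned int out;

	/* Empty template: "out" simply receives the register that held "in". */
	__asm__ volatile ("" : "=r" (out) : "0" (in));
	return out;
}

/*
 * Hypothetical helper: "memory" clobber form. The clobber tells the compiler
 * that the statement may read or write arbitrary memory.
 */
unsigned int probe_memory_clobber(unsigned int in)
{
	unsigned int out;

	__asm__ ("" : "=r" (out) : "0" (in) : "memory");
	return out;
}

/* Usage: with an empty template both helpers return their input unchanged. */
void probe_demo(void)
{
	assert(probe_volatile(1u << 21) == (1u << 21));
	assert(probe_memory_clobber(1u << 18) == (1u << 18));
}
```

In the real flag_is_changeable_p() the template is the pushfl/popfl sequence, and the question in this thread is only which of the two annotations describes its hidden dependency on CPU state better.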

> 
> The trouble is that flag_is_changeable_p() doesn't have any obvious
> global dependencies; it just takes a constant argument and returns a
> result.   The asm() needs to be updated to have a "memory" constraint as
> a stand-in for the specific constraint of "cpu has switched into
> cpuid-supporting state".
> 

See above about adding the memory constraint.

Kind regards,
Krzysztof

* [PATCH] x86: do not allow to optimize flag_is_changeable_p()
@ 2008-09-29 18:06 Krzysztof Helt
  2008-09-29 18:17 ` H. Peter Anvin
  2008-09-30  6:14 ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 7+ messages in thread
From: Krzysztof Helt @ 2008-09-29 18:06 UTC
  To: linux-kernel, Ingo Molnar; +Cc: Thomas Gleixner, H. Peter Anvin, Yinghai Lu

From: Krzysztof Helt <krzysztof.h1@wp.pl>

flag_is_changeable_p() is used by
have_cpuid_p(), which can return different results
at the two call sites in the code sequence below:

  if (!have_cpuid_p())
      identify_cpu_without_cpuid(c);

  /* cyrix could have cpuid enabled via c_identify() */
  if (!have_cpuid_p())
      return;

Without "asm volatile", gcc 3.4.6 merges these two calls
into one, which breaks the code. Cyrix cpus have the
CPUID instruction enabled by then, but this is not
detected because of the gcc optimization.
Thus the ARR registers (mtrr-like) are not detected
on such a cpu.

Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl>
---

I have tested this on a 6x86MX cpu with CPUID
disabled, using the linux-next tree (20080819)
and Yinghai Lu's patch:

x86: identify_cpu_without_cpuid v2

http://marc.info/?l=linux-kernel&m=122138380004347&w=2

The patch below is required for the patch
above to work correctly.

diff -urp linux-orig/arch/x86/kernel/cpu/common.c linux-2.6.27/arch/x86/kernel/cpu/common.c
--- linux-orig/arch/x86/kernel/cpu/common.c	2008-09-29 07:11:54.000000000 +0200
+++ linux-2.6.27/arch/x86/kernel/cpu/common.c	2008-09-29 18:07:27.667392725 +0200
@@ -124,18 +124,18 @@ static inline int flag_is_changeable_p(u
 {
 	u32 f1, f2;
 
-	asm("pushfl\n\t"
-	    "pushfl\n\t"
-	    "popl %0\n\t"
-	    "movl %0,%1\n\t"
-	    "xorl %2,%0\n\t"
-	    "pushl %0\n\t"
-	    "popfl\n\t"
-	    "pushfl\n\t"
-	    "popl %0\n\t"
-	    "popfl\n\t"
-	    : "=&r" (f1), "=&r" (f2)
-	    : "ir" (flag));
+	asm volatile ("pushfl\n\t"
+		      "pushfl\n\t"
+		      "popl %0\n\t"
+		      "movl %0,%1\n\t"
+		      "xorl %2,%0\n\t"
+		      "pushl %0\n\t"
+		      "popfl\n\t"
+		      "pushfl\n\t"
+		      "popl %0\n\t"
+		      "popfl\n\t"
+		      : "=&r" (f1), "=&r" (f2)
+		      : "ir" (flag));
 
 	return ((f1^f2) & flag) != 0;
 }
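
The effect of merging the two have_cpuid_p() calls can be modelled in plain C. This is only a sketch: struct model_cpu, the model_* functions and the merge_checks switch are hypothetical. They stand in for the real CPU state and for what the optimizer does, and none of them exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a CPU that boots with CPUID disabled. */
struct model_cpu {
	bool cpuid_enabled;
};

/* Stand-in for have_cpuid_p(): reports the current CPU state. */
bool model_have_cpuid_p(const struct model_cpu *cpu)
{
	return cpu->cpuid_enabled;
}

/* Stand-in for identify_cpu_without_cpuid(): Cyrix setup turns CPUID on. */
void model_identify_without_cpuid(struct model_cpu *cpu)
{
	cpu->cpuid_enabled = true;
}

/*
 * Mirrors the code sequence from the patch description. Returns true if
 * identification would go on to use CPUID. With merge_checks set, the second
 * check reuses the first result, which is what the merged code does.
 */
bool model_identify(struct model_cpu *cpu, bool merge_checks)
{
	bool first, second;

	first = model_have_cpuid_p(cpu);
	if (!first)
		model_identify_without_cpuid(cpu);

	second = merge_checks ? first : model_have_cpuid_p(cpu);
	if (!second)
		return false;

	return true;
}

/* Usage: the same starting state gives different outcomes. */
void model_demo(void)
{
	struct model_cpu merged = { .cpuid_enabled = false };
	struct model_cpu separate = { .cpuid_enabled = false };

	assert(!model_identify(&merged, true));   /* stale result: bails out */
	assert(model_identify(&separate, false)); /* fresh check: goes on */
}
```

With merge_checks set, the function bails out even though the model CPU has CPUID enabled by then, which matches the symptom described above.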


Thread overview: 7+ messages
2008-09-30  8:27 [PATCH] x86: do not allow to optimize flag_is_changeable_p() krzysztof.h1
2008-09-30 15:23 ` Jeremy Fitzhardinge
  -- strict thread matches above, loose matches on Subject: below --
2008-09-29 18:06 Krzysztof Helt
2008-09-29 18:17 ` H. Peter Anvin
2008-09-30  6:14 ` Jeremy Fitzhardinge
2008-09-30  6:34   ` Yinghai Lu
2008-09-30  6:54     ` Jeremy Fitzhardinge
