* Andi, you broke my laptop :-)
@ 2007-05-09 19:56 Pete Zaitcev
2007-05-10 13:01 ` Andi Kleen
0 siblings, 1 reply; 7+ messages in thread
From: Pete Zaitcev @ 2007-05-09 19:56 UTC (permalink / raw)
To: ak; +Cc: zaitcev, linux-kernel, joerg.roedel
Hi, Andi:
The attached patch (actually, git show output) makes my Dell 1501 to hang
on boot. Sorry, I have no clue why... The culprit is found with git bisect.
But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% reproducible.
Cheers,
-- Pete
commit c5bcb5635a03da3158f121ae20ccbbf72b4fc62a
Author: Andi Kleen <ak@suse.de>
Date: Wed May 2 19:27:21 2007 +0200
[PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
RDTSCP is already synchronous and doesn't need an explicit CPUID.
This is a little faster and more importantly avoids VMEXITs on Hypervisors.
Original patch from Joerg Roedel, but reworked by AK
Also includes miscompilation fix by Eric Biederman
Cc: "Joerg Roedel" <joerg.roedel@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
diff --git a/include/asm-i386/tsc.h b/include/asm-i386/tsc.h
index 0181f9d..3f3c1fa 100644
--- a/include/asm-i386/tsc.h
+++ b/include/asm-i386/tsc.h
@@ -38,6 +38,15 @@ static __always_inline cycles_t get_cycles_sync(void)
unsigned eax;
/*
+ * Use RDTSCP if possible; it is guaranteed to be synchronous
+ * and doesn't cause a VMEXIT on Hypervisors
+ */
+ alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP,
+ "=A" (ret), "0" (0ULL) : "ecx", "memory");
+ if (ret)
+ return ret;
+
+ /*
* Don't do an additional sync on CPUs where we know
* RDTSC is already synchronous:
*/
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: Andi, you broke my laptop :-) 2007-05-09 19:56 Andi, you broke my laptop :-) Pete Zaitcev @ 2007-05-10 13:01 ` Andi Kleen 2007-05-10 13:35 ` Joerg Roedel 0 siblings, 1 reply; 7+ messages in thread From: Andi Kleen @ 2007-05-10 13:01 UTC (permalink / raw) To: Pete Zaitcev; +Cc: ak, linux-kernel, joerg.roedel On Wed, May 09, 2007 at 12:56:16PM -0700, Pete Zaitcev wrote: > Hi, Andi: > > The attached patch (actually, git show output) makes my Dell 1501 to hang > on boot. Sorry, I have no clue why... The culprit is found with git bisect. > But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% reproducible. MK-36? Does it have SVM? Anyways we previously had issues with this being miscompiled, but I thought the latest patch should have been ok. What compiler do you use? Can you send me a disassembly listing of arch/x86_64/kernel/time.o? -Andi ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Andi, you broke my laptop :-) 2007-05-10 13:01 ` Andi Kleen @ 2007-05-10 13:35 ` Joerg Roedel 2007-05-10 14:19 ` Andi Kleen 2007-05-10 16:31 ` Pete Zaitcev 0 siblings, 2 replies; 7+ messages in thread From: Joerg Roedel @ 2007-05-10 13:35 UTC (permalink / raw) To: Andi Kleen; +Cc: Pete Zaitcev, ak, linux-kernel, Benny Halevy, akpm [-- Attachment #1: Type: text/plain, Size: 1425 bytes --] On Thu, May 10, 2007 at 03:01:44PM +0200, Andi Kleen wrote: > On Wed, May 09, 2007 at 12:56:16PM -0700, Pete Zaitcev wrote: > > Hi, Andi: > > > > The attached patch (actually, git show output) makes my Dell 1501 to hang > > on boot. Sorry, I have no clue why... The culprit is found with git bisect. > > But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% reproducible. > > MK-36? Does it have SVM? > > Anyways we previously had issues with this being miscompiled, but > I thought the latest patch should have been ok. What compiler do you use? > > Can you send me a disassembly listing of arch/x86_64/kernel/time.o? I debugged this problem a bit and my compiler[1]interprets the =A constraint as %rax instead of %edx:%eax on x86_64 which causes the problem. The appended patch provides a workaround for this and fixed the hang on my machine. [1] gcc version 4.1.3 20070429 (prerelease) (Debian 4.1.2-5) Cc: Andi Kleen <ak@suse.de> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> -- | AMD Saxony Limited Liability Company & Co. KG Operating | Wilschdorfer Landstr. 101, 01109 Dresden, Germany System | Register Court Dresden: HRA 4896 Research | General Partner authorized to represent: Center | AMD Saxony LLC (Wilmington, Delaware, US) | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy [-- Attachment #2: x86_64_get_cycles_sync_fix.patch --] [-- Type: text/plain, Size: 1903 bytes --] diff --git a/include/asm-i386/alternative.h b/include/asm-i386/alternative.h index 0f70b37..eb7da54 100644 --- a/include/asm-i386/alternative.h +++ b/include/asm-i386/alternative.h @@ -98,6 +98,12 @@ static inline void alternatives_smp_switch(int smp) {} ".previous" : output : [feat] "i" (feature), ##input) /* + * use this macro(s) if you need more than one output parameter + * in alternative_io + */ +#define ASM_OUTPUT2(a, b) a, b + +/* * Alternative inline assembly for SMP. * * The LOCK_PREFIX macro defined here replaces the LOCK and diff --git a/include/asm-i386/tsc.h b/include/asm-i386/tsc.h index 3f3c1fa..62c091f 100644 --- a/include/asm-i386/tsc.h +++ b/include/asm-i386/tsc.h @@ -35,14 +35,16 @@ static inline cycles_t get_cycles(void) static __always_inline cycles_t get_cycles_sync(void) { unsigned long long ret; - unsigned eax; + unsigned eax, edx; /* * Use RDTSCP if possible; it is guaranteed to be synchronous * and doesn't cause a VMEXIT on Hypervisors */ alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP, - "=A" (ret), "0" (0ULL) : "ecx", "memory"); + ASM_OUTPUT2("=a" (eax), "=d" (edx)), + "a" (0U), "d" (0U) : "ecx", "memory"); + ret = (((unsigned long long)edx) << 32) | ((unsigned long long)eax); if (ret) return ret; diff --git a/include/asm-x86_64/alternative.h b/include/asm-x86_64/alternative.h index a09fe85..a094276 100644 --- a/include/asm-x86_64/alternative.h +++ b/include/asm-x86_64/alternative.h @@ -103,6 +103,12 @@ static inline void alternatives_smp_switch(int smp) {} ".previous" : output : [feat] "i" (feature), ##input) /* + * use this macro(s) if you need more than one output parameter + * in alternative_io + */ +#define ASM_OUTPUT2(a, b) a, b + +/* * Alternative inline assembly for SMP. * * The LOCK_PREFIX macro defined here replaces the LOCK and ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: Andi, you broke my laptop :-) 2007-05-10 13:35 ` Joerg Roedel @ 2007-05-10 14:19 ` Andi Kleen 2007-05-10 14:25 ` Benny Halevy 2007-05-10 16:31 ` Pete Zaitcev 1 sibling, 1 reply; 7+ messages in thread From: Andi Kleen @ 2007-05-10 14:19 UTC (permalink / raw) To: Joerg Roedel Cc: Andi Kleen, Pete Zaitcev, ak, linux-kernel, Benny Halevy, akpm On Thu, May 10, 2007 at 03:35:56PM +0200, Joerg Roedel wrote: > On Thu, May 10, 2007 at 03:01:44PM +0200, Andi Kleen wrote: > > On Wed, May 09, 2007 at 12:56:16PM -0700, Pete Zaitcev wrote: > > > Hi, Andi: > > > > > > The attached patch (actually, git show output) makes my Dell 1501 to hang > > > on boot. Sorry, I have no clue why... The culprit is found with git bisect. > > > But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% reproducible. > > > > MK-36? Does it have SVM? > > > > Anyways we previously had issues with this being miscompiled, but > > I thought the latest patch should have been ok. What compiler do you use? > > > > Can you send me a disassembly listing of arch/x86_64/kernel/time.o? > > I debugged this problem a bit and my compiler[1]interprets the =A > constraint as %rax instead of %edx:%eax on x86_64 which causes the > problem. The appended patch provides a workaround for this and fixed the > hang on my machine. Hmm yes now I can reproduce it too. I didn't see any hangs so i suppose my (and that of most -mm tester's) compiled binary always happened to have a suitable value in edx Thanks for the patch. -Andi ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Andi, you broke my laptop :-) 2007-05-10 14:19 ` Andi Kleen @ 2007-05-10 14:25 ` Benny Halevy 0 siblings, 0 replies; 7+ messages in thread From: Benny Halevy @ 2007-05-10 14:25 UTC (permalink / raw) To: Andi Kleen, Joerg Roedel; +Cc: Pete Zaitcev, ak, linux-kernel, akpm Joerg's patch works for me too. Thanks, Benny Andi Kleen wrote: > On Thu, May 10, 2007 at 03:35:56PM +0200, Joerg Roedel wrote: >> On Thu, May 10, 2007 at 03:01:44PM +0200, Andi Kleen wrote: >>> On Wed, May 09, 2007 at 12:56:16PM -0700, Pete Zaitcev wrote: >>>> Hi, Andi: >>>> >>>> The attached patch (actually, git show output) makes my Dell 1501 to hang >>>> on boot. Sorry, I have no clue why... The culprit is found with git bisect. >>>> But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% reproducible. >>> MK-36? Does it have SVM? >>> >>> Anyways we previously had issues with this being miscompiled, but >>> I thought the latest patch should have been ok. What compiler do you use? >>> >>> Can you send me a disassembly listing of arch/x86_64/kernel/time.o? >> I debugged this problem a bit and my compiler[1]interprets the =A >> constraint as %rax instead of %edx:%eax on x86_64 which causes the >> problem. The appended patch provides a workaround for this and fixed the >> hang on my machine. > > Hmm yes now I can reproduce it too. I didn't see any hangs so i suppose > my (and that of most -mm tester's) compiled binary always happened to have a > suitable value in edx > > Thanks for the patch. > > -Andi > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Benny Halevy Panasas Inc. Accelerating Time to Results(TM) with Clustered Storage www.panasas.com bhalevy@panasas.com Tel/Fax: +972-3-647-8340 Mobile: +972-54-802-8340 ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: Andi, you broke my laptop :-) 2007-05-10 13:35 ` Joerg Roedel 2007-05-10 14:19 ` Andi Kleen @ 2007-05-10 16:31 ` Pete Zaitcev 1 sibling, 0 replies; 7+ messages in thread From: Pete Zaitcev @ 2007-05-10 16:31 UTC (permalink / raw) To: Joerg Roedel; +Cc: Andi Kleen, ak, linux-kernel, Benny Halevy, akpm, zaitcev On Thu, 10 May 2007 15:35:56 +0200, "Joerg Roedel" <joerg.roedel@amd.com> wrote: > I debugged this problem a bit and my compiler[1]interprets the =A > constraint as %rax instead of %edx:%eax on x86_64 which causes the > problem. The appended patch provides a workaround for this and fixed the > hang on my machine. > > [1] gcc version 4.1.3 20070429 (prerelease) (Debian 4.1.2-5) > alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP, > - "=A" (ret), "0" (0ULL) : "ecx", "memory"); > + ASM_OUTPUT2("=a" (eax), "=d" (edx)), > + "a" (0U), "d" (0U) : "ecx", "memory"); This works for me. Thanks, Joerg. gcc version 4.1.2 20070424 (Red Hat 4.1.2-11) -- Pete ^ permalink raw reply [flat|nested] 7+ messages in thread
[parent not found: <4642E8AA.7060501@panasas.com>]
* Re: Andi, you broke my laptop :-) [not found] <4642E8AA.7060501@panasas.com> @ 2007-05-10 10:12 ` Benny Halevy 0 siblings, 0 replies; 7+ messages in thread From: Benny Halevy @ 2007-05-10 10:12 UTC (permalink / raw) To: Pete Zaitcev, Andi Kleen; +Cc: linux-kernel Andy, Pete, this patch also causes our test machines to hang hard during boot. x86_64 smp kernel, single cpu Athlon 64 machine, cpuinfo below. Benny $ cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 79 model name : AMD Athlon(tm) 64 Processor 3800+ stepping : 2 cpu MHz : 2412.397 cache size : 512 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow up pni cx16 lahf_lm svm cr8_legacy bogomips : 4828.89 TLB size : 1024 4K pages clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc Pete Zaitcev <zaitcev@redhat.com> wrote: > > Hi, Andi: > > The attached patch (actually, git show output) makes my Dell 1501 to hang > on boot. Sorry, I have no clue why... The culprit is found with git bisect. > But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% reproducible. > > Cheers, > -- Pete > > commit c5bcb5635a03da3158f121ae20ccbbf72b4fc62a > Author: Andi Kleen <ak@suse.de> > Date: Wed May 2 19:27:21 2007 +0200 > > [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible > > RDTSCP is already synchronous and doesn't need an explicit CPUID. > This is a little faster and more importantly avoids VMEXITs on Hypervisors. > > Original patch from Joerg Roedel, but reworked by AK > Also includes miscompilation fix by Eric Biederman > > Cc: "Joerg Roedel" <joerg.roedel@amd.com> > > Signed-off-by: Andi Kleen <ak@suse.de> > > diff --git a/include/asm-i386/tsc.h b/include/asm-i386/tsc.h > index 0181f9d..3f3c1fa 100644 > --- a/include/asm-i386/tsc.h > +++ b/include/asm-i386/tsc.h > @@ -38,6 +38,15 @@ static __always_inline cycles_t get_cycles_sync(void) > unsigned eax; > > /* > + * Use RDTSCP if possible; it is guaranteed to be synchronous > + * and doesn't cause a VMEXIT on Hypervisors > + */ > + alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP, > + "=A" (ret), "0" (0ULL) : "ecx", "memory"); > + if (ret) > + return ret; > + > + /* > * Don't do an additional sync on CPUs where we know > * RDTSC is already synchronous: > */ > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2007-05-10 16:39 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-09 19:56 Andi, you broke my laptop :-) Pete Zaitcev
2007-05-10 13:01 ` Andi Kleen
2007-05-10 13:35 ` Joerg Roedel
2007-05-10 14:19 ` Andi Kleen
2007-05-10 14:25 ` Benny Halevy
2007-05-10 16:31 ` Pete Zaitcev
[not found] <4642E8AA.7060501@panasas.com>
2007-05-10 10:12 ` Benny Halevy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox