* cpuid_eax damages registers (2.4.7pre7) @ 2001-07-18 10:48 Julian Anastasov 2001-07-18 15:10 ` Linus Torvalds 0 siblings, 1 reply; 22+ messages in thread From: Julian Anastasov @ 2001-07-18 10:48 UTC (permalink / raw) To: linux-kernel Hello, I don't know whether cpuid_eax (2.4.7pre) should preserve the registers changed from cpuid but I have an oops on boot with 2.4.7pre7 in squash_the_stupid_serial_number where cpuid_eax changes ebx and the parameter "c" is loaded with "Genu". The following change fixes the problem: from: c->cpuid_level = cpuid_eax(0); to: unsigned int dummy; cpuid(0, &c->cpuid_level, &dummy, &dummy, &dummy); but I'm not sure in the definitions of these cpuid_XXX funcs. I see that they are used at many places. IMO, they have to preserve the registers. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 10:48 cpuid_eax damages registers (2.4.7pre7) Julian Anastasov @ 2001-07-18 15:10 ` Linus Torvalds 2001-07-18 17:08 ` Julian Anastasov 2001-07-21 23:45 ` Richard Henderson 0 siblings, 2 replies; 22+ messages in thread From: Linus Torvalds @ 2001-07-18 15:10 UTC (permalink / raw) To: ja, linux-kernel In article <Pine.LNX.4.10.10107181347030.16710-100000@l> you write: > > I don't know whether cpuid_eax (2.4.7pre) should preserve the >registers changed from cpuid It should. It has the proper "this instruction assigned values to these registers" stuff, so gcc should know which ones change. > but I have an oops on boot with 2.4.7pre7 in >squash_the_stupid_serial_number where cpuid_eax changes ebx and the >parameter "c" is loaded with "Genu". The following change fixes the >problem: Interesting. Can you do the following: - tell us your compiler version - do a "make arch/i386/kernel/setup.s" both ways, and show what squash_the_stupid_serial_number() looks like. - fix _all_ the "cpuid*()" functions to have :"0" (op) instead of their current incorrect :"a" (op) (we're supposed to explicitly tell the compiler that the first input is the same as the first output) - see if that makes any difference to the assembler output. In any case it does sound like a compiler bug, but it would be good to have a workaround. But it would also be good to have a more complete dump of the oops in question to see more about what is going on.. Thanks, Linus ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 15:10 ` Linus Torvalds @ 2001-07-18 17:08 ` Julian Anastasov 2001-07-18 17:21 ` Linus Torvalds 2001-07-21 23:45 ` Richard Henderson 1 sibling, 1 reply; 22+ messages in thread From: Julian Anastasov @ 2001-07-18 17:08 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel Hello, On Wed, 18 Jul 2001, Linus Torvalds wrote: > In article <Pine.LNX.4.10.10107181347030.16710-100000@l> you write: > > > > I don't know whether cpuid_eax (2.4.7pre) should preserve the > >registers changed from cpuid > > It should. It has the proper "this instruction assigned values to these > registers" stuff, so gcc should know which ones change. > > > but I have an oops on boot with 2.4.7pre7 in > >squash_the_stupid_serial_number where cpuid_eax changes ebx and the > >parameter "c" is loaded with "Genu". The following change fixes the > >problem: > > Interesting. Can you do the following: > > - tell us your compiler version root@l:~# gcc -v Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/specs gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release) root@l:~# make --version GNU Make version 3.77, by Richard Stallman and Roland McGrath. ... root@l:~# ld -v GNU ld version 2.9.1 (with BFD 2.9.1.0.24) root@l:~# insmod -V insmod version 2.4.6 it seems my binutils is older than the recommended 2.9.1.0.25 :( > - do a "make arch/i386/kernel/setup.s" both ways, and show what > squash_the_stupid_serial_number() looks like. Both with "a" and "0" (zero) gcc shows same output in setup.s (using cpuid_eax): .Lfe20: .size deep_magic_nexgen_probe,.Lfe20-deep_magic_nexgen_probe .section .rodata .align 32 .LC127: .string "<5>CPU serial number disabled.\n" .section .text.init .align 4 .type squash_the_stupid_serial_number,@function squash_the_stupid_serial_number: pushl %esi pushl %ebx movl 12(%esp),%ebx movl 12(%ebx),%eax testl $262144,%eax je .L2393 cmpl $0,disable_x86_serial_nr je .L2393 movl $281,%ecx #APP rdmsr #NO_APP movl %eax,%esi orl $2097152,%esi movl %esi,%eax #APP wrmsr #NO_APP pushl $.LC127 call printk addl $4,%esp #APP lock ; btrl $18,12(%ebx) #NO_APP xorl %eax,%eax #APP cpuid <<<-- cpuid_eax #NO_APP movl %eax,8(%ebx) <<<-- oops .L2393: popl %ebx popl %esi ret > - fix _all_ the "cpuid*()" functions to have > > :"0" (op) does not help in my case, other solutions? And without recompile anyone can see its vmlinux file too (disassemble squash_the_stupid_serial_number) Even by using cpuid(...) replacement the registers are not preserved but instead of ebx, the arg "c" is in ebp and the oops does not occur: 0xc029ef94 <squash_the_stupid_serial_number+68>: xor %eax,%eax 0xc029ef96 <squash_the_stupid_serial_number+70>: cpuid 0xc029ef98 <squash_the_stupid_serial_number+72>: mov %eax,0x8(%ebp) It seems the problem remains in cpuid_XXX funcs. > instead of their current incorrect > > :"a" (op) > > (we're supposed to explicitly tell the compiler that the first input > is the same as the first output) > > - see if that makes any difference to the assembler output. > > In any case it does sound like a compiler bug, but it would be good to > have a workaround. But it would also be good to have a more complete > dump of the oops in question to see more about what is going on.. After fixing this with cpuid() I can work perfectly with this kernel on network activity (45K packets/sec), SMP. So, only cpuid_xxx are suspected for now. I don't preserve the oopses but I can do it if the attached in another mail setup.s is not enough. This problem does not occur in 2.4.6 and below, I have tried 2.4.2 - 2.4.7pre7 with the same gcc, etc. It seems only in squash_XXX the ebx damage was noticed. Part from .config: # # Processor type and features # # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set CONFIG_M586TSC=y # CONFIG_M586MMX is not set # CONFIG_M686 is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUM4 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MCRUSOE is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP2 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MCYRIXIII is not set CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y # CONFIG_RWSEM_GENERIC_SPINLOCK is not set CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_X86_L1_CACHE_SHIFT=5 CONFIG_X86_USE_STRING_486=y CONFIG_X86_ALIGNMENT_16=y CONFIG_X86_TSC=y # CONFIG_TOSHIBA is not set # CONFIG_MICROCODE is not set # CONFIG_X86_MSR is not set # CONFIG_X86_CPUID is not set CONFIG_NOHIGHMEM=y # CONFIG_HIGHMEM4G is not set # CONFIG_HIGHMEM64G is not set # CONFIG_MATH_EMULATION is not set CONFIG_MTRR=y CONFIG_SMP=y CONFIG_HAVE_DEC_LOCK=y 2nd CPU (same as 1st, I don't see problems in the detection): processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 8 model name : Pentium III (Coppermine) stepping : 10 cpu MHz : 864.004 cache size : 256 KB fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 2 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse bogomips : 1723.59 > Thanks, > Linus Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 17:08 ` Julian Anastasov @ 2001-07-18 17:21 ` Linus Torvalds 2001-07-18 20:43 ` Kai Germaschewski 2001-07-18 22:04 ` Linus Torvalds 0 siblings, 2 replies; 22+ messages in thread From: Linus Torvalds @ 2001-07-18 17:21 UTC (permalink / raw) To: Julian Anastasov; +Cc: linux-kernel On Wed, 18 Jul 2001, Julian Anastasov wrote: > > gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release) Ok. That has been considered a "stable" gcc for a long time. We should try to find a work-around. > > - do a "make arch/i386/kernel/setup.s" both ways, and show what > > squash_the_stupid_serial_number() looks like. > > Both with "a" and "0" (zero) gcc shows same output in setup.s > (using cpuid_eax): Ok. Can you try to do the following in the cpuid_xxx() functions: - remove the dummy reads (ie leave just the one register in the asm that we're actually interested in) - add explicit clobbers for the other registers So, for example, cpuid_eax() would be roughly unsigned int eax; asm("cpuid" :"=a" (eax) :"0" (level) :"bx","cx","dx"); return eax; (and for the others, you can't really clobber "ax" because it's an input, so you'd have to leave that as just a "=a" (eax) and hope that gcc gets that output right. > lock ; btrl $18,12(%ebx) > #NO_APP > xorl %eax,%eax > #APP > cpuid <<<-- cpuid_eax > #NO_APP > movl %eax,8(%ebx) <<<-- oops Yes, the above is very definitely a bug in gcc. Oh, well.. > Even by using cpuid(...) replacement the registers are not > preserved but instead of ebx, the arg "c" is in ebp and the oops does not > occur: They don't have ot be preserved per se - gcc just knows that they are modified, and because gcc doesn't care about the value it just won't use it. > After fixing this with cpuid() I can work perfectly with this > kernel on network activity (45K packets/sec), SMP. So, only cpuid_xxx are > suspected for now. The current asms for cpuid_xxx() are correct. The fact that gcc generates bad code for this case implies that it might happen somewhere else too, but as egcs-2.91.66 is fairly well tested it may indeed be that this is the only case where that actually happens. > I don't preserve the oopses but I can do it if .. No, your generated assembly is clear enough evidence of what is going on. Thanks. If you could try the clobber approach, that would be good. Linus ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 17:21 ` Linus Torvalds @ 2001-07-18 20:43 ` Kai Germaschewski 2001-07-18 22:04 ` Linus Torvalds 1 sibling, 0 replies; 22+ messages in thread From: Kai Germaschewski @ 2001-07-18 20:43 UTC (permalink / raw) To: Linus Torvalds; +Cc: Julian Anastasov, linux-kernel On Wed, 18 Jul 2001, Linus Torvalds wrote: > Can you try to do the following in the cpuid_xxx() functions: > - remove the dummy reads (ie leave just the one register in the asm that > we're actually interested in) > - add explicit clobbers for the other registers I looked into this a little to improve my knowledge on inline asm. Anyway, I found one ugly work-around, i.e. using a makro instead of the inline function plus a local eax_in variable, but your idea seems way nicer and works as well. Generated code looks okay now (using kgcc aka egcs-2.91.66): 2002: 31 c0 xor %eax,%eax 2004: 0f a2 cpuid 2006: 89 46 08 mov %eax,0x8(%esi) 2009: 5b pop %ebx 200a: 5e pop %esi 200b: c3 ret Patch follows: --Kai diff -ur linux-2.4.7-pre7/include/asm-i386/processor.h linux-2.4.7-pre7.work/include/asm-i386/processor.h --- linux-2.4.7-pre7/include/asm-i386/processor.h Wed Jul 18 21:49:47 2001 +++ linux-2.4.7-pre7.work/include/asm-i386/processor.h Wed Jul 18 22:38:20 2001 @@ -134,38 +134,42 @@ */ extern inline unsigned int cpuid_eax(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int eax; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=a" (eax) + : "a" (op) + : "ebx", "ecx", "edx"); return eax; } extern inline unsigned int cpuid_ebx(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int ebx; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=b" (ebx) + : "a" (op) + : "eax", "ecx", "edx"); return ebx; } extern inline unsigned int cpuid_ecx(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int ecx; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=c" (ecx) + : "a" (op) + : "eax", "ebx", "edx"); return ecx; } extern inline unsigned int cpuid_edx(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int edx; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=d" (edx) + : "a" (op) + : "eax", "ebx", "ecx"); return edx; } ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 17:21 ` Linus Torvalds 2001-07-18 20:43 ` Kai Germaschewski @ 2001-07-18 22:04 ` Linus Torvalds 2001-07-18 22:25 ` Kai Germaschewski ` (3 more replies) 1 sibling, 4 replies; 22+ messages in thread From: Linus Torvalds @ 2001-07-18 22:04 UTC (permalink / raw) To: kai, linux-kernel, Julian Anastasov In article <Pine.LNX.4.33.0107182239050.1298-100000@vaio> you write: > >Generated code looks okay now (using kgcc aka egcs-2.91.66): > > 2002: 31 c0 xor %eax,%eax > 2004: 0f a2 cpuid > 2006: 89 46 08 mov %eax,0x8(%esi) > 2009: 5b pop %ebx > 200a: 5e pop %esi > 200b: c3 ret > >Patch follows: Can you verify with this alternate patch instead? Yours works ok on older gcc's, but the gcc team feels that clobbers must never cover inputs or outputs, so your patch really generates invalid asms. Here's a alternate, can you verify that it works for you guys, and perhaps people can at the same time eye-ball it for any other issues they can think of? Linus ---- --- pre7/linux/include/asm-i386/processor.h Wed Jul 18 09:34:03 2001 +++ linux/include/asm-i386/processor.h Wed Jul 18 14:58:45 2001 @@ -126,7 +126,7 @@ "=b" (*ebx), "=c" (*ecx), "=d" (*edx) - : "a" (op)); + : "0" (op)); } /* @@ -134,38 +134,42 @@ */ extern inline unsigned int cpuid_eax(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int eax; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=a" (eax) + : "0" (op) + : "bx", "cx", "dx"); return eax; } extern inline unsigned int cpuid_ebx(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int eax, ebx; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=a" (eax), "=b" (ebx) + : "0" (op) + : "cx", "dx" ); return ebx; } extern inline unsigned int cpuid_ecx(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int eax, ecx; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=a" (eax), "=c" (ecx) + : "0" (op) + : "bx", "dx" ); return ecx; } extern inline unsigned int cpuid_edx(unsigned int op) { - unsigned int eax, ebx, ecx, edx; + unsigned int eax, edx; __asm__("cpuid" - : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) - : "a" (op)); + : "=a" (eax), "=d" (edx) + : "0" (op) + : "bx", "cx"); return edx; } ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 22:04 ` Linus Torvalds @ 2001-07-18 22:25 ` Kai Germaschewski 2001-07-19 8:23 ` Julian Anastasov ` (2 subsequent siblings) 3 siblings, 0 replies; 22+ messages in thread From: Kai Germaschewski @ 2001-07-18 22:25 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel, Julian Anastasov On Wed, 18 Jul 2001, Linus Torvalds wrote: > Can you verify with this alternate patch instead? Yours works ok on > older gcc's, but the gcc team feels that clobbers must never cover > inputs or outputs, so your patch really generates invalid asms. Here's > a alternate, can you verify that it works for you guys, and perhaps > people can at the same time eye-ball it for any other issues they can > think of? Generated code looks good here. I checked the asm output for some instances of cpuid_e[acd]x() in setup.c, and generated asm looks right in all cases. --Kai ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 22:04 ` Linus Torvalds 2001-07-18 22:25 ` Kai Germaschewski @ 2001-07-19 8:23 ` Julian Anastasov 2001-07-19 18:36 ` H. Peter Anvin 2001-07-20 1:42 ` Julian Anastasov 2001-07-22 0:00 ` Richard Henderson 3 siblings, 1 reply; 22+ messages in thread From: Julian Anastasov @ 2001-07-19 8:23 UTC (permalink / raw) To: Linus Torvalds; +Cc: kai, linux-kernel Hello, On Wed, 18 Jul 2001, Linus Torvalds wrote: > Can you verify with this alternate patch instead? Yours works ok on > older gcc's, but the gcc team feels that clobbers must never cover > inputs or outputs, so your patch really generates invalid asms. Here's > a alternate, can you verify that it works for you guys, and perhaps > people can at the same time eye-ball it for any other issues they can > think of? This patch works for me too (I checked all cpuid_XXX calls). After some thinking I produced another patch. The interesting part is that __volatile__ solves the problem. Patch appended. I see in other kernel files that volatile solves gcc bugs. The question is whether the volatile is needed only as a work-around or it is needed in this case particulary, i.e. where the output registers are not used and are optimized. > Linus Regards -- Julian Anastasov <ja@ssi.bg> --- include/asm-i386/processor.h.orig1 Wed Jul 18 12:03:26 2001 +++ include/asm-i386/processor.h Thu Jul 19 10:58:06 2001 @@ -136,7 +136,7 @@ { unsigned int eax, ebx, ecx, edx; - __asm__("cpuid" + __asm__ __volatile__ ("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) : "a" (op)); return eax; @@ -145,7 +145,7 @@ { unsigned int eax, ebx, ecx, edx; - __asm__("cpuid" + __asm__ __volatile__ ("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) : "a" (op)); return ebx; @@ -154,7 +154,7 @@ { unsigned int eax, ebx, ecx, edx; - __asm__("cpuid" + __asm__ __volatile__ ("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) : "a" (op)); return ecx; @@ -163,7 +163,7 @@ { unsigned int eax, ebx, ecx, edx; - __asm__("cpuid" + __asm__ __volatile__ ("cpuid" : "=a" (eax), "=b" (ebx), "=c" (ecx), "=d" (edx) : "a" (op)); return edx; ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-19 8:23 ` Julian Anastasov @ 2001-07-19 18:36 ` H. Peter Anvin 0 siblings, 0 replies; 22+ messages in thread From: H. Peter Anvin @ 2001-07-19 18:36 UTC (permalink / raw) To: linux-kernel Followup to: <Pine.LNX.4.10.10107191113080.2341-100000@l> By author: Julian Anastasov <ja@ssi.bg> In newsgroup: linux.dev.kernel > > This patch works for me too (I checked all cpuid_XXX calls). > After some thinking I produced another patch. The interesting part is > that __volatile__ solves the problem. Patch appended. I see in other > kernel files that volatile solves gcc bugs. The question is whether > the volatile is needed only as a work-around or it is needed in this > case particulary, i.e. where the output registers are not used and are > optimized. > It certainly shouldn't; obviously, the assembly code is clearly declaring that it is outputting multiple things. "volatile" on an "asm" statement basically means "do this even if you don't need the output values" (i.e. don't assume you're doing this just for the computation), which is incorrect in this case (we *are* doing it just for the output values, not for any side effects), but it is not really surprising that it works around this bug. The problem seems to be that gcc 2.91.66 thinks it can optimize away half of an indivisible operation, which cannot be called anything but a bug. -hpa -- <hpa@transmeta.com> at work, <hpa@zytor.com> in private! "Unix gives you enough rope to shoot yourself in the foot." http://www.zytor.com/~hpa/puzzle.txt ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 22:04 ` Linus Torvalds 2001-07-18 22:25 ` Kai Germaschewski 2001-07-19 8:23 ` Julian Anastasov @ 2001-07-20 1:42 ` Julian Anastasov 2001-07-19 22:51 ` H. Peter Anvin 2001-07-19 22:55 ` Linus Torvalds 2001-07-22 0:00 ` Richard Henderson 3 siblings, 2 replies; 22+ messages in thread From: Julian Anastasov @ 2001-07-20 1:42 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Linus Torvalds, linux-kernel Hello, > > kernel files that volatile solves gcc bugs. The question is whether > > the volatile is needed only as a work-around or it is needed in this > > case particulary, i.e. where the output registers are not used and are > > optimized. > > > > It certainly shouldn't; obviously, the assembly code is clearly > declaring that it is outputting multiple things. "volatile" on an > "asm" statement basically means "do this even if you don't need the > output values" (i.e. don't assume you're doing this just for the > computation), which is incorrect in this case (we *are* doing it just > for the output values, not for any side effects), but it is not really > surprising that it works around this bug. I'm now learning the inline asm but I found some interesting notes on this/similar issue: http://gcc.gnu.org/fom_serv/cache/23.html In my distro (now with gcc 2.96) I have a gcc info with name "Extended Asm", "Assembler Instructions with C Expression Operands" with the following text: --- If an `asm' has output operands, GNU CC assumes for optimization purposes the instruction has no side effects except to change the output operands. This does not mean instructions with a side effect cannot be used, but you must be careful, because the compiler may eliminate them if the output operands aren't used, or move them out of loops, or replace two with one if they constitute a common subexpression. Also, if your instruction does have a side effect on a variable that otherwise appears not to change, the old value of the variable may be reused later if it happens to be found in a register. You can prevent an `asm' instruction from being deleted, moved significantly, or combined, by writing the keyword `volatile' after the `asm'. ------ follows example, etc. What I want to say (I could be wrong and that can't surprise me) is that the original cpuid_eax is in fact incorrect. All cpuid_XXX funcs use only dummy output operands and because they are not used, they are removed and not considered as changed. By using dummy vars we say "We don't care for these output values, use them in current scope only". As result, the cpuid instruction appears as not to change the 3 of the 4 registers. This explains why cpuid() behaves differently from the cpuid_XXX funcs: because the cpuid_XXX funcs declare eax ... edx as local vars, while in cpuid() they are not local and hence are not optimized. The two solutions looks valid: - to clobber the changed registers and hence not to use them as dummy output operands - already in 2.4.7pre8 - to specify volatile, which in some way avoids the default behavior to make optimizations and to remove assumptions about the used dummy output registers Of course, it is interesting why 2.95 behaves differently, may be it simply proves that I'm wrong or there is a bug/precaution introduced after 2.91.66. The key is in the "volatile" semantic. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-20 1:42 ` Julian Anastasov @ 2001-07-19 22:51 ` H. Peter Anvin 2001-07-20 2:01 ` Julian Anastasov 2001-07-19 22:55 ` Linus Torvalds 1 sibling, 1 reply; 22+ messages in thread From: H. Peter Anvin @ 2001-07-19 22:51 UTC (permalink / raw) To: Julian Anastasov; +Cc: H. Peter Anvin, Linus Torvalds, linux-kernel Julian Anastasov wrote: > > What I want to say (I could be wrong and that can't surprise me) is > that the original cpuid_eax is in fact incorrect. All cpuid_XXX funcs > use only dummy output operands... > Bullsh*t. One of the output operands is always a non-dummy (in cpuid_edx() edx is not a dummy, for example.) -hpa ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-19 22:51 ` H. Peter Anvin @ 2001-07-20 2:01 ` Julian Anastasov 2001-07-19 23:02 ` H. Peter Anvin 0 siblings, 1 reply; 22+ messages in thread From: Julian Anastasov @ 2001-07-20 2:01 UTC (permalink / raw) To: H. Peter Anvin; +Cc: H. Peter Anvin, Linus Torvalds, linux-kernel Hello, On Thu, 19 Jul 2001, H. Peter Anvin wrote: > Julian Anastasov wrote: > > > > What I want to say (I could be wrong and that can't surprise me) is > > that the original cpuid_eax is in fact incorrect. All cpuid_XXX funcs > > use only dummy output operands... > > > > Bullsh*t. One of the output operands is always a non-dummy (in > cpuid_edx() edx is not a dummy, for example.) Right, and it is may be not damaged. In my first posting I claim that cpuid_eax damages ebx (and may be ecx and edx). > -hpa Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-20 2:01 ` Julian Anastasov @ 2001-07-19 23:02 ` H. Peter Anvin 0 siblings, 0 replies; 22+ messages in thread From: H. Peter Anvin @ 2001-07-19 23:02 UTC (permalink / raw) To: Julian Anastasov; +Cc: H. Peter Anvin, Linus Torvalds, linux-kernel Julian Anastasov wrote: > > Hello, > > On Thu, 19 Jul 2001, H. Peter Anvin wrote: > > > Julian Anastasov wrote: > > > > > > What I want to say (I could be wrong and that can't surprise me) is > > > that the original cpuid_eax is in fact incorrect. All cpuid_XXX funcs > > > use only dummy output operands... > > > > > > > Bullsh*t. One of the output operands is always a non-dummy (in > > cpuid_edx() edx is not a dummy, for example.) > > Right, and it is may be not damaged. In my first posting I > claim that cpuid_eax damages ebx (and may be ecx and edx). > Doesn't matter. gcc can't pick and choose what *effects* of an asm() statement it wants to happen -- this should be utterly obvious to anyone. As the old saying goes, you can't be half pregnant. -hpa ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-20 1:42 ` Julian Anastasov 2001-07-19 22:51 ` H. Peter Anvin @ 2001-07-19 22:55 ` Linus Torvalds 2001-07-20 2:19 ` Julian Anastasov 1 sibling, 1 reply; 22+ messages in thread From: Linus Torvalds @ 2001-07-19 22:55 UTC (permalink / raw) To: Julian Anastasov; +Cc: H. Peter Anvin, linux-kernel On Fri, 20 Jul 2001, Julian Anastasov wrote: > > In my distro (now with gcc 2.96) I have a gcc info with name "Extended > Asm", "Assembler Instructions with C Expression Operands" with the > following text: [ yes ] > What I want to say (I could be wrong and that can't surprise me) is > that the original cpuid_eax is in fact incorrect. No. It's correct, because cpuid doesn't have any side effects (*), so we don't need to mark it volatile. gcc is free to remove it if nothing uses the outputs, for example. But gcc cannot (and generally does not) ignore outputs that _are_ specified. Now, adding the "volatile" doesn't really make things worse, and it will make gcc even more anal about optimizations than it normally is, which is probably why that also hides the gcc bug. Note that gcc having bus in the inline asm handling is nothing new. We've had that before, and I'm sure we'll have it again. Not very many people use them: the kernel tends to be the heaviest user of them (with libc probably a good second). Which is why bugs here often take time to get fixed. It doesn't help that the documentation has been quite bad, even misleading, at times. Linus (*) cpuid has the side effect of being a "synchronizing instruction", and as such you can use it for some SMP ordering things etc, but as it's one of the slowest such instructions nobody is really ever interested in using it that way, and it doesn't have any other "architecturally visible" effects that the compiler could care about. ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-19 22:55 ` Linus Torvalds @ 2001-07-20 2:19 ` Julian Anastasov 2001-07-19 23:23 ` H. Peter Anvin 2001-07-19 23:24 ` Linus Torvalds 0 siblings, 2 replies; 22+ messages in thread From: Julian Anastasov @ 2001-07-20 2:19 UTC (permalink / raw) To: Linus Torvalds; +Cc: H. Peter Anvin, linux-kernel Hello, On Thu, 19 Jul 2001, Linus Torvalds wrote: > No. It's correct, because cpuid doesn't have any side effects (*), so we > don't need to mark it volatile. gcc is free to remove it if nothing uses > the outputs, for example. But gcc cannot (and generally does not) ignore > outputs that _are_ specified. My understanding was that eax, ... edx are declared as local vars and so their values can't be used out of the current function scope, even when they are defined in inline func. So, I assume they can be optimized (the fact is that they are not used) and may be gcc forgets them. True, may be the docs do not cover such situations but my first thought was not to explain everything with bugs. > Now, adding the "volatile" doesn't really make things worse, and it will > make gcc even more anal about optimizations than it normally is, which is > probably why that also hides the gcc bug. > > Note that gcc having bus in the inline asm handling is nothing new. We've > had that before, and I'm sure we'll have it again. Not very many people > use them: the kernel tends to be the heaviest user of them (with libc > probably a good second). Which is why bugs here often take time to get > fixed. It doesn't help that the documentation has been quite bad, even > misleading, at times. Agreed. It seems I have to read more docs ... > Linus > > (*) cpuid has the side effect of being a "synchronizing instruction", and > as such you can use it for some SMP ordering things etc, but as it's one > of the slowest such instructions nobody is really ever interested in using > it that way, and it doesn't have any other "architecturally visible" > effects that the compiler could care about. Regards -- Julian Anastasov <ja@ssi.bg> ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-20 2:19 ` Julian Anastasov @ 2001-07-19 23:23 ` H. Peter Anvin 2001-07-19 23:24 ` Linus Torvalds 1 sibling, 0 replies; 22+ messages in thread From: H. Peter Anvin @ 2001-07-19 23:23 UTC (permalink / raw) To: Julian Anastasov; +Cc: Linus Torvalds, H. Peter Anvin, linux-kernel Julian Anastasov wrote: > > Hello, > > On Thu, 19 Jul 2001, Linus Torvalds wrote: > > > No. It's correct, because cpuid doesn't have any side effects (*), so we > > don't need to mark it volatile. gcc is free to remove it if nothing uses > > the outputs, for example. But gcc cannot (and generally does not) ignore > > outputs that _are_ specified. > > My understanding was that eax, ... edx are declared as > local vars and so their values can't be used out of the current > function scope, even when they are defined in inline func. So, I > assume they can be optimized (the fact is that they are not used) > and may be gcc forgets them. True, may be the docs do not cover > such situations but my first thought was not to explain everything > with bugs. > Well, your first thought was wrong. It is a bug. Sorry. However, your argument basically explains why adding "volatile" does work -- it keeps gcc from thinking that it can optimize away something that it otherwise couldn't. However, it's still a bug. -hpa ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-20 2:19 ` Julian Anastasov 2001-07-19 23:23 ` H. Peter Anvin @ 2001-07-19 23:24 ` Linus Torvalds 1 sibling, 0 replies; 22+ messages in thread From: Linus Torvalds @ 2001-07-19 23:24 UTC (permalink / raw) To: Julian Anastasov; +Cc: H. Peter Anvin, linux-kernel On Fri, 20 Jul 2001, Julian Anastasov wrote: > > My understanding was that eax, ... edx are declared as > local vars and so their values can't be used out of the current > function scope, even when they are defined in inline func. Yes, but notice how we return a value. And the only way to get that value is to execute the cpuid. So obviously gcc can't drop the cpuid. And if it cannot drop it, it cannot ignore the fact that cpuid changes all the registers we say it changes. Linus ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 22:04 ` Linus Torvalds ` (2 preceding siblings ...) 2001-07-20 1:42 ` Julian Anastasov @ 2001-07-22 0:00 ` Richard Henderson 2001-07-22 4:27 ` H. Peter Anvin 3 siblings, 1 reply; 22+ messages in thread From: Richard Henderson @ 2001-07-22 0:00 UTC (permalink / raw) To: Linus Torvalds; +Cc: kai, linux-kernel, Julian Anastasov On Wed, Jul 18, 2001 at 03:04:20PM -0700, Linus Torvalds wrote: > Can you verify with this alternate patch instead? I take it you've found something that happens to work with egcs 1.1? At a glance the bug appears to be the one that caused gcc/testsuite/gcc.dg/clobbers.c to be written. That one is a fundamental flaw in reload that caused it to be largely rewritten for gcc 2.95. In other words, you may not be able to find a workaround for egcs 1.1 that works for all cases. Using an alternative that writes all of eax/ebx/ecx/edx to memory is probably safer if none of the uses of cpuid are performance-critical. r~ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-22 0:00 ` Richard Henderson @ 2001-07-22 4:27 ` H. Peter Anvin 0 siblings, 0 replies; 22+ messages in thread From: H. Peter Anvin @ 2001-07-22 4:27 UTC (permalink / raw) To: linux-kernel Followup to: <20010721170018.C3676@twiddle.net> By author: Richard Henderson <rth@twiddle.net> In newsgroup: linux.dev.kernel > > On Wed, Jul 18, 2001 at 03:04:20PM -0700, Linus Torvalds wrote: > > Can you verify with this alternate patch instead? > > I take it you've found something that happens to work with egcs 1.1? > > At a glance the bug appears to be the one that caused > gcc/testsuite/gcc.dg/clobbers.c to be written. That one > is a fundamental flaw in reload that caused it to be > largely rewritten for gcc 2.95. > > In other words, you may not be able to find a workaround > for egcs 1.1 that works for all cases. Using an alternative > that writes all of eax/ebx/ecx/edx to memory is probably > safer if none of the uses of cpuid are performance-critical. > Either making the asms "volatile" seems to work, as does leaving the unused arguments as clobbers rather than output operands. If either doesn't work with later versions, then later versions would have their own bugs. -hpa -- <hpa@transmeta.com> at work, <hpa@zytor.com> in private! "Unix gives you enough rope to shoot yourself in the foot." http://www.zytor.com/~hpa/puzzle.txt ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 15:10 ` Linus Torvalds 2001-07-18 17:08 ` Julian Anastasov @ 2001-07-21 23:45 ` Richard Henderson 1 sibling, 0 replies; 22+ messages in thread From: Richard Henderson @ 2001-07-21 23:45 UTC (permalink / raw) To: Linus Torvalds; +Cc: ja, linux-kernel On Wed, Jul 18, 2001 at 08:10:22AM -0700, Linus Torvalds wrote: > - fix _all_ the "cpuid*()" functions to have > > :"0" (op) > > instead of their current incorrect > > :"a" (op) > > (we're supposed to explicitly tell the compiler that the first input > is the same as the first output) FWIW, using "a" as the input constraint isn't incorrect. The two are equivalent given a singleton register class. r~ ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) @ 2001-07-18 20:30 Rick Hohensee 2001-07-19 5:15 ` Keith Owens 0 siblings, 1 reply; 22+ messages in thread From: Rick Hohensee @ 2001-07-18 20:30 UTC (permalink / raw) To: linux-kernel >We should try to find a work-around. >> > - do a "make arch/i386/kernel/setup.s" both ways, and show what Is there a way to do a "make bla/zay/woof.c " and save the woof.s ? Rick Hohensee www.clienux.com ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: cpuid_eax damages registers (2.4.7pre7) 2001-07-18 20:30 Rick Hohensee @ 2001-07-19 5:15 ` Keith Owens 0 siblings, 0 replies; 22+ messages in thread From: Keith Owens @ 2001-07-19 5:15 UTC (permalink / raw) To: Rick Hohensee; +Cc: linux-kernel On Wed, 18 Jul 2001 16:30:55 -0400 (EDT), Rick Hohensee <humbubba@smarty.smart.net> wrote: >>We should try to find a work-around. > >>> > - do a "make arch/i386/kernel/setup.s" both ways, and show what > >Is there a way to do a "make bla/zay/woof.c " and save the woof.s ? rm bla/zay/woof.o make CFLAGS_woof.o=-save-temps ^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2001-07-22 4:28 UTC | newest] Thread overview: 22+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2001-07-18 10:48 cpuid_eax damages registers (2.4.7pre7) Julian Anastasov 2001-07-18 15:10 ` Linus Torvalds 2001-07-18 17:08 ` Julian Anastasov 2001-07-18 17:21 ` Linus Torvalds 2001-07-18 20:43 ` Kai Germaschewski 2001-07-18 22:04 ` Linus Torvalds 2001-07-18 22:25 ` Kai Germaschewski 2001-07-19 8:23 ` Julian Anastasov 2001-07-19 18:36 ` H. Peter Anvin 2001-07-20 1:42 ` Julian Anastasov 2001-07-19 22:51 ` H. Peter Anvin 2001-07-20 2:01 ` Julian Anastasov 2001-07-19 23:02 ` H. Peter Anvin 2001-07-19 22:55 ` Linus Torvalds 2001-07-20 2:19 ` Julian Anastasov 2001-07-19 23:23 ` H. Peter Anvin 2001-07-19 23:24 ` Linus Torvalds 2001-07-22 0:00 ` Richard Henderson 2001-07-22 4:27 ` H. Peter Anvin 2001-07-21 23:45 ` Richard Henderson -- strict thread matches above, loose matches on Subject: below -- 2001-07-18 20:30 Rick Hohensee 2001-07-19 5:15 ` Keith Owens
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox