* VIA Ezra CentaurHauls
@ 2003-06-18 14:18 Guennadi Liakhovetski
2003-06-18 14:42 ` P
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Guennadi Liakhovetski @ 2003-06-18 14:18 UTC (permalink / raw)
To: linux-kernel, debian-glibc
Hello
We have a platform with the above processor, and we happened to have 2
revisions thereof: stepping 8 and 10. With stepping 8 we are getting
"random" application crashes (segfaults), sometimes with kernel-Oopses.
The distribution is Debian-Woody. I saw some messages on the Debian
mailing list about problems with exactly this CPU, however, it was not
related to different revisions (stepping), perhaps, the author only had
/ tried stepping 8. The fix was to upgrade libc. I've done this (to
version libc6_2.3.1-16), but it didn't help. Any ideas?
Thanks
Guennadi
---------------------------------
Guennadi Liakhovetski, Ph.D.
DSA Daten- und Systemtechnik GmbH
Pascalstr. 28
D-52076 Aachen
Germany
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: VIA Ezra CentaurHauls
2003-06-18 14:18 VIA Ezra CentaurHauls Guennadi Liakhovetski
@ 2003-06-18 14:42 ` P
2003-06-18 16:15 ` Guennadi Liakhovetski
2003-06-18 17:17 ` Guennadi Liakhovetski
2003-06-20 11:11 ` Daniel Egger
2003-06-27 6:18 ` Alex Belits
2 siblings, 2 replies; 8+ messages in thread
From: P @ 2003-06-18 14:42 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: linux-kernel, debian-glibc
Guennadi Liakhovetski wrote:
> Hello
>
> We have a platform with the above processor, and we happened to have 2
> revisions thereof: stepping 8 and 10. With stepping 8 we are getting
> "random" application crashes (segfaults), sometimes with kernel-Oopses.
> The distribution is Debian-Woody.
Interesting, so stepping 10 is OK?
> I saw some messages on the Debian
> mailing list about problems with exactly this CPU, however, it was not
> related to different revisions (stepping), perhaps, the author only had
> / tried stepping 8. The fix was to upgrade libc.
so is it a glibc bug or CPU bug?
> I've done this (to
> version libc6_2.3.1-16), but it didn't help. Any ideas?
You could search for CMOV instructions on your system,
which could cause weirdness, like:
find / -perm +111 -type f |
while read bin; do
    objdump --disassemble $bin 2>/dev/null |
    grep -q cmov && echo "$bin has cmov"
done
Note C3 Nehemiah do have CMOV (but no 3dnow).
Pádraig.
^ permalink raw reply [flat|nested] 8+ messages in thread
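The scan suggested above can be sketched in a slightly more defensive form. This is only an illustration under a few assumptions: the function name scan_cmov is made up here, objdump from binutils is assumed to be installed, and the permission test is written in plain octal because, as far as I know, newer GNU find releases no longer accept "-perm +111". Note that a plain text match on "cmov" also hits the x87 fcmov* instructions.

```shell
#!/bin/sh
# scan_cmov DIR  (hypothetical helper, not part of any package)
# Print "<file> has cmov" for every executable regular file under DIR
# whose disassembly contains the string "cmov".
scan_cmov() {
    dir=${1:-/}
    # -perm -100/-010/-001: owner, group or other execute bit is set
    find "$dir" -type f \( -perm -100 -o -perm -010 -o -perm -001 \) -print |
    while IFS= read -r bin; do
        # objdump errors (scripts, non-ELF files) are silently skipped
        if objdump --disassemble "$bin" 2>/dev/null | grep -q cmov; then
            printf '%s has cmov\n' "$bin"
        fi
    done
}
```

Usage would be along the lines of "scan_cmov /usr/lib". A hit only says that the instruction occurs somewhere in the file, not that it is ever executed.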
* Re: VIA Ezra CentaurHauls
2003-06-18 14:42 ` P
@ 2003-06-18 16:15 ` Guennadi Liakhovetski
2003-06-18 17:17 ` Guennadi Liakhovetski
1 sibling, 0 replies; 8+ messages in thread
From: Guennadi Liakhovetski @ 2003-06-18 16:15 UTC (permalink / raw)
To: P; +Cc: linux-kernel, debian-glibc
On Wed, 18 Jun 2003 P@draigBrady.com wrote:
> Guennadi Liakhovetski wrote:
> > We have a platform with the above processor, and we happened to have 2
> > revisions thereof: stepping 8 and 10. With stepping 8 we are getting
> > "random" application crashes (segfaults), sometimes with kernel-Oopses.
> > The distribution is Debian-Woody.
>
> Interesting, so stepping 10 is OK?
Looks so.
> > I saw some messages on the Debian
> > mailing list about problems with exactly this CPU, however, it was not
> > related to different revisions (stepping), perhaps, the author only had
> > / tried stepping 8. The fix was to upgrade libc.
>
> so is it a glibc bug or CPU bug?
Good question...
> > I've done this (to version libc6_2.3.1-16), but it didn't help. Any ideas?
>
> You could search for CMOV instructions on your system,
> which could cause weirdness, like:
>
> find / -perm +111 -type f |
> while read bin; do
>     objdump --disassemble $bin 2>/dev/null |
>     grep -q cmov && echo "$bin has cmov"
> done
Yeah, will try. Plus libraries...
> Note C3 Nehemiah do have CMOV (but no 3dnow).
Meanwhile, I've written a micro-program with an assembly-inline with
cmov. I have no idea about the ix86 assembly, so, I've just done
#include <stdlib.h>	/* for exit() */
int main(void)
{
    int x=0,y=1;
    __asm__(
        "testl %0, %0\n"
        "	cmovnz %0, %1":"=r" (x) :"r" (y));
    exit(x);
}
On "10" the exit code is 1, which is correct (?), on "8" the exit code is
76. Funny enough, strace on "8" produces also
semget(2, 1074927648, 0) = -1 ENOSYS (Function not implemented)
but this, most probably, comes from the new libc6, that I installed there.
Guennadi
---------------------------------
Guennadi Liakhovetski, Ph.D.
DSA Daten- und Systemtechnik GmbH
Pascalstr. 28
D-52076 Aachen
Germany
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: VIA Ezra CentaurHauls
2003-06-18 14:42 ` P
2003-06-18 16:15 ` Guennadi Liakhovetski
@ 2003-06-18 17:17 ` Guennadi Liakhovetski
2003-06-20 16:57 ` GOTO Masanori
1 sibling, 1 reply; 8+ messages in thread
From: Guennadi Liakhovetski @ 2003-06-18 17:17 UTC (permalink / raw)
To: P; +Cc: linux-kernel, debian-glibc
> find / -perm +111 -type f |
> while read bin; do
>     objdump --disassemble $bin 2>/dev/null |
>     grep -q cmov && echo "$bin has cmov"
> done
So, using the above for libraries I found 3 libraries on the system, that
use cmov:
libldap.so.2.0.15
libcrypto.so.0.9.6
libqt-mt.so.3.0.5
So, the libraries have nothing to do with the kernel, the Debian guys
might take a notice of them (not glibc, but still...). But what I do find
interesting and noteworthy - is that this problem is specific only to some
revisions of this CPU, which might be of interest to all.
Thanks for the tip with the script!
Guennadi
---------------------------------
Guennadi Liakhovetski, Ph.D.
DSA Daten- und Systemtechnik GmbH
Pascalstr. 28
D-52076 Aachen
Germany
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: VIA Ezra CentaurHauls
2003-06-18 17:17 ` Guennadi Liakhovetski
@ 2003-06-20 16:57 ` GOTO Masanori
0 siblings, 0 replies; 8+ messages in thread
From: GOTO Masanori @ 2003-06-20 16:57 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: P, linux-kernel, debian-glibc
At Wed, 18 Jun 2003 19:17:11 +0200 (CEST),
Guennadi Liakhovetski wrote:
> So, using the above for libraries I found 3 libraries on the system, that
> use cmov:
>
> libldap.so.2.0.15
> libcrypto.so.0.9.6
> libqt-mt.so.3.0.5
>
> So, the libraries have nothing to do with the kernel, the Debian guys
> might take a notice of them (not glibc, but still...). But what I do find
> interesting and noteworthy - is that this problem is specific only to some
> revisions of this CPU, which might be of interest to all.
I don't think it's a Debian glibc problem. If you hit a cmov problem,
your application says "illegal instruction". At least Debian glibc
2.3.1-16 has a trick to avoid loading dynamic libraries that contain cmov.
All the libraries you pointed out are put under /.../lib/.../cmov/* in
Debian sid.
I guess it's a CPU or thermal issue.
Regards,
-- gotom
^ permalink raw reply [flat|nested] 8+ messages in thread
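The distinction drawn above (an unsupported instruction gives "illegal instruction", not a segfault) can be checked from the shell. In most shells a process killed by a signal is reported with exit status 128 plus the signal number, so SIGILL shows up as 132 and SIGSEGV as 139 on Linux. A minimal sketch, where the function name classify_exit is made up for illustration:

```shell
#!/bin/sh
# classify_exit STATUS  (hypothetical helper)
# Say whether a shell exit status means "killed by a signal" or a
# normal exit. Assumes the common 128 + signal number convention.
classify_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l NUMBER prints the signal name without the SIG prefix
        printf 'killed by SIG%s\n' "$(kill -l "$sig")"
    else
        printf 'exited normally with status %s\n' "$status"
    fi
}
```

Run a program, then call classify_exit $? right after it. By this convention the exit code 76 reported earlier in the thread is an ordinary exit() value, not a death by SIGILL.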
* Re: VIA Ezra CentaurHauls
2003-06-18 14:18 VIA Ezra CentaurHauls Guennadi Liakhovetski
2003-06-18 14:42 ` P
@ 2003-06-20 11:11 ` Daniel Egger
2003-06-27 6:18 ` Alex Belits
2 siblings, 0 replies; 8+ messages in thread
From: Daniel Egger @ 2003-06-20 11:11 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 397 bytes --]
On Wed, 2003-06-18 at 16:18, Guennadi Liakhovetski wrote:
> / tried stepping 8. The fix was to upgrade libc. I've done this (to
> version libc6_2.3.1-16), but it didn't help. Any ideas?
IIRC there were some versions of glibc in Debian which activated the 686
and higher optimized versions for the cmov-less Ezra. A workaround is to
(re)move /usr/lib/686.
--
Servus,
Daniel
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: VIA Ezra CentaurHauls
2003-06-18 14:18 VIA Ezra CentaurHauls Guennadi Liakhovetski
2003-06-18 14:42 ` P
2003-06-20 11:11 ` Daniel Egger
@ 2003-06-27 6:18 ` Alex Belits
2 siblings, 0 replies; 8+ messages in thread
From: Alex Belits @ 2003-06-27 6:18 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: linux-kernel, debian-glibc
On Wed, 18 Jun 2003, Guennadi Liakhovetski wrote:
> Hello
>
> We have a platform with the above processor, and we happened to have 2
> revisions thereof: stepping 8 and 10. With stepping 8 we are getting
> "random" application crashes (segfaults), sometimes with kernel-Oopses.
> The distribution is Debian-Woody. I saw some messages on the Debian
> mailing list about problems with exactly this CPU, however, it was not
> related to different revisions (stepping), perhaps, the author only had
> / tried stepping 8. The fix was to upgrade libc. I've done this (to
> version libc6_2.3.1-16), but it didn't help. Any ideas?
I have two EPIA 800 motherboards with different CPUs:
1. The board has "Revision B" printed on it. CPU is:
    processor       : 0
    vendor_id       : CentaurHauls
    cpu family      : 6
    model           : 7
    model name      : VIA Ezra
    stepping        : 8
    cpu MHz         : 800.047
    cache size      : 64 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 1
    wp              : yes
    flags           : fpu de tsc msr cx8 mtrr pge mmx 3dnow
    bogomips        : 1595.80
2. The board is "Revision D", but otherwise looks exactly the same. CPU is:
    processor       : 0
    vendor_id       : CentaurHauls
    cpu family      : 6
    model           : 7
    model name      : VIA Samuel 2
    stepping        : 3
    cpu MHz         : 800.047
    cache size      : 64 KB
    fdiv_bug        : no
    hlt_bug         : no
    f00f_bug        : no
    coma_bug        : no
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 1
    wp              : yes
    flags           : fpu de tsc msr cx8 mtrr pge mmx 3dnow
    bogomips        : 1595.80
Both boards were used in exactly the same environment, as a replacement
for a failed FV24 motherboard in an SV24 box that had Debian Woody already
installed. Obviously, to make it boot I had to recompile the kernel, and I
had to recompile mplayer, which I had previously built for i686
(everything was done with gcc 2.95.4).
First board worked perfectly until I have started (as a regular user)
RealPlayer 8. After that the box became unstable, and other applications
(mozilla, mplayer, gcc) started to crash randomly with SEGV. However I
have not seen a kernel crash.
I have tried different X drivers (trident from 4.3.0, trident from
current, vesa), different memory, one or two sticks, different power
supplies (including a known-good ATX power supply just in case), different
CMOS settings (including "safe" default, all caches off, etc.), and the
result was the same -- no problems without RealPlayer, application crashes
after it started. TRplayer, which uses RealPlayer's libraries, has the same
effect, however mplayer (tested only with non-Realmedia sources) worked
perfectly, and even showed impressive performance by playing SVCD in
vidix mode with no dropped frames (as long as I was not doing anything
else at the same time).
Once I had started RealPlayer, programs started to crash randomly, and
whatever the problem was, it was not confined to the userid that
RealPlayer was running as -- mplayer and gcc were running as root when
they crashed. Usually things crashed with a segmentation fault, however
once RealPlayer crashed with a floating point exception.
Puzzled, I ran memtest86, and all memory that ever was in that box
passed all basic tests (I had no patience for anything more than that).
When I installed the second motherboard (obviously, with no other
modifications), all problems disappeared.
I have also checked the RealPlayer binaries, and objdump showed no cmov.
I don't know what exactly happens, but it looks very strange to me that
a single piece of code causes all this havoc, and that over all that time
apparently no SIGILLs happened. I can only speculate that "something"
leaves some piece of state in the CPU (or maybe in cache) that survives a
context switch, and messes up the state of other processes (registers or
maybe memory). And whatever it is, "VIA Samuel 2, stepping 3" does not
have it.
--
Alex
^ permalink raw reply [flat|nested] 8+ messages in thread
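When comparing boards as in the message above, the three fields that matter for this thread (model name, stepping, and whether the flags line advertises cmov) can be pulled out of /proc/cpuinfo in one go. A rough sketch; the function name cpu_summary is made up for illustration, and it assumes the x86 field names shown in the listings above:

```shell
#!/bin/sh
# cpu_summary  (hypothetical helper)
# Read /proc/cpuinfo-style text on stdin and print one line per
# processor: "<model name>, stepping <n>, cmov=yes|no".
cpu_summary() {
    awk -F '[ \t]*:[ \t]*' '
        $1 == "model name" { model = $2 }
        $1 == "stepping"   { stepping = $2 }
        $1 == "flags"      {
            # pad with spaces so that only the whole word "cmov" matches
            cmov = ((" " $2 " ") ~ / cmov /) ? "yes" : "no"
            printf "%s, stepping %s, cmov=%s\n", model, stepping, cmov
        }
    '
}
```

On a live system this would be run as "cpu_summary < /proc/cpuinfo". Fed the first listing above, it reports the Ezra stepping 8 with cmov=no, which fits the observation that no SIGILLs were seen only if the crashing code really contains no cmov.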
* Re: VIA Ezra CentaurHauls
@ 2003-06-27 9:18 Miklos Szeredi
0 siblings, 0 replies; 8+ messages in thread
From: Miklos Szeredi @ 2003-06-27 9:18 UTC (permalink / raw)
To: Alex Belits; +Cc: linux-kernel
> First board worked perfectly until I have started (as a regular user)
> RealPlayer 8. After that the box became unstable, and other applications
> (mozilla, mplayer, gcc) started to crash randomly with SEGV. However I
> have not seen a kernel crash.
I had very similar experiences with the same CPU: VIA Ezra Stepping 8
(see: http://marc.theaimsgroup.com/?t=104262312700003&r=1&w=2). And
that was not with an EPIA, but an ASUS CUV4X-C MB.
I had the processor replaced, because I narrowed the problem down to
that, but it didn't help.
My feeling is ever stronger as I see these posts, that it is really
this model that is buggy. If that is true, then VIA should either
replace these CPUs with a non-buggy one, or find a workaround for
whatever operating systems are affected.
BTW, I could reliably cure this problem by turning off the L2 cache in
BIOS. Maybe it is some memory interaction problem, but I'm not an
expert on this subject.
Miklos