* U1/U2 failures with kernel 2.6.<anything> --- maybe a clue?
@ 2006-02-26  3:14 Ferris McCormick
  2006-02-26  4:23 ` [gentoo-sparc] U1/U2 failures with kernel 2.6.<anything> --- gentuxx
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Ferris McCormick @ 2006-02-26  3:14 UTC (permalink / raw)
  To: sparclinux

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

OK, I've been thinking about this, and here is what we have.
(1) Some U1/U2 systems do very well on these kernels;
(2) Some are unusable:  I have one which, on 2.6.xx, has a mean time between 
(very hard lock) failures of about a day; on kernel-2.4.32, it never 
fails (literally).
(3) Weeve and (I believe) squash are as in point 2.

Now, I am not imagining things: a system which responds to nothing at all 
is hard to make up.

Further, my unusable-with-2.6 system is 2x400; stable ones are I think a 
bit slower.

Here's the clue:  I tried cdrecord (which works perfectly on 2.4.xx) on 
the 2x400 system with 2.6.15-rc4.  It wrote the disc.  Then it tried to 
fixate it.  That killed it within about 1 second.  I *think* fixating is one long 
system call (I haven't read cdrecord yet), and scsi disk activity I know 
is the general killer.  So maybe looking at cdrecord's fixating system 
activity can tell where the problem is.  (I do know cdrecord on this
system with 2.6.xx has a 100% failure rate, based on several attempts.)

Thoughts, Comments?
(By the way, I regret my rash remarks from earlier.)
Regards,
Ferris
- --
Ferris McCormick (P44646, MI) <fmccor@gentoo.org>
Developer, Gentoo Linux (Devrel, Sparc)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (GNU/Linux)

iD8DBQFEAR06Qa6M3+I///cRAmq7AJ9SCiBS/sXieWdWF/Xu6nBMxIplngCdF2/3
Ku3jz0TMhUjvbnbT1md+Y/0=tMFQ
-----END PGP SIGNATURE-----
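
One way to follow up on the idea of looking at cdrecord's system activity
during fixating is to run it under strace with per-call timing (-T) and
then see which call was slowest and which call, if any, never returned.
The sketch below is only an illustration under stated assumptions: the
trace is from a single process, the log is written somewhere that survives
the lockup (for example, captured over a serial line or a network mount),
and the helper name summarize_trace and the SAMPLE lines are hypothetical,
not output from the failing machine.

```python
import re

# A completed call as printed by `strace -T`, for example:
#   ioctl(3, SG_IO, {...}) = 0 <0.004211>
COMPLETED = re.compile(
    r'^(?P<name>\w+)\(.*\)\s+=\s+.*<(?P<secs>\d+\.\d+)>$'
)
# Any line that at least starts a call: name followed by "(".
CALL_START = re.compile(r'^(?P<name>\w+)\(')


def summarize_trace(lines):
    """Return (slowest, pending) for a single-process strace -T log.

    slowest is (syscall_name, seconds) for the longest completed call,
    or None if no call completed.
    pending is the name of a trailing call that has no result, which is
    what a trace cut short by a hard lock would be expected to end with,
    or None if the last call completed.
    """
    slowest = None
    pending = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        done = COMPLETED.match(line)
        if done:
            secs = float(done.group('secs'))
            if slowest is None or secs > slowest[1]:
                slowest = (done.group('name'), secs)
            pending = None  # the most recent call returned
            continue
        started = CALL_START.match(line)
        if started:
            pending = started.group('name')  # entered, no result seen
    return slowest, pending


# Made-up lines for illustration only; not real output from any system.
SAMPLE = [
    'open("/dev/sg0", O_RDWR) = 3 <0.000041>',
    'ioctl(3, SG_IO, {...}) = 0 <0.250000>',
    'ioctl(3, SG_IO, {...}',
]

if __name__ == '__main__':
    print(summarize_trace(SAMPLE))
```

If the last line of a real trace is a call with no result, that call is
the first place to look; whether fixating really is one long system call
remains unverified here.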

* Re: [gentoo-sparc] U1/U2 failures with kernel 2.6.<anything> ---
  2006-02-26  3:14 U1/U2 failures with kernel 2.6.<anything> --- maybe a clue? Ferris McCormick
@ 2006-02-26  4:23 ` gentuxx
  2006-02-26 15:18 ` U1/U2 failures with kernel 2.6.<anything> --- maybe a clue? Mark Fortescue
  2006-02-27  0:56 ` Jason Wever
  2 siblings, 0 replies; 4+ messages in thread
From: gentuxx @ 2006-02-26  4:23 UTC (permalink / raw)
  To: sparclinux

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ferris McCormick wrote:

> OK, I've been thinking about this, and here is what we have.
> (1) Some U1/U2 systems do very well on these kernels;
> (2) Some are unusable: I have one which on 2.6.xx, has mean time
> between (very hard lock) failure of about a day, on kernel-2.4.32,
> it's never (literally).
> (3) Weeve and (I believe) squash are as in point 2.
>
> Now, I am not imagining things: a system which responds to nothing
> at all is hard to make up.
>
> Further, my unusable-with-2.6 system is 2x400; stable ones are I
> think a bit slower.


This could be.  I currently have 3 U1s running 2.6.15-r5 perfectly
fine.  I'm still sort of in the process of building them out (just
adding packages I need and such), so they have been compiling (with
distcc) fine for about 3 days straight.  Two are 200MHz UltraSPARC-I
systems and one is a 166MHz U1.

They each only have one processor, but SMP is enabled.

>
> Here's the clue: I tried the 2x400 system with a cdrecord, (which
> works perfectly on 2.4.xx) with 2.6.15-rc4. It wrote the disk.
> Then it tried to fixate it. That killed it within about 1 second. I
> *think* fixating is one long system call (I haven't read cdrecord
> yet), and scsi disk activity I know is the general killer. So maybe
> looking at cdrecord's fixating system activity can tell where the
> problem is. (I do know cdrecord on this
> system with 2.6.xx has a 100% failure rate, based on several attempts.)
>
> Thoughts, Comments?
> (By the way, I regret my rash remarks from earlier.)
> Regards,
> Ferris
> --
> Ferris McCormick (P44646, MI) <fmccor@gentoo.org>
> Developer, Gentoo Linux (Devrel, Sparc)


- --
gentux
echo "hfouvyAdpy/ofu" | perl -pe 's/(.)/chr(ord($1)-1)/ge'

gentux's gpg fingerprint => 34CE 2E97 40C7 EF6E EC40  9795 2D81 924A
6996 0993
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (GNU/Linux)

iD8DBQFEAS04LYGSSmmWCZMRAmqtAKC1RoYtSKkJywwjC4j49pqQmwgr9wCeLE7A
rX6mbX/BI8o7YTdfsV5Nvtk=j9Yv
-----END PGP SIGNATURE-----


* Re: U1/U2 failures with kernel 2.6.<anything> --- maybe a clue?
  2006-02-26  3:14 U1/U2 failures with kernel 2.6.<anything> --- maybe a clue? Ferris McCormick
  2006-02-26  4:23 ` [gentoo-sparc] U1/U2 failures with kernel 2.6.<anything> --- gentuxx
@ 2006-02-26 15:18 ` Mark Fortescue
  2006-02-27  0:56 ` Jason Wever
  2 siblings, 0 replies; 4+ messages in thread
From: Mark Fortescue @ 2006-02-26 15:18 UTC (permalink / raw)
  To: sparclinux

Hi,

There have been a number of changes from V2.2.xx to V2.6.xx in the SCSI
layers that break things.

I tried to use the AHA1542 driver (PC ISA) with a 2.6.10 kernel and it
failed miserably. I have not tried a more recent one yet, but comparing
the code of a working 2.2.26 kernel with the 2.6.10 kernel indicated
that a number of things relating to SCSI bus timeouts have been relocated
and messed around with. It is clear that not all the comments in the
V2.2/V2.4 SCSI drivers/SCSI layer code were read and understood before the
changes to the V2.6 kernel were made, and that much more testing of the
various SCSI drivers should have been done before the changes got into the
kernel.

The lack of thorough testing appears to be a big issue with the 2.6.xx
kernels, as, from my point of view, are the continuously changing internal
interfaces. It is hard work keeping proprietary hardware device drivers
up to date with all the changes, let alone attempting to identify and fix
bugs in the kernel-supplied drivers/interface layers.

I have had to make significant changes to the frame buffer code to get a
satisfactory/usable display using my SparcStation 1 (CG3). Black-on-black
displays make debugging very difficult, especially when the CG3 code causes
a system crash.

I still do not have a usable system, as there is a bug somewhere that
prevents an NFS root filing system from working properly (linking causes a
radix error in the NFS code), and the UFS filing system write support causes
the kernel to lock up on file change.

Regards
	Mark Fortescue.

* Re: U1/U2 failures with kernel 2.6.<anything> --- maybe a clue?
  2006-02-26  3:14 U1/U2 failures with kernel 2.6.<anything> --- maybe a clue? Ferris McCormick
  2006-02-26  4:23 ` [gentoo-sparc] U1/U2 failures with kernel 2.6.<anything> --- gentuxx
  2006-02-26 15:18 ` U1/U2 failures with kernel 2.6.<anything> --- maybe a clue? Mark Fortescue
@ 2006-02-27  0:56 ` Jason Wever
  2 siblings, 0 replies; 4+ messages in thread
From: Jason Wever @ 2006-02-27  0:56 UTC (permalink / raw)
  To: sparclinux

On Sun, 26 Feb 2006 03:14:58 +0000 (UTC)
Ferris McCormick <fmccor@gentoo.org> wrote:

> OK, I've been thinking about this, and here is what we have.
> (1) Some U1/U2 systems do very well on these kernels;
> (2) Some are unusable:  I have one which on 2.6.xx, has mean time
> between (very hard lock) failure of about a day, on kernel-2.4.32,
> it's never (literally).
> (3) Weeve and (I believe) squash are as in point 2.

Yes, I have an Ultra 2 (2x300, 2GB RAM) and an Ultra 1 (143MHz, 448MB
RAM) that can readily be locked up with what appears to be the I/O
issue.  In both cases, running the systems with either a serial
console or a graphical console reveals nothing when they lock up
(even with the syslog daemon turned off).

For both systems, I've tried kernels built with gcc-3.4.5 and gcc-3.3.6
and the only possible difference is that it *seems* to take longer to
crash on the gcc-3.4.5 built kernels (but let me generate some data to
back that up).

Cheers,
-- 
Jason Wever
Gentoo/Sparc Team Co-Lead
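
Since a hard lock leaves nothing in the logs, one way to generate data on
time-to-crash per compiler is to record a boot timestamp and a periodic
heartbeat timestamp off the machine, then compare mean uptime before
lockup for each build. This is a rough sketch under those assumptions;
the function names are hypothetical and the numbers in the example are
placeholders, not measurements from any of the systems discussed.

```python
from statistics import mean


def hours_to_lockup(boot_ts, last_heartbeat_ts):
    """Approximate uptime before a hard lock, in hours.

    Both arguments are Unix timestamps in seconds. The last heartbeat
    seen before the machine stopped responding stands in for the moment
    of the lockup, so the result is accurate only to within one
    heartbeat interval.
    """
    if last_heartbeat_ts < boot_ts:
        raise ValueError("heartbeat predates boot")
    return (last_heartbeat_ts - boot_ts) / 3600.0


def mean_hours_by_build(runs):
    """Map each build label to its mean hours-to-lockup.

    runs maps a label such as "gcc-3.4.5" to a list of
    (boot_ts, last_heartbeat_ts) pairs, one pair per observed lockup.
    Builds with no recorded lockups are left out of the result.
    """
    return {
        build: mean(hours_to_lockup(boot, beat) for boot, beat in pairs)
        for build, pairs in runs.items()
        if pairs
    }


if __name__ == '__main__':
    # Placeholder timestamps for illustration only.
    example = {
        "gcc-3.3.6": [(0, 36000), (0, 72000)],
        "gcc-3.4.5": [(0, 108000)],
    }
    print(mean_hours_by_build(example))
```

With only a handful of lockups per build, a difference in the means is
suggestive at best; more runs are needed before concluding that the
compiler version matters.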
