* R10K/R12K Based O2's
From: Kumba @ 2006-01-28  5:19 UTC (permalink / raw)
  To: Linux MIPS List


Okay, this is being sent to try and spur some activity on the front of getting 
R10K O2's to work somewhat better.  I toyed around with such a machine to see 
how long I could get it to stay online, and thought I'd post some results.

To start, I merged the generic pieces of Peter Fuerst's IP28 patches into an 
IP32 tree (the pieces that touch generic MIPS code, like memcpy.S, strcpy.S, 
etc., to add R10K cache barriers) and built an R10K IP32 kernel with the gcc 
cache barriers patch he also has up.  The end result is a kernel that, while 
definitely not production ready, stays up longer than others.

As told to me by Ilya, using the network device too much or even remotely trying 
to use scsi instantly killed the machine (based on his tests a while back).  My 
results were a bit different in that I was able to use the network for a 
prolonged time (over ssh too)  and even managed to use the scsi disk some, 
downloading and unpacking a 40MB tbz2 of a Gentoo stage3 (uclibc) tarball on the 
system.  It didn't lock up until an 'emerge sync', which hits the disk 
heavily with lots of small files (and which, as Ilya predicted, would kill it).

Whether this was the result of using the R10K cache barriers in both the IP28 
patch and gcc, or just general kernel improvements in those drivers, is anyone's 
guess, but I got farther than I initially expected (I didn't hold out hope that 
such a system would even boot, let alone get to userland).


IP28 used protected intermediate buffers for DMA input on the sgiwd93 (scsi) 
driver, but not on sgiseeq (net), as this had access to bounce buffers for 
incoming packets.  IP32's meth (net) driver was designed specifically to avoid 
the need for bounce buffers, though, so it likely needs some solution to work 
around the speculative execution bit.  IP32 likely needs the same or a similar 
approach in its scsi driver (aic7xxx) as well.

It seems when speculative execution triggers, this is seen on the primary console:

CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x01005fcc0 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004

(once per hit; they don't happen very often unless scsi is getting a workout. 
The two repeating addresses are constant, it seems).
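
For anyone collecting these from a serial console capture, here is a rough 
sketch of a tally script.  It only assumes the line format shown above; the 
function name and the idea of feeding it a saved log are mine, not part of any 
existing tool.

```python
import re
from collections import Counter

# Matches the console line format quoted above, e.g.
# "CRIME CPU error at 0x00b74fa90 status 0x00000004"
CRIME_RE = re.compile(
    r"CRIME CPU error at (0x[0-9a-fA-F]+) status (0x[0-9a-fA-F]+)"
)


def tally_crime_errors(log_text):
    """Count how often each (address, status) pair appears in a log."""
    counts = Counter()
    for match in CRIME_RE.finditer(log_text):
        address = int(match.group(1), 16)
        status = int(match.group(2), 16)
        counts[(address, status)] += 1
    return counts


# The six lines quoted above, used here as sample input.
SAMPLE_LOG = """\
CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x01005fcc0 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004
CRIME CPU error at 0x00b74fa90 status 0x00000004
"""

if __name__ == "__main__":
    # Most frequent (address, status) pairs first.
    for (address, status), hits in tally_crime_errors(SAMPLE_LOG).most_common():
        print(f"0x{address:09x} status 0x{status:08x}: {hits} hit(s)")
```

Run against a longer capture, this should make it obvious whether the same 
couple of addresses keep coming back or whether new ones show up under load.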


Thoughts?


For testing, I have an IP32 R10K netboot kernel available, based on 2.6.15.1, if 
anyone wants to try it out for themselves and report their findings.

http://dev.gentoo.org/~kumba/mips/netboot/testing/ip32/r10k/



--Kumba

-- 
Gentoo/MIPS Team Lead
Gentoo Foundation Board of Trustees

"Such is oft the course of deeds that move the wheels of the world: small hands 
do them because they must, while the eyes of the great are elsewhere."  --Elrond


* Re: R10K/R12K Based O2's
From: Kumba @ 2006-02-07  2:58 UTC (permalink / raw)
  To: Linux MIPS List

Round 2:

Discovered I'd built my IP32 R10K kernel with:
	A) An older version of Peter Fuerst's R10K Cache Barriers Patch
	B) A code change to the above patch that generated fewer
	   cache barriers instead of more

So I've re-done everything, and now have a kernel for IP32 R10K (and R12K) that 
uses the latest gcc cache barriers patch available on the IP28 site w/ the one 
code change that generates more cache barriers (See below for details).

With the gcc patch and the stripped IP28 patch, a working IP32 R10K kernel can 
be produced that handles itself very well.  It occasionally spits out CRIME 
CPU errors, and in my case, I've triggered two Oopses (non-fatal).  The network 
adapter doesn't seem to trigger the speculative execution bug too much (which I 
believe is indicated by the CRIME CPU errors); however, the scsi driver 
definitely causes them to appear every so often.

Patches and other misc. files available here:
http://dev.gentoo.org/~kumba/mips/netboot/testing/ip32/r10k/

Included there are a 2.6.15.2 kernel, the kernel patch, the gcc patch, the 
kernel disassembly (71MB whopper unpacked), the cpio archive containing the 
netboot userland (this is actually 5-6 cpio archives slapped together with 
'cat', but the kernel handles it just fine), and the md5/sha1 sums to verify 
against.
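
If you'd rather check the sums from a script than by hand, something along 
these lines should do.  This is only a sketch: the helper names are made up, 
and you supply the expected digest yourself by copying it out of the md5/sha1 
file next to the download.

```python
import hashlib


def file_digest(path, algorithm):
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex, algorithm="sha1"):
    """Compare a file's digest against the published hex string."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()
```

Call it as verify(<downloaded file>, <digest from the sums file>, "md5") or 
with "sha1"; a False result means the download should not be booted.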

To boot:
bootp(): console=ttyS0,<baud>  (gbefb seems to not respond on these systems)
Initramfs should take over and load a mini netboot userland.  Has basic 
capabilities, and supports a few fs'es (xfs, ext2/3, no reiser/jfs, etc)


Stats:

# uname -a
Linux netboot-2006.0 2.6.15.2-mipsgit-20060109 #2 Sun Feb 5 19:28:36 EST 2006 
mips64 R10000 V2.6  FPU V0.0 SGI O2 GNU/Linux

# uptime
21:54:14 up 1 day,  2:23, load average: 2.01, 2.02, 2.00

Currently, it's building gcc-3.4.5 as I type this, using the disk for everything 
but swap.  There are some caveats, though, which are outlined below:


The Errors (Just a sample, and in no specific order):
CRIME CPU error at 0x00f73b850 status 0x00000004
CRIME CPU error at 0x01005fcc0 status 0x00000004
CRIME CPU error at 0x005a8b810 status 0x00000004
CRIME CPU error at 0x005a8b850 status 0x00000004
CRIME CPU error at 0x007bd7850 status 0x00000004
CRIME CPU error at 0x00802f850 status 0x00000004
CRIME CPU error at 0x003b0b850 status 0x00000004
CRIME CPU error at 0x00e33f850 status 0x00000004
CRIME CPU error at 0x00a1df850 status 0x00000004

If anyone has an idea just what those hex addresses are referencing, or what 
exactly that status bit means, I'd be curious to know.
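
One thing that may help whoever digs into this: a quick sketch that groups the 
sample addresses above by their low bits.  The choice of 12 bits is just an 
arbitrary lens for grouping, not a claim about page size or about how CRIME 
reports addresses, and the function name is made up.

```python
from collections import Counter

# The nine sample addresses listed above.
SAMPLE_ADDRESSES = [
    0x00f73b850,
    0x01005fcc0,
    0x005a8b810,
    0x005a8b850,
    0x007bd7850,
    0x00802f850,
    0x003b0b850,
    0x00e33f850,
    0x00a1df850,
]


def low_bits_histogram(addresses, bits=12):
    """Count addresses by their low `bits` bits (12 is an assumed default)."""
    mask = (1 << bits) - 1
    return Counter(address & mask for address in addresses)


if __name__ == "__main__":
    for low, hits in low_bits_histogram(SAMPLE_ADDRESSES).most_common():
        print(f"low bits 0x{low:03x}: {hits} address(es)")
```

In this sample, seven of the nine addresses end in 0x850, with one each ending 
in 0x810 and 0xcc0.  Whether that pattern means anything is an open question.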


The Oopses:
http://dev.gentoo.org/~kumba/mips/netboot/testing/ip32/r10k/ip32r10k-26152-oops1.txt
http://dev.gentoo.org/~kumba/mips/netboot/testing/ip32/r10k/ip32r10k-26152-oops2.txt

These seem similar to the old MC Bus Errors I hit when toying around in the 
early days of IP28 support.  They've pretty much vanished now since Peter wrote 
up the new IP28 bus error handler.  So whether these are the same or not is 
anyone's guess -- if so, I imagine there's some degree of fixups that can be 
done to IP32's bus error handler to take care of these.  But that's just pure 
speculation on my part.


Read the README file there too -- it's got some small /proc tweaks that, at a 
first glance, help the kernel to avoid touching the disk too often by turning 
off swap entirely and setting two other small vm settings.



--Kumba

