From mboxrd@z Thu Jan 1 00:00:00 1970
From: "David S. Miller"
Date: Tue, 21 Sep 2004 19:48:13 +0000
Subject: Re: RED State Exception on Ultra 1, 2.6.9-rc2
Message-Id: <20040921124813.3d2fa37c.davem@davemloft.net>
List-Id:
References: <1095742791.3421.3.camel@ori.thedillows.org>
In-Reply-To: <1095742791.3421.3.camel@ori.thedillows.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: sparclinux@vger.kernel.org

On 21 Sep 2004 09:47:02 -0400
David Dillow wrote:

> > > TL=0000.0000.0000.0005 TT=0000.0000.0000.0080
> > > TPC=0000.0000.0040.ec98 TnPC=0000.0000.0040.ec9c TSTATE=0000.0000.8000.9504
> TPC: etrap_irq
> TnPC: etrap_irq
> > > TL=0000.0000.0000.0004 TT=0000.0000.0000.0010
> > > TPC=0000.0000.0040.d000 TnPC=0000.0000.0040.d004 TSTATE=0000.0000.8000.9504
> TPC: tl1_s0n
> TnPC: tl1_s0n
> > > TL=0000.0000.0000.0003 TT=0000.0000.0000.0080
> > > TPC=0000.0000.0040.ec98 TnPC=0000.0000.0040.ec9c TSTATE=0000.0000.8000.9502
> TPC: etrap_irq
> TnPC: etrap_irq
> > > TL=0000.0000.0000.0002 TT=0000.0000.0000.0010
> > > TPC=0000.0000.0040.8c00 TnPC=0000.0000.0040.8c04 TSTATE=0000.0000.8008.9402
> TPC: tl0_ivec
> TnPC: tl0_ivec
> > > TL=0000.0000.0000.0001 TT=0000.0000.0000.0060
> > > TPC=0000.0000.0042.1b68 TnPC=0000.0000.0042.1b6c TSTATE=0000.0000.8000.9602
> TPC: free_streaming_cluster
> TnPC: free_streaming_cluster
>
> Doh! Did I mess something up in my prior patch allowing larger than 1MB
> sbus_map_sg()'s?

I don't think so, based upon this trace.

TT means "Trap Type": it is the number of the trap the cpu took at
each trap level. What the trap type numbers mean is described in the
UltraSPARC programmer's manual. TSTATE's layout is defined by macros
in include/asm-sparc64/pstate.h.

Anyways, in free_streaming_cluster() we took a vectored interrupt
(trap type 0x60). In tl0_ivec we then took an illegal instruction
trap (trap type 0x10), which is very odd because the instruction at
0x408c00:tl0_ivec is a branch.

It looks like something clobbered the instruction there; that is my
best guess.
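
To make the reading of the trace above a bit more concrete, here is a
rough Python sketch that decodes the TT and TSTATE values and works
out which trap table slot each trap type vectors to. It is only an
illustration: the helper names (parse_reg, decode_tstate,
vector_address) are made up for this note, the trap table base of
0x408000 is an assumption inferred from tl0_ivec sitting at 0x408c00,
and only the three trap types seen in the trace are named. The TSTATE
field positions follow the SPARC V9 layout that pstate.h encodes.

```python
# Illustrative sketch; none of these names exist in the kernel or in OBP.

# Only the trap types that show up in the trace (SPARC V9 numbering).
TRAP_NAMES = {
    0x010: "illegal_instruction",
    0x060: "interrupt_vector",
    0x080: "spill_0_normal",
}

# Assumption: trap table base, inferred from tl0_ivec being at 0x408c00.
TRAP_TABLE_BASE = 0x408000


def parse_reg(text):
    """Turn a dotted hex value such as '0000.0040.8c00' into an int."""
    return int(text.replace(".", ""), 16)


def decode_tstate(tstate):
    """Split TSTATE into its SPARC V9 fields."""
    return {
        "ccr": (tstate >> 32) & 0xFF,
        "asi": (tstate >> 24) & 0xFF,
        "pstate": (tstate >> 8) & 0xFFF,
        "cwp": tstate & 0x1F,
    }


def vector_address(tt, from_tl_gt_0, base=TRAP_TABLE_BASE):
    """Trap table slot for trap type tt.

    Each slot is 32 bytes; bit 14 selects the half of the table used
    when the trap is taken while TL was already greater than zero.
    """
    return base | ((1 << 14) if from_tl_gt_0 else 0) | (tt << 5)


# (TL, TT, TPC, symbol) copied from the trace, innermost trap first.
TRACE = [
    (5, 0x80, 0x40EC98, "etrap_irq"),
    (4, 0x10, 0x40D000, "tl1_s0n"),
    (3, 0x80, 0x40EC98, "etrap_irq"),
    (2, 0x10, 0x408C00, "tl0_ivec"),
    (1, 0x60, 0x421B68, "free_streaming_cluster"),
]


def describe(trace):
    """One human-readable line per trap level, oldest trap first."""
    lines = []
    for tl, tt, tpc, symbol in sorted(trace):
        name = TRAP_NAMES.get(tt, "unknown")
        lines.append(f"TL={tl}: {name} (0x{tt:02x}) taken at 0x{tpc:x} {symbol}")
    return lines


if __name__ == "__main__":
    for line in describe(TRACE):
        print(line)
    print(hex(vector_address(0x60, False)))  # slot for a vectored interrupt at TL=0
    print(hex(vector_address(0x80, True)))   # slot for spill_0_normal at TL>0
```

With the assumed base, the two computed slots come out at the same
addresses the trace reports for tl0_ivec and tl1_s0n, which is
consistent with the reading given above.
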
If you get one of these again you can, at the OBP prompt, say:

	ok 0x408c00 dis

and see if the instruction has been corrupted there.

It may be your DMA mapping changes, so doing the following
wouldn't hurt:

1) try running with those DMA mapping patches reverted
2) audit those patches for possible errors

Thanks.