* crash on SysRq : Show Memory
@ 2004-03-09 16:09 Aron Griffis
2004-03-09 17:14 ` Aron Griffis
` (11 more replies)
0 siblings, 12 replies; 13+ messages in thread
From: Aron Griffis @ 2004-03-09 16:09 UTC (permalink / raw)
To: linux-ia64
I'm running davidm's bk kernel, version 2.6.4-rc1, built with gcc-3.3.
I've been poking around with SysRq and hit a crash last evening with the
following footprint. After this the machine was wedged so badly that
the Ethernet MP stopped responding completely, so I was unable to power
cycle the machine until I arrived at the office.
telnet> send brk
SysRq : Show Memory
swapper[0]: Oops 8813272891392 [1]
Pid: 0, CPU 0, comm: swapper
psr : 0000101008026018 ifs : 8000000000000389 ip : [<a00000010005b9e0>] Tainted: GF
ip is at show_mem+0x100/0x260
unat: 0000000000000000 pfs : 0000000000000389 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 80000000ff556565
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a00000010005b970 b6 : a00000010006dc70 b7 : a0000001002eda00
f6 : 0fffbccccccccc8c00000 f7 : 0ffdaa200000000000000
f8 : 100008000000000000000 f9 : 10002a000000000000000
f10 : 0fffcccccccccc8c00000 f11 : 1003e0000000000000000
r1 : a000000100afd5c0 r2 : ffffffffffffff05 r3 : 000000000103ffef
r8 : a0000001009174f0 r9 : a0000001009174f0 r10 : 0000000000000000
r11 : a0000001009172a0 r12 : e0000000047ffb90 r13 : e0000000047f8000
r14 : a000000100848510 r15 : a00000010084dcf8 r16 : a0007fffff0fffb0
r17 : 000000000100ffff r18 : a0007fffff0fffd0 r19 : a0007fffaec00000
r20 : a000000100917508 r21 : 0000000000000000 r22 : 000000000504fffb
r23 : 000000000100ffff r24 : 0000000000000000 r25 : 0000000000000001
r26 : 00000000000008eb r27 : 0000000000000001 r28 : a000000100931c12
r29 : 0000000000004b3a r30 : 0000000000000000 r31 : a000000100917258
Call Trace:
[<a000000100018420>] show_stack+0x80/0xa0
sp=e0000000047ff760 bsp=e0000000047f96b8
[<a00000010003ca70>] die+0x170/0x200
sp=e0000000047ff930 bsp=e0000000047f9680
[<a00000010005a850>] ia64_do_page_fault+0x350/0x980
sp=e0000000047ff930 bsp=e0000000047f9618
[<a000000100011540>] ia64_leave_kernel+0x0/0x260
sp=e0000000047ff9c0 bsp=e0000000047f9618
[<a00000010005b9e0>] show_mem+0x100/0x260
sp=e0000000047ffb90 bsp=e0000000047f95c8
[<a0000001003928a0>] sysrq_handle_showmem+0x20/0x40
sp=e0000000047ffb90 bsp=e0000000047f95b0
[<a000000100392dd0>] __handle_sysrq_nolock+0x110/0x260
sp=e0000000047ffb90 bsp=e0000000047f9568
[<a000000100392c90>] handle_sysrq+0x70/0xa0
sp=e0000000047ffb90 bsp=e0000000047f9538
[<a00000010040e110>] receive_chars+0x330/0x680
sp=e0000000047ffb90 bsp=e0000000047f9448
[<a00000010040ebf0>] serial8250_interrupt+0x1b0/0x200
sp=e0000000047ffba0 bsp=e0000000047f93d8
[<a000000100014d40>] handle_IRQ_event+0xa0/0x120
sp=e0000000047ffbb0 bsp=e0000000047f9390
[<a000000100015780>] do_IRQ+0x280/0x380
sp=e0000000047ffbb0 bsp=e0000000047f9340
[<a000000100017400>] ia64_handle_irq+0xa0/0x1a0
sp=e0000000047ffbb0 bsp=e0000000047f9308
[<a000000100011540>] ia64_leave_kernel+0x0/0x260
sp=e0000000047ffbb0 bsp=e0000000047f9308
[<a0000001000177c0>] ia64_pal_call_static+0x80/0xa0
sp=e0000000047ffd80 bsp=e0000000047f9000
[<a000000100018e30>] default_idle+0x90/0x140
sp=e0000000047ffd80 bsp=e0000000047f8fd8
[<a000000100019000>] cpu_idle+0x120/0x1e0
sp=e0000000047ffe20 bsp=e0000000047f8f78
[<a000000100008d30>] rest_init+0x90/0xe0
sp=e0000000047ffe20 bsp=e0000000047f8f60
[<a0000001007c4e10>] start_kernel+0x490/0x520
sp=e0000000047ffe20 bsp=e0000000047f8f00
[<a0000001000081a0>] _start+0x280/0x2a0
sp=e0000000047ffe30 bsp=e0000000047f8e90
<<00>>KIenr nienlt eprarnuipct: hAainedel,e rk i-l lniontg siynntceirnrugp
h a t
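The number printed after "Oops" in the first line of the dump is easier to read in hex. A minimal sketch, assuming that number is the interruption status value handed to die() by ia64_do_page_fault; the helper name set_bits is hypothetical. On my reading of the Itanium ISR layout, bit 34 is the read-access flag and bit 43 is the exception-deferral flag; treat those names as an assumption and check them against the architecture manual.

```python
def set_bits(value):
    """Return the positions of all bits that are set in value, lowest first."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]

# Decimal value copied from the "Oops 8813272891392 [1]" line above.
oops_code = 8813272891392

print(hex(oops_code))       # 0x80400000000
print(set_bits(oops_code))  # [34, 43]
```

If the bit names above are right, a set read flag would be consistent with a fault on a load rather than on a store.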
--
Aron Griffis
hp Linux and Open Source Lab
Key fingerprint = 4601 AE87 379D A917 BA62 5263 C284 0366 5E6A 3C6B
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
@ 2004-03-09 17:14 ` Aron Griffis
2004-03-09 17:21 ` Aron Griffis
` (10 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Aron Griffis @ 2004-03-09 17:14 UTC (permalink / raw)
To: linux-ia64
Aron Griffis wrote: [Tue Mar 09 2004, 11:09:04AM EST]
> I'm running davidm's bk kernel, version 2.6.4-rc1, built with gcc-3.3.
^^^^^^^^
Sorry, I meant gcc-3.3.3
--
Aron Griffis
hp Linux and Open Source Lab
Key fingerprint = 4601 AE87 379D A917 BA62 5263 C284 0366 5E6A 3C6B
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
2004-03-09 17:14 ` Aron Griffis
@ 2004-03-09 17:21 ` Aron Griffis
2004-03-09 17:54 ` Jesse Barnes
` (9 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Aron Griffis @ 2004-03-09 17:21 UTC (permalink / raw)
To: linux-ia64
Aron Griffis wrote: [Tue Mar 09 2004, 11:09:04AM EST]
> <<00>>KIenr nienlt eprarnuipct: hAainedel,e rk i-l lniontg siynntceirnrugp
> h a t
Deciphered:
Kernel panic: Aiee, killing interrupt
In interrupt handler - not syncing
--
Aron Griffis
hp Linux and Open Source Lab
Key fingerprint = 4601 AE87 379D A917 BA62 5263 C284 0366 5E6A 3C6B
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
2004-03-09 17:14 ` Aron Griffis
2004-03-09 17:21 ` Aron Griffis
@ 2004-03-09 17:54 ` Jesse Barnes
2004-03-09 18:47 ` Aron Griffis
` (8 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Jesse Barnes @ 2004-03-09 17:54 UTC (permalink / raw)
To: linux-ia64
On Tue, Mar 09, 2004 at 11:09:04AM -0500, Aron Griffis wrote:
> I'm running davidm's bk kernel, version 2.6.4-rc1, built with gcc-3.3.
> I've been poking around with SysRq and hit a crash last evening with the
> following footprint. After this the machine was wedged so badly that
> the Ethernet MP stopped responding completely, so I was unable to power
> cycle the machine until I arrived at the office.
What kind of machine are you running on? What's your .config look like?
Jesse
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (2 preceding siblings ...)
2004-03-09 17:54 ` Jesse Barnes
@ 2004-03-09 18:47 ` Aron Griffis
2004-03-09 18:57 ` Grant Grundler
` (7 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Aron Griffis @ 2004-03-09 18:47 UTC (permalink / raw)
To: linux-ia64
Jesse Barnes wrote: [Tue Mar 09 2004, 12:54:25PM EST]
> What kind of machine are you running on? What's your .config look like?
Long's Peak. I've dropped the config at
http://dev.gentoo.org/~agriffis/sysrq_oops/
--
Aron Griffis
hp Linux and Open Source Lab
Key fingerprint = 4601 AE87 379D A917 BA62 5263 C284 0366 5E6A 3C6B
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (3 preceding siblings ...)
2004-03-09 18:47 ` Aron Griffis
@ 2004-03-09 18:57 ` Grant Grundler
2004-03-09 19:25 ` Jesse Barnes
` (6 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Grant Grundler @ 2004-03-09 18:57 UTC (permalink / raw)
To: linux-ia64
On Tue, Mar 09, 2004 at 01:47:15PM -0500, Aron Griffis wrote:
> Jesse Barnes wrote: [Tue Mar 09 2004, 12:54:25PM EST]
> > What kind of machine are you running on? What's your .config look like?
>
> Long's Peak.
That's an RX2600 for anyone wanting to find the product on www.hp.com
grant
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (4 preceding siblings ...)
2004-03-09 18:57 ` Grant Grundler
@ 2004-03-09 19:25 ` Jesse Barnes
2004-03-09 20:10 ` Kenneth Chen
` (5 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Jesse Barnes @ 2004-03-09 19:25 UTC (permalink / raw)
To: linux-ia64
On Tue, Mar 09, 2004 at 10:57:48AM -0800, Grant Grundler wrote:
> On Tue, Mar 09, 2004 at 01:47:15PM -0500, Aron Griffis wrote:
> > Jesse Barnes wrote: [Tue Mar 09 2004, 12:54:25PM EST]
> > > What kind of machine are you running on? What's your .config look like?
> >
> > Long's Peak.
>
> That's an RX2600 for anyone wanting to find the product on www.hp.com
Ah, zx1 then. I accidentally deleted the original backtrace and it
doesn't appear to be in the list archives yet, but since the panic
occurred at arch/ia64/mm/contig.c:show_mem+0x100, I'm guessing the
problem is somewhere after the call to show_free_areas().
I don't have a zx1 box that's easy to test with, so I probably won't be
much help.
Jesse
* RE: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (5 preceding siblings ...)
2004-03-09 19:25 ` Jesse Barnes
@ 2004-03-09 20:10 ` Kenneth Chen
2004-03-09 21:13 ` Aron Griffis
` (4 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Kenneth Chen @ 2004-03-09 20:10 UTC (permalink / raw)
To: linux-ia64
Jesse Barnes wrote on Tue, March 09, 2004 11:26 AM
> On Tue, Mar 09, 2004 at 10:57:48AM -0800, Grant Grundler wrote:
> > On Tue, Mar 09, 2004 at 01:47:15PM -0500, Aron Griffis wrote:
> > > Jesse Barnes wrote: [Tue Mar 09 2004, 12:54:25PM EST]
> > > > What kind of machine are you running on?
> > >
> > > Long's Peak.
> >
> > That's an RX2600 for anyone wanting to find the product on www.hp.com
>
> Ah, zx1 then. I accidentally deleted the original backtrace and it
> doesn't appear to be in the list archives yet, but since the panic
> occurred at arch/ia64/mm/contig.c:show_mem+0x100, I'm guessing the
> problem is somewhere after the call to show_free_areas().
> I don't have a zx1 box that's easy to test with, so I probably won't be
> much help.
Looks like it passed beyond show_free_areas(), faulting IP was in the
while loop accessing mem_map variable.
By the way, the local variable total is redundant in that function.
Same data already exists with max_mapnr.
- Ken
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (6 preceding siblings ...)
2004-03-09 20:10 ` Kenneth Chen
@ 2004-03-09 21:13 ` Aron Griffis
2004-03-09 21:36 ` Aron Griffis
` (3 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Aron Griffis @ 2004-03-09 21:13 UTC (permalink / raw)
To: linux-ia64
Jesse Barnes wrote: [Tue Mar 09 2004, 02:25:41PM EST]
> Ah, zx1 then. I accidentally deleted the original backtrace
http://dev.gentoo.org/~agriffis/oops.txt
> and it
> doesn't appear to be in the list archives yet, but since the panic
> occurred at arch/ia64/mm/contig.c:show_mem+0x100, I'm guessing the
> problem is somewhere after the call to show_free_areas().
> I don't have a zx1 box that's easy to test with, so I probably won't be
> much help.
So I have the vmlinux which was built with debugging symbols (I assume
this is why I got a backtrace as well)... however I'm not immediately
seeing how to trace this back into show_mem()
$ gdb vmlinux
(gdb) list *(show_mem+0x100)
0xa00000010005b9e0 is in show_mem (bitops.h:280).
275 }
276
277 static __inline__ int
278 test_bit (int nr, const volatile void *addr)
279 {
280 return 1 & (((const volatile __u32 *) addr)[nr >> 5] >> (nr & 31));
281 }
282
283 /**
284 * ffz - find the first zero bit in a long word
--
Aron Griffis
hp Linux and Open Source Lab
Key fingerprint = 4601 AE87 379D A917 BA62 5263 C284 0366 5E6A 3C6B
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (7 preceding siblings ...)
2004-03-09 21:13 ` Aron Griffis
@ 2004-03-09 21:36 ` Aron Griffis
2004-03-09 22:49 ` Kenneth Chen
` (2 subsequent siblings)
11 siblings, 0 replies; 13+ messages in thread
From: Aron Griffis @ 2004-03-09 21:36 UTC (permalink / raw)
To: linux-ia64
Kenneth Chen wrote: [Tue Mar 09 2004, 03:10:50PM EST]
> Looks like it passed beyond show_free_areas(), faulting IP was in the
> while loop accessing mem_map variable.
How did you determine that?
--
Aron Griffis
hp Linux and Open Source Lab
Key fingerprint = 4601 AE87 379D A917 BA62 5263 C284 0366 5E6A 3C6B
* RE: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (8 preceding siblings ...)
2004-03-09 21:36 ` Aron Griffis
@ 2004-03-09 22:49 ` Kenneth Chen
2004-03-09 23:31 ` Aron Griffis
2004-03-09 23:39 ` Andreas Schwab
11 siblings, 0 replies; 13+ messages in thread
From: Kenneth Chen @ 2004-03-09 22:49 UTC (permalink / raw)
To: linux-ia64
Aron Griffis wrote on Tue, March 09, 2004 1:36 PM
> > Looks like it passed beyond show_free_areas(), faulting IP was
> > in the while loop accessing mem_map variable.
> How did you determine that?
Do objdump on vmlinux, then look at the disassembled code at
faulting IP:
show_mem+0x100
a00000010003cd00: [MMI] ld4.acq r21=[r16];;
It is page faulting on r16. Look up r16 in the panic dump; it shows:
r16 : a0007fffff0fffb0, which basically points to a page struct in
the mem_map array, you might want to verify whether that address is
valid or not.
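One quick sanity check on that address: on ia64 the top three bits of a 64-bit virtual address select one of eight regions. As far as I know, Linux/ia64 keeps the page-table-mapped kernel space (including kernel text) in region 5 and the identity-mapped kernel space in region 7. A minimal sketch with hypothetical helper names, using values from the register dump:

```python
REGION_SHIFT = 61  # ia64 uses the top three bits of a 64-bit virtual address

def region_number(vaddr):
    """Return the virtual region number (0-7) of a 64-bit ia64 address."""
    return (vaddr >> REGION_SHIFT) & 0x7

def region_offset(vaddr):
    """Return the offset of the address within its region."""
    return vaddr & ((1 << REGION_SHIFT) - 1)

# Values copied from the register dump in this thread.
r16 = 0xA0007FFFFF0FFFB0  # faulting address
r12 = 0xE0000000047FFB90  # stack pointer
ip = 0xA00000010005B9E0   # faulting instruction pointer

for name, value in (("r16", r16), ("r12", r12), ("ip", ip)):
    print(name, region_number(value), hex(region_offset(value)))
```

This only shows that r16 lies in the same region as the kernel text (region 5) rather than in the identity-mapped region used by the stack (region 7). Whether a page is actually mapped at that offset is a separate question.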
Examine the code up/down a little bit and you will see that the while
loop is in between show_mem+0xe0 and show_mem+0x1dc. Variable i is stored
in r17/r3; look at these instructions to see how the pointer is
calculated with index i:
(i*5*16 + base mem_map, sizeof(struct page) = 80).
a00000010003cce6: sxt4 r23=r17;;
a00000010003ccec: shladd r22=r23,2,r23;;
a00000010003ccf0: [MMI] shladd r16=r22,4,r19;;
Anyway, hope you get the idea ....
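The three instructions above can be replayed against the register dump as a cross-check. A minimal sketch, assuming r19 holds the base of mem_map as the formula suggests; sxt4 and shladd are Python stand-ins for the ia64 instructions of the same name:

```python
MASK64 = (1 << 64) - 1

def sxt4(value):
    """Sign-extend the low 32 bits of value to 64 bits (models ia64 sxt4)."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        low |= 0xFFFFFFFF00000000
    return low

def shladd(a, count, b):
    """Models ia64 shladd: (a << count) + b, truncated to 64 bits."""
    return ((a << count) + b) & MASK64

# Register values copied from the oops dump earlier in the thread.
r17 = 0x000000000100FFFF  # loop index i
r19 = 0xA0007FFFAEC00000  # assumed here to be the base of mem_map

r23 = sxt4(r17)            # sxt4 r23=r17
r22 = shladd(r23, 2, r23)  # shladd r22=r23,2,r23  -> i * 5
r16 = shladd(r22, 4, r19)  # shladd r16=r22,4,r19  -> i * 80 + mem_map

print(hex(r22))  # 0x504fffb
print(hex(r16))  # 0xa0007fffff0fffb0
```

Both results match the dump (r22 : 000000000504fffb, r16 : a0007fffff0fffb0), so the faulting address is the struct page for index 0x100ffff under that assumption.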
- Ken
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (9 preceding siblings ...)
2004-03-09 22:49 ` Kenneth Chen
@ 2004-03-09 23:31 ` Aron Griffis
2004-03-09 23:39 ` Andreas Schwab
11 siblings, 0 replies; 13+ messages in thread
From: Aron Griffis @ 2004-03-09 23:31 UTC (permalink / raw)
To: linux-ia64
Kenneth Chen wrote: [Tue Mar 09 2004, 05:49:26PM EST]
> Anyway, hope you get the idea ....
Thanks! That's exactly what I was trying to do but lacked the
know-how...
--
Aron Griffis
hp Linux and Open Source Lab
Key fingerprint = 4601 AE87 379D A917 BA62 5263 C284 0366 5E6A 3C6B
* Re: crash on SysRq : Show Memory
2004-03-09 16:09 crash on SysRq : Show Memory Aron Griffis
` (10 preceding siblings ...)
2004-03-09 23:31 ` Aron Griffis
@ 2004-03-09 23:39 ` Andreas Schwab
11 siblings, 0 replies; 13+ messages in thread
From: Andreas Schwab @ 2004-03-09 23:39 UTC (permalink / raw)
To: linux-ia64
jbarnes@sgi.com (Jesse Barnes) writes:
> On Tue, Mar 09, 2004 at 10:57:48AM -0800, Grant Grundler wrote:
>> On Tue, Mar 09, 2004 at 01:47:15PM -0500, Aron Griffis wrote:
>> > Jesse Barnes wrote: [Tue Mar 09 2004, 12:54:25PM EST]
>> > > What kind of machine are you running on? What's your .config look like?
>> >
>> > Long's Peak.
>>
>> That's an RX2600 for anyone wanting to find the product on www.hp.com
>
> Ah, zx1 then.
I can reproduce it on Tiger as well. Doesn't seem to be platform
dependent.
Andreas.
--
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."