* Crash on MPC855T with 2.2.14
@ 2004-05-26 22:09 Marcelo Tosatti
From: Marcelo Tosatti @ 2004-05-26 22:09 UTC
To: linuxppc-embedded; +Cc: Nei A. Chiaradia, Edson Seabra
Hi PPC fellows,
We are facing a crash under high load on our TS console servers (2.2.14 based).
The test used to reproduce the crash involves running SSH connection
attempts in a loop from a fast host. After one or two hours of testing,
the crash happens. It is still possible to ping the box and it responds to
typed keys, but that's all. The kernel is looping in the page fault handling
code as follows, as observed with a BDI2000 and gdb:
(gdb) cont
Continuing.
(Locked up here, so I type Ctrl+C in the gdb session.)
Program received signal SIGSTOP, Stopped (signal).
local_flush_tlb_page (vma=0xce678200, vmaddr=2147481140) at init.c:549
549 asm volatile ("tlbia" : : );
(gdb) bt
#0 local_flush_tlb_page (vma=0xce678200, vmaddr=2147481140) at init.c:549
#1 0xc0019368 in handle_mm_fault (tsk=0xce95e000, vma=0xce678200,
address=2147481140, write_access=33554432) at memory.c:918
Cannot access memory at address 0xce95fca0
(gdb) cont
Continuing.
And it keeps taking faults at this address (0x7FFFF634 in this example, i.e. the
2147481140 seen by gdb; sometimes also 0x7FFFF630), which falls inside the
process's last VMA. Forever.
# cat /proc/1/maps
30023000-30026000 rwxp 00013000 01:00 249 /lib/ld-2.1.3.so
30026000-30027000 rwxp 00000000 00:00 0
7fffe000-80000000 rwxp fffff000 00:00 0
The "error_code" passed to "do_page_fault" during this endless loop
is either 0xE (14) or 0x82000000 (2181038080).
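As a quick sanity check on those numbers: the write_access=33554432 seen in gdb
is 0x02000000, which is exactly the extra bit set in the 0x82000000 error_code.
I am assuming here, without having re-checked our fault.c, that do_page_fault
passes error_code & 0x02000000 as write_access, so the looping fault would be a
store that never gets satisfied. Throwaway userspace check of the arithmetic:

/*
 * Throwaway userspace check of the numbers quoted above.
 * Assumption (not re-verified against our 2.2.14 tree): do_page_fault()
 * passes (error_code & 0x02000000) as write_access to handle_mm_fault().
 */
#include <stdio.h>

int main(void)
{
	unsigned long error_codes[] = { 0xEUL, 2181038080UL }; /* 0xE, 0x82000000 */
	unsigned long write_access = 33554432UL;                /* from the gdb trace */
	int i;

	printf("write_access = 0x%08lx\n", write_access);       /* prints 0x02000000 */
	for (i = 0; i < 2; i++)
		printf("error_code 0x%08lx -> store bit %s\n",
		       error_codes[i],
		       (error_codes[i] & 0x02000000UL) ? "set" : "clear");
	return 0;
}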
handle_mm_fault trace for one such unsuccessful pte bring-up:
#0 handle_mm_fault (tsk=0xce70c000, vma=0xce6188c0, address=2147481140,
write_access=33554432) at memory.c:901
903 if (!pte_present(entry)) {
909 entry = pte_mkyoung(entry);
910 set_pte(pte, entry);
911 flush_tlb_page(vma, address);
912 if (write_access) {
913 if (!pte_write(entry))
303 pte_val(pte) |= _PAGE_DIRTY;
304 if (pte_val(pte) & _PAGE_RW)
305 pte_val(pte) |= _PAGE_HWWRITE;
918 flush_tlb_page(vma, address);
916 entry = pte_mkdirty(entry);
917 set_pte(pte, entry);
918 flush_tlb_page(vma, address);
921 return 1;
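For context, the 303-305 lines stepped through above are presumably pte_mkdirty()
from the ppc pgtable.h. Reconstructed from the trace (not copied from our tree),
it would look roughly like this; the point is that _PAGE_HWWRITE, the bit the
hardware TLB reload actually honours, only gets set when the pte is both dirty
and _PAGE_RW:

/*
 * Rough reconstruction of the pte_mkdirty() stepped through above
 * (pgtable.h lines 303-305); not copied verbatim from our 2.2.14 tree.
 * _PAGE_RW is the software permission bit; _PAGE_HWWRITE is what the
 * hardware TLB reload checks, so if HWWRITE never makes it into the
 * pte, every store to that page keeps faulting.
 */
static inline pte_t pte_mkdirty(pte_t pte)
{
	pte_val(pte) |= _PAGE_DIRTY;
	if (pte_val(pte) & _PAGE_RW)
		pte_val(pte) |= _PAGE_HWWRITE;
	return pte;
}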
I still need to figure out why it keeps faulting. Maybe the pte
is not being set up correctly.
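One thing I plan to try is dumping the raw pte after handle_mm_fault() has run
in do_page_fault(), to see whether _PAGE_PRESENT/_PAGE_DIRTY/_PAGE_HWWRITE really
end up in the entry for 0x7FFFF634. Untested sketch for arch/ppc/mm/fault.c
(names from memory, not copied from our tree):

/*
 * Untested debugging sketch: walk the page tables again after
 * handle_mm_fault() and dump the raw pte for the faulting address,
 * so we can see which of _PAGE_PRESENT/_PAGE_DIRTY/_PAGE_HWWRITE are
 * actually there.  Meant to live in arch/ppc/mm/fault.c, which already
 * pulls in the pgtable helpers used below.
 */
static void dump_fault_pte(struct mm_struct *mm, unsigned long address,
			   unsigned long error_code)
{
	pgd_t *pgd = pgd_offset(mm, address);

	if (!pgd_none(*pgd)) {
		pmd_t *pmd = pmd_offset(pgd, address);

		if (!pmd_none(*pmd)) {
			pte_t *pte = pte_offset(pmd, address);

			printk("fault addr %08lx err %08lx pte %08lx\n",
			       address, error_code, pte_val(*pte));
		}
	}
}

and call it right after the handle_mm_fault() call, e.g.
dump_fault_pte(vma->vm_mm, address, error_code);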
Any hints are welcome.
/proc/cpuinfo
processor : 0
cpu : 8xx
clock : 48MHz
clock : 48MHz
bus clock : 48MHz
revision : 0.0
bogomips : 47.82
zero pages : total 0 (0Kb) current: 0 (0Kb) hits: 0/124087 (0%)
** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/
* Re: Help with crash on MPC855T with 2.2.14
@ 2004-05-26 22:20 ` Marcelo Tosatti
From: Marcelo Tosatti @ 2004-05-26 22:20 UTC
To: linuxppc-embedded; +Cc: Nei A. Chiaradia, Edson Seabra
Forgot to mention that the same processor (on similar, though not identical,
hardware) running v2.4 does not crash under the same test.
** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/