* [parisc-linux] do_page_fault() infinite loop running 2.4.20-pa18 #9 SMP
From: John David Anglin @ 2003-01-04 19:38 UTC
To: parisc-linux
This has been around for a while. When using an SMP configuration, the
program expect "causes" a segmentation fault that results in do_page_fault()
going into an infinite loop. The log data repeats indefinitely and
eventually fills /var. For some reason, expect is not killed by the kernel
when this happens, although the loop can be broken by manually killing it.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] do_page_fault() infinite loop running 2.4.20-pa18 #9 SMP
From: Grant Grundler @ 2003-01-05 5:51 UTC
To: John David Anglin; +Cc: parisc-linux
On Sat, Jan 04, 2003 at 02:38:15PM -0500, John David Anglin wrote:
> This has been around for a while. When using an SMP configuration, the
> program expect "causes" a segmentation fault that results in do_page_fault()
> going into an infinite loop. The log data repeats indefinitely and
> eventually fills /var. For some reason, expect is not killed by the kernel
> when this happens, although the loop can be broken by manually killing it.
This on gsyprf11? (running SMP 2.4.20-pa13 on a500-65)
I'm hoping this is unrelated to my entry.S changes.
But it certainly sounds like that kind of problem.
In -pa12, Randolph and I fixed:
| revision 1.98
| date: 2002/12/09 06:09:08; author: tausq; state: Exp; lines: +2 -2
| -pa12
| fix interruption return path so that it will process signals after
| handle_interruption()
| (thanks to Grant for pointing this out)
Since I broke this in -pa11, maybe the rebuild of -pa13 picked
up the old -pa11 entry.o?
I'll rebuild from scratch to rule this out and reboot gsyprf11.
Perhaps a user space signal handler is interfering?
BTW, appended is one "expect" segfault report from dmesg output.
Dmesg output is filled with the same PID and AFAICT the register dumps
look identical too. "infinite" is about right.
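
One way to check that the log really is a single fault repeating is to
tally the do_page_fault() lines by PID, command, type and address. The
following is a minimal Python sketch, assuming every report uses the same
line format as the one dmesg line appended to this message; the function
name and the regular expression are illustrative and do not come from any
kernel tool.

```python
import re
from collections import Counter

# Pattern assumed from the single report line quoted in this thread, e.g.
# do_page_fault() pid=28552 command='expect' type=15 address=0x00000014
FAULT_RE = re.compile(
    r"do_page_fault\(\) pid=(\d+) command='([^']*)' "
    r"type=(\d+) address=(0x[0-9a-fA-F]+)"
)


def tally_faults(lines):
    """Count fault reports per (pid, command, type, address)."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            pid, command, fault_type, address = match.groups()
            key = (int(pid), command, int(fault_type), int(address, 16))
            counts[key] += 1
    return counts
```

Feeding it the saved dmesg text line by line and looking at
counts.most_common(1) would show whether one key accounts for nearly all
of the reports, which is what an endlessly repeated fault should produce.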
grant
do_page_fault() pid=28552 command='expect' type=15 address=0x00000014
     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001001111111100001111 Not tainted
r00-03 0000000000000000 fffffffffffffffa 00000000403309c4 00000000403309d4
r04-07 0000000040330970 000000004032ea28 0000000000000063 000000004032ea28
r08-11 0000000000021110 0000000000205ff4 0000000000000006 0000000000003b1b
r12-15 0000000000000001 0000000000000000 0000000000207d40 0000000000000001
r16-19 0000000000000000 0000000000000001 0000000000000000 000000004032ea28
r20-23 000000000000000b 000000000000000c 0000000000205628 00000000002055f8
r24-27 0000000000000030 0000000000000000 0000000040330970 0000000000020d44
r28-31 0000000000000002 00000000403309e8 00000000faf05a40 0000000000000000
sr0-3 000000000037b780 000000000037b780 0000000000000000 000000000037b780
sr4-7 000000000037b780 000000000037b780 000000000037b780 000000000037b780
IASQ: 000000000037b780 000000000037b780 IAOQ: 000000004025b45f 000000004025b463
IIR: 0eb41290 ISR: 000000000037b780 IOR: 0000000000000014
CPU: 1 CR30: 0000000030754000 CR31: 0000000000008020
ORIG_R28: 0000000000000002
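
Reading the dump by hand is error-prone, so here is a small Python sketch
that decodes two of its fields. It assumes that the letter header printed
above the PSW line labels the 32 PSW bits one-for-one, left to right, and
that the low two bits of an IAOQ entry hold the privilege level. The
function names are made up for illustration.

```python
# Flag letters exactly as printed above the PSW line in the dump.
PSW_LETTERS = "YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI"


def decode_psw(bits, letters=PSW_LETTERS):
    """Return the flag letters whose PSW bit is 1, leftmost bit first."""
    if len(bits) != len(letters):
        raise ValueError("PSW string and flag header differ in length")
    return [name for name, bit in zip(letters, bits) if bit == "1"]


def split_iaoq(value):
    """Split an IAOQ entry into (instruction address, privilege level)."""
    return value & ~0x3, value & 0x3


flags = decode_psw("00000000000001001111111100001111")
print("".join(flags))      # CcbcbcbcbQPDI
addr, priv = split_iaoq(0x4025B45F)
print(hex(addr), priv)     # 0x4025b45c 3
```

Privilege level 3 is the least privileged level on PA-RISC, so the
faulting instruction was running in user space, and the IOR value of 0x14
matches the address= field on the do_page_fault() line.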
* Re: [parisc-linux] do_page_fault() infinite loop running 2.4.20-pa18 #9 SMP
From: John David Anglin @ 2003-01-05 6:16 UTC
To: Grant Grundler; +Cc: parisc-linux
> On Sat, Jan 04, 2003 at 02:38:15PM -0500, John David Anglin wrote:
> > This has been around for a while. When using an SMP configuration, the
> > program expect "causes" a segmentation fault that results in do_page_fault()
> > going into an infinite loop. The log data repeats indefinitely and
> > eventually fills /var. For some reason, expect is not killed by the kernel
> > when this happens, although the loop can be broken by manually killing it.
>
> This on gsyprf11? (running SMP 2.4.20-pa13 on a500-65)
We were running 2.4.20-pa18 earlier today. I rebooted to see if
that would help and SMP 2.4.20-pa13 came up. I think the sample
fault below was on 2.4.20-pa18.
> I'm hoping this is unrelated to my entry.S changes.
Possibly, this is involved. The IAOQ below points to an address in
the dynamic loader or a shared library. I tried building a static
version of expect to see if I could locate which code was causing
the problem but it didn't work at all. It caused page faults in
what was possibly a syscall. The return pointer was still above
0x40000000.
> But it certainly sounds like that kind of problem.
>
> In -pa12, Randolph and I fixed:
> | revision 1.98
> | date: 2002/12/09 06:09:08; author: tausq; state: Exp; lines: +2 -2
> | -pa12
> | fix interruption return path so that it will process signals after
> | handle_interruption()
> | (thanks to Grant for pointing this out)
>
> Since I broke this in -pa11, maybe the rebuild of -pa13 picked
> up the old -pa11 entry.o?
Don't know. However, I haven't seen the hang during gcc's configure
process. That's where I first noticed the page fault problem that
you and Randolph fixed above.
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)
* Re: [parisc-linux] do_page_fault() infinite loop running 2.4.20-pa18 #9 SMP
From: John David Anglin @ 2003-01-05 6:19 UTC
To: Grant Grundler; +Cc: parisc-linux
> Perhaps a user space signal handler is interfering?
I believe expect uses alarm to handle timeouts.
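
For readers unfamiliar with the pattern, a timeout built on alarm()
generally looks like the sketch below. It is written in Python for
brevity and is not taken from expect's sources; call_with_alarm and the
Timeout exception are hypothetical names. It is only meant to show that
such a program installs a SIGALRM handler and has that signal delivered
asynchronously while other code is running.

```python
import signal
import time


class Timeout(Exception):
    """Raised by the SIGALRM handler when the timer expires."""


def _on_alarm(signum, frame):
    raise Timeout()


def call_with_alarm(func, seconds):
    """Run func(), raising Timeout if it takes longer than `seconds`.

    Unix only, and it must be called from the main thread, because that
    is the only thread allowed to install signal handlers in Python.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)          # ask the kernel for SIGALRM later
    try:
        return func()
    finally:
        signal.alarm(0)            # cancel any alarm still pending
        signal.signal(signal.SIGALRM, previous)
```

For example, call_with_alarm(lambda: time.sleep(5), 1) raises Timeout
after about one second, while a call that finishes in time returns its
result normally.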
Dave
--
J. David Anglin dave.anglin@nrc.ca
National Research Council of Canada (613) 990-0752 (FAX: 952-6605)