Oops in pdflush


* Oops in pdflush
@ 2004-02-20 13:34 Andreas Schwab
  2004-02-20 14:18 ` Keith Owens
                   ` (27 more replies)
  0 siblings, 28 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-02-20 13:34 UTC (permalink / raw)
  To: linux-ia64

This happens regularly on all our Intel Tiger SMP systems, haven't seen
it on UP systems, and neither on any HP system we have.  These oopses are
all from different systems running kernels 2.6.2 and 2.6.3 in various
incarnations.  They are all using e1000 and mptscsih, unlike the HP
systems, so it might be a bug in either of those drivers.

pdflush[5742]: Oops 11012296146944 [1]

Pid: 5742, CPU 2, comm:              pdflush
psr : 0000121008026018 ifs : 800000000000040b ip  : [<a000000100485d01>]    Not tainted
ip is at ip_finish_output2+0x41/0x560
unat: 0000000000000000 pfs : 0000000000000797 rsc : 0000000000000003
rnat: e000000007ac7890 bsps: 0000000000001310 pr  : 82aa6aa6a555969b
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a000000100451900 b6  : a000000100282bc0 b7  : a000000100485cc0
f6  : 000000000000000000000 f7  : 1003e000000000011d541
f8  : 1003e000000000016fa06 f9  : 1003e0000000000b38153
f10 : 1003e000000000023aa82 f11 : 1003e0000000000000001
r1  : a000000100988130 r2  : 000000005dde178a r3  : 0000000000100000
r8  : 0000000000000001 r9  : 00000000000006a8 r10 : 840d3f83e0420336
r11 : 840d3f83e04202b6 r12 : e000000007286830 r13 : e000000007280000
r14 : 0000000000000001 r15 : a0000001004af498 r16 : a0000001004af530
r17 : 000000005e01380a r18 : 0000000000232080 r19 : a0000001004af528
r20 : a00000010084b2c0 r21 : 0000000000100000 r22 : 0000000000100000
r23 : a0000001007e4ca0 r24 : a000000100485cc0 r25 : e000000007286840
r26 : 00000000ffffffff r27 : e00000003feb66d8 r28 : 0000000000000000
r29 : e00000003feb66d4 r30 : 0000000000100000 r31 : e00000003feb66d0

Call Trace:
 [<a0000001000169e0>] show_stack+0x80/0xa0
                                sp=e000000007286a38 bsp=e000000007286a18
 [<a000000100039270>] die+0x170/0x200
                                sp=e000000007286c08 bsp=e0000000072869d8
 [<a000000100057760>] ia64_do_page_fault+0x720/0xa60
                                sp=e000000007286c08 bsp=e000000007286970
 [<a00000010000d560>] ia64_leave_kernel+0x0/0x260
                                sp=e000000007286c98 bsp=e000000007286970
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

pdflush[18576]: General Exception: IA-64 Reserved Register/Field fault (data access) 2199023255600 [1]

Pid: 18576, CPU 3, comm:              pdflush
psr : 0000121008026038 ifs : 800000000000494e ip  : [<a00000020005a6a1>]    Not tainted
ip is at scsi_io_completion+0x321/0x820 [scsi_mod]
unat: 0000000000000000 pfs : 000000000000494e rsc : 0000000000000003
rnat: 0000000000000892 bsps: a00000020024c000 pr  : 80000000afb59567
ldrs: 0000000000000000 ccv : a00000020024c000 fpsr: 0009804c8270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a00000020005a5e0 b6  : a000000100003320 b7  : a00000010034a1c0
f6  : e00000000b839210e00000001e0cc700 f7  : 600000001d4880000000000000000000
f8  : e00000001d48e808a00000010063b6f0 f9  : a00000020005a5e0e00000001d48e808
f10 : 1003e000000000000494e f11 : e00000001d48e808a00000010063b6f0
r1  : 0000000000000001 r2  : 0000000000004000 r3  : 0000000000000001
r8  : 0000000000000000 r9  : 0000000000000000 r10 : 0000000000000000
r11 : e00000003c168b98 r12 : e00000001d48e840 r13 : e00000001d488000
r14 : e00000003c168b98 r15 : 000000000000003f r16 : a00000020004e190
r17 : 0000000000000307 r18 : a000000200220000 r19 : e00000003db7fb80
r20 : 0000000000000000 r21 : 00000000000000e8 r22 : e00000001e0cc800
r23 : 0000000000000001 r24 : e00000001e0cc700 r25 : 0000000000000000
r26 : e00000000b839200 r27 : e00000000b839280 r28 : 0000000000000001
r29 : 0000000000000001 r30 : a000000200022940 r31 : 000000000000040b

Call Trace:
 [<a0000001000169e0>] show_stack+0x80/0xa0
                                sp=e00000001d48ea80 bsp=e00000001d48ea60
 [<a000000100039270>] die+0x170/0x200
                                sp=e00000001d48ec50 bsp=e00000001d48ea28
 [<a000000100039470>] ia64_fault+0x110/0x1020
                                sp=e00000001d48ec50 bsp=e00000001d48e9c8
 [<a00000010000d560>] ia64_leave_kernel+0x0/0x260
                                sp=e00000001d48ee60 bsp=e00000001d48e9c8
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

pdflush[25442]: Oops 8804682956800 [1]

Pid: 25442, CPU 2, comm:              pdflush
psr : 0000101008026018 ifs : 8000000000001228 ip  : [<a00000020042dc30>]    Not tainted
ip is at e1000_clean+0xcb0/0x12e0 [e1000]
unat: 0000000000000000 pfs : 0000000000001228 rsc : 0000000000000003
rnat: 000000000000058e bsps: a000000100987ef0 pr  : 80005940af669697
ldrs: 0000000000000000 ccv : ffffffff00000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a00000020042dff0 b6  : a000000100255040 b7  : a000000100492560
f6  : ffffffffffff87300000000000000000 f7  : 04000e00000000000008c
f8  : 10000a000000200456488 f9  : e00000000765170ce000000035070000
f10 : e000000007651440000000090000000c f11 : a00000020044d038a00000020044d020
r1  : a000000200628000 r2  : e00000003b2260dc r3  : e00000003b225f08
r8  : 0000000000000000 r9  : e00000003b226084 r10 : e00000003c8add04
r11 : e00000003feb3550 r12 : e000000009c3e830 r13 : e000000009c38000
r14 : 0000000000000001 r15 : 0000000000000010 r16 : e000000007651000
r17 : e000000009c3e850 r18 : e000000003497618 r19 : e00000000349762c
r20 : e000000003497580 r21 : 0000000000000002 r22 : a00000010079acf8
r23 : cccccccccccccccd r24 : 000000000001003e r25 : 0000001008022018
r26 : 000000000001003e r27 : e000000003497580 r28 : 000000000000008c
r29 : 0000000000000000 r30 : e0f8188000000000 r31 : 0008000000080000

Call Trace:
 [<a0000001000169e0>] show_stack+0x80/0xa0
                                sp=e000000009c3e950 bsp=e000000009c3e930
 [<a000000100039270>] die+0x170/0x200
                                sp=e000000009c3eb20 bsp=e000000009c3e8f8
 [<a0000001000577c0>] ia64_do_page_fault+0x720/0xa60
                                sp=e000000009c3eb20 bsp=e000000009c3e890
 [<a00000010000d560>] ia64_leave_kernel+0x0/0x260
                                sp=e000000009c3ebb0 bsp=e000000009c3e890
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

pdflush[4171]: General Exception: IA-64 Reserved Register/Field fault (data access) 4947802325040 [1]

Pid: 4171, CPU 1, comm:              pdflush
psr : 0000121008026038 ifs : 8000000000000002 ip  : [<a00000010012ff41>]    Not tainted
ip is at end_buffer_async_write+0x401/0x440
unat: 0000000000000000 pfs : 0000000000000002 rsc : e00000003711bc80
rnat: 0000000000001000 bsps: 0000000000000000 pr  : e00000003711bcb0
ldrs: 0000000000000793 ccv : a000000100988130 fpsr: 000000000000038b
csd : c00000003711bcaa ssd : 0000000000000000
b0  : a00000010033d440 b6  : 0000000000006000 b7  : e00000003711bc80
f6  : a00000010000d56000000000000000ef f7  : a0000001009881300000000000000005
f8  : e00000002d0066c0e00000002d006530 f9  : a0000001006dfa800000048000000030
f10 : 1003e0000000000000060 f11 : 1003e000000000000ce4f
r1  : a000000100988130 r2  : 0000000000004000 r3  : a00000010079d134
r8  : e00000001c3e2a10 r9  : 0000000000022000 r10 : 0000000000000000
r11 : 0000000000000000 r12 : e00000003711bc80 r13 : a000000100130cc0
r14 : 000000000000009d r15 : a000000100988130 r16 : 0000001008026038
r17 : a000000100126c70 r18 : 0000000000000206 r19 : 0000000000000207
r20 : a0000001005e7fc8 r21 : a0000001005f3640 r22 : 000000000000038a
r23 : 000000000000000f r24 : e00000002d0066c0 r25 : 0000000000000018
r26 : 0000048000000030 r27 : e00000000829d120 r28 : 0000000000000000
r29 : 0000000000000003 r30 : e00000002d0066c0 r31 : e00000002d006530

Call Trace:
 [<a0000001000169e0>] show_stack+0x80/0xa0
                                sp=e00000002d006898 bsp=e00000002d006878
 [<a000000100039270>] die+0x170/0x200
                                sp=e00000002d006a68 bsp=e00000002d006840
 [<a000000100039470>] ia64_fault+0x110/0x1020
                                sp=e00000002d006a68 bsp=e00000002d0067e0
 [<a00000010000d560>] ia64_leave_kernel+0x0/0x260
                                sp=e00000002d006c78 bsp=e00000002d0067e0
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

pdflush[5219]: Oops 11012296146944 [1]

Pid: 5219, CPU 3, comm:              pdflush
psr : 0000121008026018 ifs : 8000000000000590 ip  : [<a00000010044f951>]    Not tainted
ip is at nf_iterate+0x171/0x240
unat: 0000000000000000 pfs : 0000000000000590 rsc : 0000000000000003
rnat: a00000010047d090 bsps: 0000000000000490 pr  : 82aa6aa6a555aa9b
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a00000010044f910 b6  : a000000100282140 b7  : a00000010025d620
f6  : 000000000000000000000 f7  : 1003e000000000014ce25
f8  : 1003e000000000082d395 f9  : 1003e0000000003fe1500
f10 : 1003e0000000000299c4a f11 : 1003e0000000000000001
r1  : a0000001009880e0 r2  : ffffffffffefffff r3  : 0000000000100000
r8  : 0000000000000001 r9  : 00000000000006a8 r10 : 0000000000000000
r11 : e0000000384ce778 r12 : e0000000384ce7c0 r13 : e0000000384c8000
r14 : 0000000000000001 r15 : 0000000000098764 r16 : e0000000384ce748
r17 : e00000003c729300 r18 : e00000003c72930c r19 : e0000000384ce740
r20 : a0000001007e3c18 r21 : 0000000000100000 r22 : 0000000000100000
r23 : a0000001007e4c18 r24 : e0000000384ce780 r25 : e00000003c72931c
r26 : 00000000ffffffff r27 : e00000003c729318 r28 : 0000000000000000
r29 : e00000003c729314 r30 : 0000000000100000 r31 : e00000003c729310

Call Trace:
 [<a0000001000169e0>] show_stack+0x80/0xa0
                                sp=e0000000384ceaa0 bsp=e0000000384cea80
 [<a000000100039270>] die+0x170/0x200
                                sp=e0000000384cec70 bsp=e0000000384cea48
 [<a000000100057760>] ia64_do_page_fault+0x720/0xa60
                                sp=e0000000384cec70 bsp=e0000000384ce9d8
 [<a00000010000d560>] ia64_leave_kernel+0x0/0x260
                                sp=e0000000384ced00 bsp=e0000000384ce9d8
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
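
A note on reading the oops headers above: the decimal number printed after "Oops" or after the exception name looks like the ia64 interruption status register (ISR) value handed to die(). That reading is an assumption based on the ia64 fault path and is not stated anywhere in the thread. The Python sketch below converts the reported values to hex and picks out a few fields. The bit positions (x=32, w=33, r=34, ei=41-42, ed=43, code in the low 16 bits) are taken from the Itanium ISR layout and should be checked against the architecture manual before relying on them.

```python
# Sketch: interpret the decimal code printed in the oops header as an ia64 ISR.
# Assumption: the number is the ISR value passed to die(); the bit positions
# below follow the Itanium ISR layout and are not confirmed by the thread.

ISR_FLAGS = {
    "x": 32, "w": 33, "r": 34, "na": 35, "sp": 36,
    "rs": 37, "ir": 38, "ni": 39, "so": 40, "ed": 43,
}


def decode_isr(code):
    """Return the hex form, low 16-bit code, slot number (ei) and set flags."""
    flags = sorted(name for name, bit in ISR_FLAGS.items() if (code >> bit) & 1)
    return {
        "hex": hex(code),
        "code": code & 0xFFFF,      # fault-specific code field
        "ei": (code >> 41) & 0x3,   # instruction slot within the bundle
        "flags": flags,
    }


if __name__ == "__main__":
    # The four distinct values reported in the oopses above.
    for value in (11012296146944, 8804682956800, 2199023255600, 4947802325040):
        print(value, decode_isr(value))
```

Under these assumptions the first value, 11012296146944, is 0xa0400000000: a read access (r) in slot 1, which agrees with the faulting ip ending in ...d01.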


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
@ 2004-02-20 14:18 ` Keith Owens
  2004-02-20 14:52 ` Andreas Schwab
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-20 14:18 UTC (permalink / raw)
  To: linux-ia64

On Fri, 20 Feb 2004 14:34:03 +0100, 
Andreas Schwab <schwab@suse.de> wrote:
>This happens regularly on all our Intel Tiger SMP systems, haven't seen
>it on UP systems, and neither on any HP system we have.  These oopses are
>all from different systems running kernels 2.6.2 and 2.6.3 in various
>incarnations.  They are all using e1000 and mptscsih, unlike the HP
>systems, so it might be a bug in either of those drivers.
>
>pdflush[5742]: Oops 11012296146944 [1]
>
>Pid: 5742, CPU 2, comm:              pdflush
>psr : 0000121008026018 ifs : 800000000000040b ip  : [<a000000100485d01>]    Not tainted
>ip is at ip_finish_output2+0x41/0x560
>unat: 0000000000000000 pfs : 0000000000000797 rsc : 0000000000000003
>rnat: e000000007ac7890 bsps: 0000000000001310 pr  : 82aa6aa6a555969b
>ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
>csd : 0000000000000000 ssd : 0000000000000000
>b0  : a000000100451900 b6  : a000000100282bc0 b7  : a000000100485cc0
>f6  : 000000000000000000000 f7  : 1003e000000000011d541
>f8  : 1003e000000000016fa06 f9  : 1003e0000000000b38153
>f10 : 1003e000000000023aa82 f11 : 1003e0000000000000001
>r1  : a000000100988130 r2  : 000000005dde178a r3  : 0000000000100000
>r8  : 0000000000000001 r9  : 00000000000006a8 r10 : 840d3f83e0420336
>r11 : 840d3f83e04202b6 r12 : e000000007286830 r13 : e000000007280000
>r14 : 0000000000000001 r15 : a0000001004af498 r16 : a0000001004af530
>r17 : 000000005e01380a r18 : 0000000000232080 r19 : a0000001004af528
>r20 : a00000010084b2c0 r21 : 0000000000100000 r22 : 0000000000100000
>r23 : a0000001007e4ca0 r24 : a000000100485cc0 r25 : e000000007286840
>r26 : 00000000ffffffff r27 : e00000003feb66d8 r28 : 0000000000000000
>r29 : e00000003feb66d4 r30 : 0000000000100000 r31 : e00000003feb66d0
>
>Call Trace:
> [<a0000001000169e0>] show_stack+0x80/0xa0
>                                sp=e000000007286a38 bsp=e000000007286a18
> [<a000000100039270>] die+0x170/0x200
>                                sp=e000000007286c08 bsp=e0000000072869d8
> [<a000000100057760>] ia64_do_page_fault+0x720/0xa60
>                                sp=e000000007286c08 bsp=e000000007286970
> [<a00000010000d560>] ia64_leave_kernel+0x0/0x260
>                                sp=e000000007286c98 bsp=e000000007286970
> <0>Kernel panic: Aiee, killing interrupt handler!
>In interrupt handler - not syncing

You should be getting a better backtrace than that.  ia64_leave_kernel
is the start of the interrupt handler for the oops, the preceding trace
points are the interesting ones but they are missing.

Which compiler are you using?  It could be a compiler fault, not
generating correct unwind data.

Can you try running with the kdb patch to see if that gives a better
backtrace for the oops?



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
  2004-02-20 14:18 ` Keith Owens
@ 2004-02-20 14:52 ` Andreas Schwab
  2004-02-20 16:41 ` David Mosberger
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-02-20 14:52 UTC (permalink / raw)
  To: linux-ia64

Keith Owens <kaos@sgi.com> writes:

> You should be getting a better backtrace than that.  ia64_leave_kernel
> is the start of the interrupt handler for the oops, the preceding trace
> points are the interesting ones but they are missing.
>
> Which compiler are you using?

gcc 3.3.3

>  It could be a compiler fault, not generating correct unwind data.

The unwind check says this:

ERROR: ia64_monarch_init_handler: 186 slots, total region length = 0
1 error detected in 8160 functions.

> Can you try running with the kdb patch to see if that gives a better
> backtrace for the oops?

I'll try.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
  2004-02-20 14:18 ` Keith Owens
  2004-02-20 14:52 ` Andreas Schwab
@ 2004-02-20 16:41 ` David Mosberger
  2004-02-20 17:11 ` Andreas Schwab
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: David Mosberger @ 2004-02-20 16:41 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 20 Feb 2004 14:34:03 +0100, Andreas Schwab <schwab@suse.de> said:

  Andreas> This happens regularly on all our Intel Tiger SMP systems,
  Andreas> haven't seen it on UP systems, and neither on any HP system
  Andreas> we have.  These oopses are all from different systems
  Andreas> running kernels 2.6.2 and 2.6.3 in various incarnations.

As an additional data-point: we have a tiger here, too, and it's
running 2.6.2-rc2 just fine (it's a build-server so we try not to
reboot it too often).  Are you saying that 2.6.1 was OK and you
started to see the problems only with 2.6.2?

	--david


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (2 preceding siblings ...)
  2004-02-20 16:41 ` David Mosberger
@ 2004-02-20 17:11 ` Andreas Schwab
  2004-02-20 23:09 ` David Mosberger
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-02-20 17:11 UTC (permalink / raw)
  To: linux-ia64

David Mosberger <davidm@napali.hpl.hp.com> writes:

>>>>>> On Fri, 20 Feb 2004 14:34:03 +0100, Andreas Schwab <schwab@suse.de> said:
>
>   Andreas> This happens regularly on all our Intel Tiger SMP systems,
>   Andreas> haven't seen it on UP systems, and neither on any HP system
>   Andreas> we have.  These oopses are all from different systems
>   Andreas> running kernels 2.6.2 and 2.6.3 in various incarnations.
>
> As an additional data-point: we have a tiger here, too, and it's
> running 2.6.2-rc2 just fine (it's a build-server so we try not to
> reboot it too often).  Are you saying that 2.6.1 was OK and you
> started to see the problems only with 2.6.2?

I can't tell when it started, but I think 2.6.1 already had similar
problems.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (3 preceding siblings ...)
  2004-02-20 17:11 ` Andreas Schwab
@ 2004-02-20 23:09 ` David Mosberger
  2004-02-22 13:58 ` Andreas Schwab
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: David Mosberger @ 2004-02-20 23:09 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 20 Feb 2004 18:11:36 +0100, Andreas Schwab <schwab@suse.de> said:

  Andreas> David Mosberger <davidm@napali.hpl.hp.com> writes:
  >>>>>>> On Fri, 20 Feb 2004 14:34:03 +0100, Andreas Schwab <schwab@suse.de> said:
  Andreas> This happens regularly on all our Intel Tiger SMP systems,
  Andreas> haven't seen it on UP systems, and neither on any HP system
  Andreas> we have.  These oopses are all from different systems
  Andreas> running kernels 2.6.2 and 2.6.3 in various incarnations.

  >> As an additional data-point: we have a tiger here, too, and it's
  >> running 2.6.2-rc2 just fine (it's a build-server so we try not to
  >> reboot it too often).  Are you saying that 2.6.1 was OK and you
  >> started to see the problems only with 2.6.2?

  Andreas> I can't tell when it started, but I think 2.6.1 already had similar
  Andreas> problems.

That's rather strange.  I'm sure the Intel folks test on Tiger all the
time and Andrew also seems to test on his Tiger quite frequently.
Is this with a particular workload?

In any case, it's not good that the stack-trace is truncated.
gcc-3.3.3 should be good enough, so I'm not sure off hand what's going
wrong.  If you could enable unwind-debugging (set UNW_DEBUG to 5 in
arch/ia64/kernel/unwind.c) and capture the resulting output during a
crash, it might get us further.

	--david


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (4 preceding siblings ...)
  2004-02-20 23:09 ` David Mosberger
@ 2004-02-22 13:58 ` Andreas Schwab
  2004-02-22 14:08 ` Keith Owens
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-02-22 13:58 UTC (permalink / raw)
  To: linux-ia64

Keith Owens <kaos@sgi.com> writes:

> Can you try running with the kdb patch to see if that gives a better
> backtrace for the oops?

kdb couldn't do it better.

[2]kdb> bt
Stack traceback for pid 21693
0xe00000002faf0000    21693        1  1    2   R  0xe00000002faf04b0 *pdflush
0xa0000001003ff290 kdba_main_loop+0x150
        args (0x5, 0x5, 0x20000000030, 0x4, 0xe00000002faf6640)
        kernel 0xa0000001003ff140 0xa0000001003ff2c0
0xa000000100283e60 kdb+0x820
        args (0x5, 0x20000000030, 0xe00000002faf6640, 0xa00000010082bbd0, 0xa000000100887a84)
        kernel 0xa000000100283640 0xa000000100284bc0
0xa0000001000393c0 die+0x1e0
        args (0xe00000002faf64b0, 0xe00000002faf6640, 0x20000000030, 0xa00000010071fa80, 0xa000000100039570)
        kernel 0xa0000001000391e0 0xa000000100039400
0xa000000100039570 ia64_fault+0x110
        args (0x18, 0x20000000030, 0xe000000004c86b60, 0x3, 0xe00000002faf6640)
        kernel 0xa000000100039460 0xa00000010003a4e0
0xa00000010000d640 ia64_leave_kernel
        args (0x18, 0x20000000030, 0xe000000004c86b60, 0x3, 0xe00000002faf6640)
        kernel 0xa00000010000d640 0xa00000010000d8a0
0xe00000002faf5f00 - No name.  May be an area that has no unwind data

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."
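
A note on the last frame in the kdb backtrace above: on ia64 the top three bits of a virtual address select the region. The Python sketch below classifies the addresses kdb printed. It assumes the usual Linux/ia64 2.6 conventions (kernel text in region 5, identity-mapped kernel memory in region 7) and a 32 KiB per-task area; neither assumption is stated in the thread, and the helper names are hypothetical.

```python
# Sketch: classify ia64 virtual addresses from the kdb backtrace by region.
# Assumptions (not stated in the thread): region 5 holds kernel text,
# region 7 is identity-mapped kernel memory, and each task area is 32 KiB.

TASK_AREA_SIZE = 0x8000  # assumed size of task struct + stacks


def region(addr):
    """Region number is the top three bits of a 64-bit virtual address."""
    return (addr >> 61) & 0x7


def within_task_area(addr, task, size=TASK_AREA_SIZE):
    """True if addr lies inside the memory area that starts at task."""
    return task <= addr < task + size


if __name__ == "__main__":
    task = 0xe00000002faf0000            # task address from the kdb header line
    frames = {
        "ia64_leave_kernel": 0xa00000010000d640,
        "unnamed last frame": 0xe00000002faf5f00,
    }
    for name, addr in frames.items():
        print(name, hex(addr), "region", region(addr),
              "in task area:", within_task_area(addr, task))
```

Under these assumptions the unnamed frame address sits in region 7 and inside the pdflush task area, so it looks like a data address on the task's own stack rather than a code address. That would be consistent with kdb finding no unwind data for it.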


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (5 preceding siblings ...)
  2004-02-22 13:58 ` Andreas Schwab
@ 2004-02-22 14:08 ` Keith Owens
  2004-02-22 16:52 ` Andreas Schwab
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-22 14:08 UTC (permalink / raw)
  To: linux-ia64

On Sun, 22 Feb 2004 14:58:34 +0100, 
Andreas Schwab <schwab@suse.de> wrote:
>[2]kdb> bt
>Stack traceback for pid 21693
>0xe00000002faf0000    21693        1  1    2   R  0xe00000002faf04b0 *pdflush
>0xa0000001003ff290 kdba_main_loop+0x150
>        args (0x5, 0x5, 0x20000000030, 0x4, 0xe00000002faf6640)
>        kernel 0xa0000001003ff140 0xa0000001003ff2c0
>0xa000000100283e60 kdb+0x820
>        args (0x5, 0x20000000030, 0xe00000002faf6640, 0xa00000010082bbd0, 0xa000000100887a84)
>        kernel 0xa000000100283640 0xa000000100284bc0
>0xa0000001000393c0 die+0x1e0
>        args (0xe00000002faf64b0, 0xe00000002faf6640, 0x20000000030, 0xa00000010071fa80, 0xa000000100039570)
>        kernel 0xa0000001000391e0 0xa000000100039400
>0xa000000100039570 ia64_fault+0x110
>        args (0x18, 0x20000000030, 0xe000000004c86b60, 0x3, 0xe00000002faf6640)
>        kernel 0xa000000100039460 0xa00000010003a4e0
>0xa00000010000d640 ia64_leave_kernel
>        args (0x18, 0x20000000030, 0xe000000004c86b60, 0x3, 0xe00000002faf6640)
>        kernel 0xa00000010000d640 0xa00000010000d8a0
>0xe00000002faf5f00 - No name.  May be an area that has no unwind data

Which makes it a general unwind problem.  Apply this patch to turn on
unwind debugging when in kdb.

Index: 25.2/arch/ia64/kernel/unwind.c
--- 25.2/arch/ia64/kernel/unwind.c Wed, 11 Feb 2004 11:17:55 +1100 kaos (linux-2.4/r/c/42_unwind.c 1.1.2.1.1.2.3.1.1.1.1.3.1.1.1.2 644)
+++ 25.2(w)/arch/ia64/kernel/unwind.c Mon, 23 Feb 2004 01:05:02 +1100 kaos (linux-2.4/r/c/42_unwind.c 1.1.2.1.1.2.3.1.1.1.1.3.1.1.1.2 644)
@@ -56,11 +56,12 @@
 
 #define UNW_STATS	0	/* WARNING: this disabled interrupts for long time-spans!! */
 
+#define UNW_DEBUG 6
 #ifdef UNW_DEBUG
   static unsigned int unw_debug_level = UNW_DEBUG;
 #  ifdef CONFIG_KDB
 #    include <linux/kdb.h>
-#    define UNW_DEBUG_ON(n)	(unw_debug_level >= n && !KDB_IS_RUNNING())
+#    define UNW_DEBUG_ON(n)	(unw_debug_level >= n && KDB_IS_RUNNING())
 #    define UNW_DPRINT(n, ...)	if (UNW_DEBUG_ON(n)) kdb_printf(__VA_ARGS__)
 #  else	/* !CONFIG_KDB */
 #    define UNW_DEBUG_ON(n)	unw_debug_level >= n


When pdflush drops into kdb, type these commands.  I assume that you
can capture the output on a serial console.

set LINES 2000
set BTSP 1
bt



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (6 preceding siblings ...)
  2004-02-22 14:08 ` Keith Owens
@ 2004-02-22 16:52 ` Andreas Schwab
  2004-02-24  1:54 ` Grant Grundler
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-02-22 16:52 UTC (permalink / raw)
  To: linux-ia64

Keith Owens <kaos@sgi.com> writes:

> +#define UNW_DEBUG 6
>  #ifdef UNW_DEBUG
>    static unsigned int unw_debug_level = UNW_DEBUG;
>  #  ifdef CONFIG_KDB
>  #    include <linux/kdb.h>
> -#    define UNW_DEBUG_ON(n)	(unw_debug_level >= n && !KDB_IS_RUNNING())
> +#    define UNW_DEBUG_ON(n)	(unw_debug_level >= n && KDB_IS_RUNNING())
>  #    define UNW_DPRINT(n, ...)	if (UNW_DEBUG_ON(n)) kdb_printf(__VA_ARGS__)
>  #  else	/* !CONFIG_KDB */
>  #    define UNW_DEBUG_ON(n)	unw_debug_level >= n

This patch does not apply to the current sources, even with the kdb patch applied.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (7 preceding siblings ...)
  2004-02-22 16:52 ` Andreas Schwab
@ 2004-02-24  1:54 ` Grant Grundler
  2004-02-27 10:16 ` Andreas Schwab
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Grant Grundler @ 2004-02-24  1:54 UTC (permalink / raw)
  To: linux-ia64

On Fri, Feb 20, 2004 at 02:34:03PM +0100, Andreas Schwab wrote:
> This happens regularly on all our Intel Tiger SMP systems, haven't seen
> it on UP systems, and neither on any HP system we have.  These oopses are
> all from different systems running kernels 2.6.2 and 2.6.3 in various
> incarnations.  They are all using e1000 and mptscsih, unlike the HP
> systems, so it might be a bug in either of those drivers.

I have been running the e1000 (and tg3) drivers locally quite a bit
over the past couple of months.
The rx2000 uses e1000 for its built-in GigE NIC.

rx2600, zx6000, and a few other machines use LSI 53c1030 chip
(MPT/Fusion driver) on the motherboard. My rx2600 seems to be
happy with 2.6.3-rc2 though I've not run any disk IO stress
tests on it lately.

hth,
grant


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (8 preceding siblings ...)
  2004-02-24  1:54 ` Grant Grundler
@ 2004-02-27 10:16 ` Andreas Schwab
  2004-02-27 13:58 ` Keith Owens
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-02-27 10:16 UTC (permalink / raw)
  To: linux-ia64

David Mosberger <davidm@napali.hpl.hp.com> writes:

> In any case, it's not good that the stack-trace is truncated.
> gcc-3.3.3 should be good enough, so I'm not sure off hand what's going
> wrong.  If you could enable unwind-debugging (set UNW_DEBUG to 5 in
> arch/ia64/kernel/unwind.c) and capture the resulting output during a
> crash, it might get us further.

Here's what I get:

pdflush[18140]: Oops 11012296146944 [1]

Pid: 18140, CPU 1, comm:              pdflush
psr : 0000121008026018 ifs : 8000000000000590 ip  : [<a00000010046e0d1>]    Not tainted
ip is at nf_iterate+0x111/0x240
unat: 0000000000000000 pfs : 0000000000000590 rsc : 0000000000000003
rnat: e00000003ccc4800 bsps: 0000000000000000 pr  : 82aa6aa6a555a59b
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a00000010046e0f0 b6  : a0000001002968e0 b7  : a000000100260540
f6  : 000000000000000000000 f7  : 1003e000000000011d541
f8  : 1003e0000000000680264 f9  : 1003e00000000032c92ae
f10 : 1003e000000000023aa82 f11 : 1003e0000000000000001
r1  : a000000100a17200 r2  : ffffffffffefffff r3  : 0000000000100000
r8  : 0000000000000001 r9  : 0000000000000000 r10 : a000000100a17228
r11 : 0000000000000000 r12 : e0000000110e6790 r13 : e0000000110e0000
r14 : 0000000000000001 r15 : a000000100a17200 r16 : a000000100a17210
r17 : e00000003feb5400 r18 : a000000100a17200 r19 : 0000000000000000
r20 : a000000100a17200 r21 : 0000000000100000 r22 : 0000000000100000
r23 : a000000100885220 r24 : e0000000110e6750 r25 : e00000003feb541c
r26 : 00000000ffffffff r27 : e00000003feb5418 r28 : 0000000000000000
r29 : e00000003feb5414 r30 : 0000000000100000 r31 : e00000003feb5410
unwind.init_frame_info:
  task   0xe0000000110e0000
  rbs = [0xe0000000110e0ef0-0xe0000000110e6ac8)
  stk = [0xe0000000110e6ac8-0xe0000000110e8000)
  pr     0x82aa6aa6a55596a7
  sw     0xe0000000110e6160
  sp     0xe0000000110e6ac8
unwind.unw_init_frame_info:
  bsp    0xe0000000110e6aa8
  sol    0x4
  ip     0xa000000100016ac0
unwind.build_script: ip 0xa000000100016ac0
unwind.build_script: state record for func 0xa000000100016a40, t$:
  ar.pfs <- r34		0
  psp <- psp+0x1d0		1
  rp <- r33		4

Call Trace:
 [<a000000100016ac0>] show_stack+0x80/0xa0
                                sp=e0000000110e6ac8 bsp=e0000000110e6aa8
unwind.build_script: ip 0xa000000100039350
unwind.build_script: state record for func 0xa0000001000391e0, ti:
  ar.pfs <- r37		0
  rp <- r36		4
 [<a000000100039350>] die+0x170/0x220
                                sp=e0000000110e6c98 bsp=e0000000110e6a70
unwind.build_script: ip 0xa000000100058e60
unwind.build_script: state record for func 0xa000000100058740, t42:
  ar.pfs <- r43		0
  psp <- psp+0x90		1
  rp <- r42		10
 [<a000000100058e60>] ia64_do_page_fault+0x720/0xa60
                                sp=e0000000110e6c98 bsp=e0000000110e6a08
unwind.build_script: ip 0xa00000010000d640
unwind.desc_abi: interrupt frame
unwind.build_script: state record for func 0xa00000010000d640, t=0:
  ar.pfs <- [sp+0x60]		-1
  psp <- psp+0x1d0		-1
  rp <- [sp+0x58]		-1
  ar.unat <- [sp+0x68]		-1
  pr <- [sp+0x90]		-1
  ar.fpsr <- [sp+0xc0]		-1
 [<a00000010000d640>] ia64_leave_kernel+0x0/0x260
                                sp=e0000000110e6d28 bsp=e0000000110e6a08
unwind.unw_unwind: reached user-space (ip=0x148e)
 unwind.init_frame_info:
  task   0xe0000000110e0000
  rbs = [0xe0000000110e0ef0-0xe0000000110e6b88)
  stk = [0xe0000000110e6b88-0xe0000000110e8000)
  pr     0x82aa6955a69aa99b
  sw     0xe0000000110e6300
  sp     0xe0000000110e6b88
unwind.init_frame_info:
  task   0xe0000000026f8000
  rbs = [0xe0000000026f8ef0-0xe0000000026f93d8)
  stk = [0xe0000000026ffbf0-0xe000000002700000)
  pr     0x90000050a655955b
  sw     0xe0000000026ff9f0
  sp     0xe0000000026ffbf0
unwind.init_frame_info:
  task   0xe000000021e10000
  rbs = [0xe000000021e10ef0-0xe000000021e11a10)
  stk = [0xe000000021e173a0-0xe000000021e18000)
  pr     0x900155566655955b
  sw     0xe000000021e171a0
  sp     0xe000000021e173a0
unwind.init_frame_info:
  task   0xe000000003b70000
  rbs = [0xe000000003b70ef0-0xe000000003b71948)
  stk = [0xe000000003b77810-0xe000000003b78000)
  pr     0x90000050a655959b
  sw     0xe000000003b77610
  sp     0xe000000003b77810
unwind.unw_init_frame_info:
  bsp    0xe0000000026f9370
  sol    0xd
  ip     0xa0000001003ff290
unwind.unw_init_frame_info:
  bsp    0xe000000003b718e0
  sol    0xd
  ip     0xa0000001003ff290
unwind.unw_init_frame_info:
  bsp    0xe000000021e119a0
  sol    0xd
  ip     0xa0000001003ff290
unwind.build_script: ip 0xa0000001003ff290
unwind.build_script: ip 0xa0000001003ff290
unwind.build_script: ip 0xa0000001003ff290
unwind.build_script: state record for func 0xa0000001003ff140, tc:
  ar.pfs <- r43		0
  psp <- psp+0x20		1
  rp <- r42		7
unwind.build_script: state record for func 0xa0000001003ff140, tc:
  ar.pfs <- r43		0
  psp <- psp+0x20		1
  rp <- r42		7
unwind.build_script: state record for func 0xa0000001003ff140, tc:
  ar.pfs <- r43		0
  psp <- psp+0x20		1
  rp <- r42		7
unwind.unw_init_frame_info:
  bsp    0xe0000000110e6b20
  sol    0xd
  ip     0xa0000001003ff290
unwind.build_script: ip 0xa0000001003ff290
unwind.build_script: state record for func 0xa0000001003ff140, tc:
  ar.pfs <- r43		0
  psp <- psp+0x20		1
  rp <- r42		7

Entering kdb (current=0xe0000000110e0000, pid 18140) on processor 1 Oops: <NULL>
due to oops @ 0xa00000010046e0d1
 psr: 0x0000121008026018   ifs: 0x8000000000000590    ip: 0xa00000010046e0d0  
unat: 0x0000000000000000   pfs: 0x0000000000000590   rsc: 0x0000000000000003  
rnat: 0xe00000003ccc4800  bsps: 0x0000000000000000    pr: 0x82aa6aa6a555a59b  
ldrs: 0x0000000000000000   ccv: 0x0000000000000000  fpsr: 0x0009804c0270033f  
  b0: 0xa00000010046e0f0    b6: 0xa0000001002968e0    b7: 0xa000000100260540  
  r1: 0xa000000100a17200    r2: 0xffffffffffefffff    r3: 0x0000000000100000  
  r8: 0x0000000000000001    r9: 0x0000000000000000   r10: 0xa000000100a17228  
 r11: 0x0000000000000000   r12: 0xe0000000110e6790   r13: 0xe0000000110e0000  
 r14: 0x0000000000000001   r15: 0xa000000100a17200   r16: 0xa000000100a17210  
 r17: 0xe00000003feb5400   r18: 0xa000000100a17200   r19: 0x0000000000000000  
 r20: 0xa000000100a17200   r21: 0x0000000000100000   r22: 0x0000000000100000  
 r23: 0xa000000100885220   r24: 0xe0000000110e6750   r25: 0xe00000003feb541c  
 r26: 0x00000000ffffffff   r27: 0xe00000003feb5418   r28: 0x0000000000000000  
 r29: 0xe00000003feb5414   r30: 0x0000000000100000   r31: 0xe00000003feb5410  
&regs = e0000000110e65d0

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (9 preceding siblings ...)
  2004-02-27 10:16 ` Andreas Schwab
@ 2004-02-27 13:58 ` Keith Owens
  2004-02-28  6:52 ` David Mosberger
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-27 13:58 UTC (permalink / raw)
  To: linux-ia64

On Fri, 27 Feb 2004 11:16:03 +0100, 
Andreas Schwab <schwab@suse.de> wrote:
>pdflush[18140]: Oops 11012296146944 [1]
>
>Pid: 18140, CPU 1, comm:              pdflush
>psr : 0000121008026018 ifs : 8000000000000590 ip  : [<a00000010046e0d1>]    Not tainted
>ip is at nf_iterate+0x111/0x240
>unwind.init_frame_info:
>  task   0xe0000000110e0000
>  rbs = [0xe0000000110e0ef0-0xe0000000110e6ac8)
>  stk = [0xe0000000110e6ac8-0xe0000000110e8000)
>  pr     0x82aa6aa6a55596a7
>  sw     0xe0000000110e6160
>  sp     0xe0000000110e6ac8

Ouch.  rbs and stack have collided, kernel stack overflow.  rbs shows
a normal start, then it loops with the same data over and over again

0xe0000000110e0ef0 3d0bf108   ....=.ñ.
0xe0000000110e0ef8 3fdad668   ....?ÚÖh
0xe0000000110e0f00 3d376460   ....=7d`
0xe0000000110e0f08 3c72c008   ....<rÀ.
0xe0000000110e0f10 3d376660   ....=7f`
0xe0000000110e0f18 00000000   ........
0xe0000000110e0f20 3fd1ebf8   ....?Ñëø
0xe0000000110e0f28 8000000000000001   ........
0xe0000000110e0f30 3d376e90   ....=7n.
0xe0000000110e0f38 3d376a60   ....=7j`
0xe0000000110e0f40 3d376e80   ....=7n.
0xe0000000110e0f48 3d376ea8   ....=7n¨
0xe0000000110e0f50 00000001   ........
0xe0000000110e0f58 3d376e88   ....=7n.
0xe0000000110e0f60 3d375810   ....=7X.
0xe0000000110e0f68 3c91e590   ....<.å.
0xe0000000110e0f70 3c8d7060   ....<.p`
0xe0000000110e0f78 00000206   ........
0xe0000000110e0f80 3cb02000   ....<° .
0xe0000000110e0f88 3d0bf108   ....=.ñ.
0xe0000000110e0f90 00001491   ........
0xe0000000110e0f98 3c8d8b50   ....<..P
0xe0000000110e0fa0 00000998   ........
0xe0000000110e0fa8 80000000afb5952b   ....¯µ.+
0xe0000000110e0fb0 3ff42000   ....?ô .
0xe0000000110e0fb8 00000001   ........
0xe0000000110e0fc0 3fdacd90   ....?ÚÍ.
0xe0000000110e0fc8 a00000010082d898 num_physpages  
0xe0000000110e0fd0 a0000001000085a0 _start+0x280  
0xe0000000110e0fd8 00000998   ........
0xe0000000110e0fe0 a000000100a17200    ....¡r.
0xe0000000110e0fe8 a00000010068ce90 start_kernel+0x530  
0xe0000000110e0ff0 00000611   ........
0xe0000000110e0ff8 00000000   ........
0xe0000000110e1000 00000611   ........
0xe0000000110e1008 a000000100680990 __kstrtab_csum_partial_copy_nocheck+0xbc80  
0xe0000000110e1010 00000000   ........
0xe0000000110e1018 00000000   ........
0xe0000000110e1020 e0000000046e0000   à....n..
0xe0000000110e1028 a000000100009090 rest_init+0x30  
0xe0000000110e1030 00000186   ........
0xe0000000110e1038 a000000100a17200    ....¡r.
0xe0000000110e1040 a0000001006de230 __initcall_pdflush_init  
0xe0000000110e1048 00000000   ........
0xe0000000110e1050 a000000100818660 _GLOBAL_OFFSET_TABLE_+0x1460  
0xe0000000110e1058 a000000100818668 _GLOBAL_OFFSET_TABLE_+0x1468  
0xe0000000110e1060 a000000100818678 _GLOBAL_OFFSET_TABLE_+0x1478  
0xe0000000110e1068 a000000100818670 _GLOBAL_OFFSET_TABLE_+0x1470  
0xe0000000110e1070 a0000001006d38f0 initcall_debug  
0xe0000000110e1078 a0000001006de4f8 __con_initcall_start  
0xe0000000110e1080 a000000100015480 kernel_thread+0x100  
0xe0000000110e1088 00000389   ........
0xe0000000110e1090 a000000100a17200    ....¡r.
0xe0000000110e1098 a000000100009600 init+0x460  
0xe0000000110e10a0 0000058e   ........
0xe0000000110e10a8 00000000   ........
0xe0000000110e10b0 a0000001006a6bd0 pdflush_init+0x30  
0xe0000000110e10b8 00000183   ........
0xe0000000110e10c0 00000000   ........
0xe0000000110e10c8 a000000100682540 __kstrtab_csum_partial_copy_nocheck+0xd830  
0xe0000000110e10d0 00000000   ........
0xe0000000110e10d8 00000000   ........
0xe0000000110e10e0 e00000003c738000   à...<s..
0xe0000000110e10e8 a0000001000eb790 start_one_pdflush_thread+0x30  
0xe0000000110e10f0 00000186   ........
0xe0000000110e10f8 a000000100a17200    ....¡r.
0xe0000000110e1100 a00000010072f670 pdflush_list  
0xe0000000110e1108 e00000003c65fe18   à...<eþ.
0xe0000000110e1110 a00000010082d858 pdflush_lock  
0xe0000000110e1118 e00000003c65fe08   à...<eþ.
0xe0000000110e1120 e00000003c65fe20   à...<eþ 
0xe0000000110e1128 a00000010082b8a0 jiffies  
0xe0000000110e1130 e00000003c65fe28   à...<eþ(
0xe0000000110e1138 00004000   ......@.
0xe0000000110e1140 00000001   ........
0xe0000000110e1148 a00000010082d860 last_empty_jifs  
0xe0000000110e1150 e00000003c65fe10   à...<eþ.
0xe0000000110e1158 a00000010081cff8 _GLOBAL_OFFSET_TABLE_+0x5df8  
0xe0000000110e1160 a00000010082ba80 nr_pdflush_threads  
0xe0000000110e1168 00000400   ........
0xe0000000110e1170 a00000010081d000 _GLOBAL_OFFSET_TABLE_+0x5e00  
0xe0000000110e1178 a00000010072f678 pdflush_list+0x8  
0xe0000000110e1180 a000000100015480 kernel_thread+0x100  
0xe0000000110e1188 00000389   ........
0xe0000000110e1190 a000000100a17200    ....¡r.
0xe0000000110e1198 a0000001000ebb40 pdflush+0x380  
0xe0000000110e11a0 00000994   ........
0xe0000000110e11a8 e00000003c65fcd8   à...<eüØ
0xe0000000110e11b0 a000000100682540 __kstrtab_csum_partial_copy_nocheck+0xd830  

Repeat from __kstrtab_csum_partial_copy_nocheck+0xd830 until the stack
overflows.  That certainly explains why the backtrace is failing.  Now
the real question is why did the code loop?
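
For anyone checking other dumps the same way, here is a minimal sketch of the
comparison, assuming the rbs and stk ranges printed by unwind.init_frame_info
are half-open [start, end) intervals.  The struct and function names are made
up for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, end), as printed by unwind.init_frame_info. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Number of bytes covered by a range. */
static uint64_t range_size(struct range r)
{
	assert(r.end >= r.start);
	return r.end - r.start;
}

/*
 * The rbs grows upward from the bottom of the task area and the memory
 * stack grows downward from the top.  If the rbs ends where the stack
 * begins (or beyond), no free space is left between the two.
 */
static bool rbs_hits_stack(struct range rbs, struct range stk)
{
	return rbs.end >= stk.start;
}
```

With the values quoted above, the rbs covers 23512 bytes and the stack 5432
bytes, and the two ranges touch at 0xe0000000110e6ac8.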



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (10 preceding siblings ...)
  2004-02-27 13:58 ` Keith Owens
@ 2004-02-28  6:52 ` David Mosberger
  2004-02-28  9:39 ` David Mosberger
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: David Mosberger @ 2004-02-28  6:52 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Sat, 28 Feb 2004 00:58:20 +1100, Keith Owens <kaos@sgi.com> said:

  Keith> On Fri, 27 Feb 2004 11:16:03 +0100,
  Keith> Andreas Schwab <schwab@suse.de> wrote:
  >> pdflush[18140]: Oops 11012296146944 [1]

  >> Pid: 18140, CPU 1, comm:              pdflush
  >> psr : 0000121008026018 ifs : 8000000000000590 ip  : [<a00000010046e0d1>]    Not tainted
  >> ip is at nf_iterate+0x111/0x240
  >> unwind.init_frame_info:
  >> task   0xe0000000110e0000
  >> rbs = [0xe0000000110e0ef0-0xe0000000110e6ac8)
  >> stk = [0xe0000000110e6ac8-0xe0000000110e8000)
  >> pr     0x82aa6aa6a55596a7
  >> sw     0xe0000000110e6160
  >> sp     0xe0000000110e6ac8

  Keith> Ouch.  rbs and stack have collided, kernel stack overflow.  rbs shows
  Keith> a normal start, then it loops with the same data over and over again

So if I'm reading this right, we get a case that looks like unbounded
recursion:

	pdflush -> start_one_pdflush_thread -> kernel_thread -> pdflush ...

Except, I don't think this is real recursion.  Instead, we effectively
get a (potentially unbounded) sequence of one kernel thread creating
another thread.  Each new kernel thread gets nested one deeper,
eventually leading to a stack overflow...

Argh, this wasn't supposed to happen!  It's not entirely trivial to
fix.  Obviously we could try to modify copy_thread() so it resets the
stack to the top, but in doing so, we still must preserve the stack
frame of kernel_thread().  That wouldn't be a problem---if only we
knew how big that frame was!  (Well, OK, then there would also be RNaT
slots to worry about, but that could be handled by ensuring that the
new and old stacks are congruent in that regard).
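
To make the RNaT remark concrete, here is a small sketch of what "congruent"
means here, assuming the usual ia64 RSE layout in which every 64th 8-byte
slot of the backing store holds an RNaT collection.  The helper names are
illustrative only, not the kernel's own.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The RSE stores one RNaT collection word after every 63 registers:
 * it is the slot whose address bits 3..8 are all ones.
 */
static bool is_rnat_slot(uint64_t addr)
{
	return (addr & 0x1f8) == 0x1f8;
}

/*
 * Two backing-store addresses are congruent when they agree modulo
 * 512 bytes (64 slots of 8 bytes), so that RNaT slots fall at the
 * same relative positions in both backing stores.
 */
static bool rbs_congruent(uint64_t a, uint64_t b)
{
	return ((a ^ b) & 0x1ff) == 0;
}
```

In Keith's dump, for instance, the word at 0xe0000000110e0ff8 sits in such an
RNaT slot, so it is not a stacked register value.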

Hmmh, I think perhaps the right way to fix this is to use a separate
continuation function, which will then take care of doing the
child-specific actions.  Let me see if I can come up with something.

Oh, well, now I'm finding that this is of course exactly how Linus
changed the x86 code some 19 months ago (for other reasons though, it
seems):

  http://linux.bkbits.net:8080/linux-2.5/diffs/arch/i386/kernel/process.c@1.19.1.11

Say, Andreas, did you by chance have 3 disk drives in your Tiger?
Does it boot fine if you remove one or two of the disks?

	--david


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (11 preceding siblings ...)
  2004-02-28  6:52 ` David Mosberger
@ 2004-02-28  9:39 ` David Mosberger
  2004-02-28  9:45 ` Keith Owens
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: David Mosberger @ 2004-02-28  9:39 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Fri, 27 Feb 2004 22:52:46 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:

  David> Hmmh, I think perhaps the right way to fix this is to use a separate
  David> continuation function, which will then take care of doing the
  David> child-specific actions.  Let me see if I can come up with something.

OK, how about the attached patch?  Does it fix the problem for you,
Andreas?

	--david

===== arch/ia64/kernel/head.S 1.16 vs edited =====
--- 1.16/arch/ia64/kernel/head.S	Wed Dec 10 17:28:59 2003
+++ edited/arch/ia64/kernel/head.S	Sat Feb 28 00:40:31 2004
@@ -816,6 +816,19 @@
 	br.ret.sptk.many rp
 END(ia64_delay_loop)
 
+GLOBAL_ENTRY(ia64_invoke_kernel_thread_helper)
+	.prologue
+	.save rp, r0		// this is the end of the call-chain
+	.body
+	alloc r2 = ar.pfs, 0, 0, 2, 0
+	mov out0 = r9
+	mov out1 = r11;;
+	br.call.sptk.many rp = kernel_thread_helper;;
+	mov out0 = r8
+	br.call.sptk.many rp = sys_exit;;
+1:	br.sptk.few 1b				// not reached
+END(ia64_invoke_kernel_thread_helper)
+
 #ifdef CONFIG_IA64_BRL_EMU
 
 /*
===== arch/ia64/kernel/process.c 1.51 vs edited =====
--- 1.51/arch/ia64/kernel/process.c	Thu Jan  8 17:52:52 2004
+++ edited/arch/ia64/kernel/process.c	Sat Feb 28 01:19:59 2004
@@ -259,10 +259,12 @@
  *
  * We get here through the following  call chain:
  *
- *	<clone syscall>
- *	sys_clone
- *	do_fork
- *	copy_thread
+ *	from user-level:	from kernel:
+ *
+ *	<clone syscall>	        <some kernel call frames>
+ *	sys_clone		   :
+ *	do_fork			do_fork
+ *	copy_thread		copy_thread
  *
  * This means that the stack layout is as follows:
  *
@@ -276,9 +278,6 @@
  *	|                     | <-- sp (lowest addr)
  *	+---------------------+
  *
- * Note: if we get called through kernel_thread() then the memory above "(highest addr)"
- * is valid kernel stack memory that needs to be copied as well.
- *
  * Observe that we copy the unat values that are in pt_regs and switch_stack.  Spilling an
  * integer to address X causes bit N in ar.unat to be set to the NaT bit of the register,
  * with N=(X & 0x1ff)/8.  Thus, copying the unat value preserves the NaT bits ONLY if the
@@ -291,9 +290,9 @@
 	     unsigned long user_stack_base, unsigned long user_stack_size,
 	     struct task_struct *p, struct pt_regs *regs)
 {
-	unsigned long rbs, child_rbs, rbs_size, stack_offset, stack_top, stack_used;
-	struct switch_stack *child_stack, *stack;
 	extern char ia64_ret_from_clone, ia32_ret_from_clone;
+	struct switch_stack *child_stack, *stack;
+	unsigned long rbs, child_rbs, rbs_size;
 	struct pt_regs *child_ptregs;
 	int retval = 0;
 
@@ -306,16 +305,13 @@
 		return 0;
 #endif
 
-	stack_top = (unsigned long) current + IA64_STK_OFFSET;
 	stack = ((struct switch_stack *) regs) - 1;
-	stack_used = stack_top - (unsigned long) stack;
-	stack_offset = IA64_STK_OFFSET - stack_used;
 
-	child_stack = (struct switch_stack *) ((unsigned long) p + stack_offset);
-	child_ptregs = (struct pt_regs *) (child_stack + 1);
+	child_ptregs = (struct pt_regs *) ((unsigned long) p + IA64_STK_OFFSET) - 1;
+	child_stack = (struct switch_stack *) child_ptregs - 1;
 
 	/* copy parent's switch_stack & pt_regs to child: */
-	memcpy(child_stack, stack, stack_used);
+	memcpy(child_stack, stack, sizeof(*child_ptregs) + sizeof(*child_stack));
 
 	rbs = (unsigned long) current + IA64_RBS_OFFSET;
 	child_rbs = (unsigned long) p + IA64_RBS_OFFSET;
@@ -324,7 +320,7 @@
 	/* copy the parent's register backing store to the child: */
 	memcpy((void *) child_rbs, (void *) rbs, rbs_size);
 
-	if (user_mode(child_ptregs)) {
+	if (likely(user_mode(child_ptregs))) {
 		if ((clone_flags & CLONE_SETTLS) && !IS_IA32_PROCESS(regs))
 			child_ptregs->r13 = regs->r16;	/* see sys_clone2() in entry.S */
 		if (user_stack_base) {
@@ -341,14 +337,14 @@
 		 * been taken care of by the caller of sys_clone()
 		 * already.
 		 */
-		child_ptregs->r12 = (unsigned long) (child_ptregs + 1); /* kernel sp */
+		child_ptregs->r12 = (unsigned long) child_ptregs - 16; /* kernel sp */
 		child_ptregs->r13 = (unsigned long) p;		/* set `current' pointer */
 	}
+	child_stack->ar_bspstore = child_rbs + rbs_size;
 	if (IS_IA32_PROCESS(regs))
 		child_stack->b0 = (unsigned long) &ia32_ret_from_clone;
 	else
 		child_stack->b0 = (unsigned long) &ia64_ret_from_clone;
-	child_stack->ar_bspstore = child_rbs + rbs_size;
 
 	/* copy parts of thread_struct: */
 	p->thread.ksp = (unsigned long) child_stack - 16;
@@ -358,8 +354,8 @@
 	 * therefore we must specify them explicitly here and not include them in
 	 * IA64_PSR_BITS_TO_CLEAR.
 	 */
-	child_ptregs->cr_ipsr =  ((child_ptregs->cr_ipsr | IA64_PSR_BITS_TO_SET)
-			      & ~(IA64_PSR_BITS_TO_CLEAR | IA64_PSR_PP | IA64_PSR_UP));
+	child_ptregs->cr_ipsr = ((child_ptregs->cr_ipsr | IA64_PSR_BITS_TO_SET)
+				 & ~(IA64_PSR_BITS_TO_CLEAR | IA64_PSR_PP | IA64_PSR_UP));
 
 	/*
 	 * NOTE: The calling convention considers all floating point
@@ -578,27 +574,43 @@
 pid_t
 kernel_thread (int (*fn)(void *), void *arg, unsigned long flags)
 {
-	struct task_struct *parent = current;
-	int result; 
-	pid_t tid;
+	extern void ia64_invoke_kernel_thread_helper (void);
+	unsigned long *helper_fptr = (unsigned long *) &ia64_invoke_kernel_thread_helper;
+	struct {
+		struct switch_stack sw;
+		struct pt_regs pt;
+	} regs;
+
+	memset(&regs, 0, sizeof(regs));
+	regs.pt.cr_iip = helper_fptr[0];	/* set entry point (IP) */
+	regs.pt.r1 = helper_fptr[1];		/* set GP */
+	regs.pt.r9 = (unsigned long) fn;	/* 1st argument */
+	regs.pt.r11 = (unsigned long) arg;	/* 2nd argument */
+	/* Preserve PSR bits, except for bits 32-34 and 37-45, which we can't read.  */
+	regs.pt.cr_ipsr = ia64_getreg(_IA64_REG_PSR) | IA64_PSR_BN;
+	regs.pt.cr_ifs = 1UL << 63;		/* mark as valid, empty frame */
+	regs.sw.ar_fpsr = regs.pt.ar_fpsr = ia64_getreg(_IA64_REG_AR_FPSR);
+	regs.sw.ar_bspstore = (unsigned long) current + IA64_RBS_OFFSET;
+
+	return do_fork(flags | CLONE_VM | CLONE_UNTRACED, 0, &regs.pt, 0, NULL, NULL);
+}
+EXPORT_SYMBOL(kernel_thread);
 
-	tid = clone(flags | CLONE_VM | CLONE_UNTRACED, 0);
-	if (parent != current) {
+/* This gets called from kernel_thread() via ia64_invoke_kernel_thread_helper().  */
+int
+kernel_thread_helper (int (*fn)(void *), void *arg)
+{
 #ifdef CONFIG_IA32_SUPPORT
-		if (IS_IA32_PROCESS(ia64_task_regs(current))) {
-			/* A kernel thread is always a 64-bit process. */
-			current->thread.map_base  = DEFAULT_MAP_BASE;
-			current->thread.task_size = DEFAULT_TASK_SIZE;
-			ia64_set_kr(IA64_KR_IO_BASE, current->thread.old_iob);
-			ia64_set_kr(IA64_KR_TSSD, current->thread.old_k1);
-		}
-#endif
-		result = (*fn)(arg);
-		_exit(result);
+	if (IS_IA32_PROCESS(ia64_task_regs(current))) {
+		/* A kernel thread is always a 64-bit process. */
+		current->thread.map_base  = DEFAULT_MAP_BASE;
+		current->thread.task_size = DEFAULT_TASK_SIZE;
+		ia64_set_kr(IA64_KR_IO_BASE, current->thread.old_iob);
+		ia64_set_kr(IA64_KR_TSSD, current->thread.old_k1);
 	}
-	return tid;
+#endif
+	return (*fn)(arg);
 }
-EXPORT_SYMBOL(kernel_thread);
 
 /*
  * Flush thread state.  This is called when a thread does an execve().
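
The effect of the copy_thread() change can be illustrated with a standalone
sketch.  The struct sizes below are placeholders (the real switch_stack and
pt_regs are much larger), STK_SIZE stands in for IA64_STK_OFFSET and is
assumed to be 32KB, and the function names are invented for this example.

```c
#include <stdint.h>

#define STK_SIZE 32768u	/* stands in for IA64_STK_OFFSET (assumed 32KB) */

/* Placeholder layouts; only their sizes matter for this sketch. */
struct demo_switch_stack { uint64_t regs[8]; };
struct demo_pt_regs { uint64_t regs[16]; };

/*
 * Old scheme: the child's switch_stack is placed at the same offset as
 * the parent's, so the child starts with everything the parent had in use.
 */
static uint32_t child_stack_offset_old(uint32_t parent_stack_used)
{
	return STK_SIZE - parent_stack_used;
}

/*
 * New scheme: pt_regs goes at the very top of the child's stack and
 * switch_stack directly below it, whatever the parent's depth was.
 */
static uint32_t child_stack_offset_new(void)
{
	return STK_SIZE - (uint32_t)(sizeof(struct demo_pt_regs) +
				     sizeof(struct demo_switch_stack));
}
```

With the old placement the offset shrinks as the parent's stack usage grows;
with the new one it is a constant.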


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (12 preceding siblings ...)
  2004-02-28  9:39 ` David Mosberger
@ 2004-02-28  9:45 ` Keith Owens
  2004-02-28 10:00 ` Keith Owens
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-28  9:45 UTC (permalink / raw)
  To: linux-ia64

On Fri, 27 Feb 2004 22:52:46 -0800, 
David Mosberger <davidm@napali.hpl.hp.com> wrote:
>>>>>> On Sat, 28 Feb 2004 00:58:20 +1100, Keith Owens <kaos@sgi.com> said:
>  Keith> Ouch.  rbs and stack have collided, kernel stack overflow.  rbs shows
>  Keith> a normal start, then it loops with the same data over and over again
>
>So if I'm reading this right, we get a case that looks like unbounded
>recursion:
>
>	pdflush -> start_one_pdflush_thread -> kernel_thread -> pdflush ...
>
>Except, I don't think this is real recursion.  Instead, we effectively
>get a (potentially unbounded) sequence of one kernel thread creating
>another thread.  Each new kernel thread gets nested one deeper,
>eventually leading to a stack overflow...
>
>Hmmh, I think perhaps the right way to fix this is to use a separate
>continuation function, which will then take care of doing the
>child-specific actions.  Let me see if I can come up with something.

Separate the pdflush thread creation and move it to a single master
thread.  This restricts the maximum stack depth already in use when
starting a worker pdflush thread.

--- 2.6.3-pristine/mm/pdflush.c	Thu Dec 18 14:00:02 2003
+++ 2.6.3-pdflush/mm/pdflush.c	Sat Feb 28 20:42:04 2004
@@ -5,6 +5,9 @@
  *
  * 09Apr2002	akpm@zip.com.au
  *		Initial version
+ * 28Feb2004	kaos@sgi.com
+ *		Move worker thread creation to a master thread to avoid chewing
+ *		up stack space with nested calls to kernel_thread.
  */
 
 #include <linux/sched.h>
@@ -18,6 +21,7 @@
 #include <linux/fs.h>		// Needed by writeback.h
 #include <linux/writeback.h>	// Prototypes pdflush_operation()
 
+#include <asm/semaphore.h>
 
 /*
  * Minimum and maximum number of pdflush instances
@@ -58,6 +62,11 @@ int nr_pdflush_threads = 0;
 static unsigned long last_empty_jifs;
 
 /*
+ * up() this to start a new pdflush thread.
+ */
+static struct semaphore new_pdflush;
+
+/*
  * The pdflush thread.
  *
  * Thread pool management algorithm:
@@ -207,13 +216,31 @@ int pdflush_operation(void (*fn)(unsigne
 
 static void start_one_pdflush_thread(void)
 {
-	kernel_thread(pdflush, NULL, CLONE_KERNEL);
+	up(&new_pdflush);
+}
+
+/*
+ * Create all pdflush worker threads from a single master thread.  Creating
+ * worker threads from inside worker threads chews up kernel stack space and
+ * eventually overflows the kernel stack.
+ */
+static int pdflush_master(void *dummy)
+{
+	daemonize("pdflush_master");
+	while (1) {
+		if (down_interruptible(&new_pdflush))
+			continue;
+		kernel_thread(pdflush, NULL, CLONE_KERNEL);
+	}
+	return 0;
 }
 
 static int __init pdflush_init(void)
 {
 	int i;
 
+	kernel_thread(pdflush_master, NULL, CLONE_KERNEL);
+
 	for (i = 0; i < MIN_PDFLUSH_THREADS; i++)
 		start_one_pdflush_thread();
 	return 0;

===========================

This is what the ia64 stack for a pdflush worker thread looks like now.
It has used 560 bytes of stack from creation to sleep.

0xe00000003db08000       16        1  0    2   S  0xe00000003db08570  pdflush
0xa0000001000830f0 schedule+0xf30
0xa0000001000dab70 __pdflush+0x230
0xa0000001000daf00 pdflush+0x20
0xa000000100016bc0 kernel_thread+0x100
0xa0000001000db250 pdflush_master+0xb0
0xa000000100016bc0 kernel_thread+0x100
0xa000000100650670 pdflush_init+0x30
0xa000000100641100 do_initcalls+0xc0
0xa000000100009300 init+0xe0
0xa000000100016bc0 kernel_thread+0x100
0xa000000100009090 rest_init+0x30
0xa000000100640f80 start_kernel+0x460
0xa0000001000085a0 _start+0x280

It would be nice if kernel_thread reset the stack every time it was
called, but that requires arch specific helper code.  Until that is
available for every arch, avoid recursive calls to kernel_thread.



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (13 preceding siblings ...)
  2004-02-28  9:45 ` Keith Owens
@ 2004-02-28 10:00 ` Keith Owens
  2004-02-28 10:20 ` David Mosberger
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-28 10:00 UTC (permalink / raw)
  To: linux-ia64

On Sat, 28 Feb 2004 20:45:38 +1100, 
Keith Owens <kaos@sgi.com> wrote:
>This is what the ia64 stack for a pdflush worker thread looks like now.
>It has used 560 bytes of stack from creation to sleep.
>
>0xe00000003db08000       16        1  0    2   S  0xe00000003db08570  pdflush
>0xa0000001000830f0 schedule+0xf30
>0xa0000001000dab70 __pdflush+0x230
>0xa0000001000daf00 pdflush+0x20
>0xa000000100016bc0 kernel_thread+0x100
>0xa0000001000db250 pdflush_master+0xb0
>0xa000000100016bc0 kernel_thread+0x100
>0xa000000100650670 pdflush_init+0x30
>0xa000000100641100 do_initcalls+0xc0
>0xa000000100009300 init+0xe0
>0xa000000100016bc0 kernel_thread+0x100
>0xa000000100009090 rest_init+0x30
>0xa000000100640f80 start_kernel+0x460
>0xa0000001000085a0 _start+0x280

Without DavidM's patch to add ia64_invoke_kernel_thread_helper, pdflush
starts with 560 bytes of stack and 744 bytes of rbs.  With
ia64_invoke_kernel_thread_helper, that reduces to 554 bytes of stack
and 272 bytes of rbs.  Backtrace with ia64_invoke_kernel_thread_helper.

0xa000000100083290 schedule+0xf30
0xa0000001000dad10 __pdflush+0x230
0xa0000001000db0a0 pdflush+0x20
0xa000000100016d70 kernel_thread_helper+0xd0
0xa000000100009040 ia64_invoke_kernel_thread_helper+0x20

We need both ia64_invoke_kernel_thread_helper and my patch to pdflush.
Until all architectures have a kernel_thread helper, nested calls to
kernel_thread will chew up stack.



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (14 preceding siblings ...)
  2004-02-28 10:00 ` Keith Owens
@ 2004-02-28 10:20 ` David Mosberger
  2004-02-28 10:23 ` Andrew Morton
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: David Mosberger @ 2004-02-28 10:20 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Sat, 28 Feb 2004 21:00:54 +1100, Keith Owens <kaos@sgi.com> said:

  Keith> Without DavidM's patch to add
  Keith> ia64_invoke_kernel_thread_helper, pdflush starts with 560
  Keith> bytes of stack and 744 bytes of rbs.  With
  Keith> ia64_invoke_kernel_thread_helper, that reduces to 554 bytes
  Keith> of stack and 272 bytes of rbs.  Backtrace with
  Keith> ia64_invoke_kernel_thread_helper.

Cool.  Thanks for reporting that and for providing the stack-dump (thanks
to Andreas, too, of course).

  Keith> We need both ia64_invoke_kernel_thread_helper and my patch to pdflush.
  Keith> Until all architectures have a kernel_thread helper, nested calls to
  Keith> kernel_thread will chew up stack.

Yes, I think so, too.  Andrew?

	--david


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (15 preceding siblings ...)
  2004-02-28 10:20 ` David Mosberger
@ 2004-02-28 10:23 ` Andrew Morton
  2004-02-28 12:00 ` Andrew Morton
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andrew Morton @ 2004-02-28 10:23 UTC (permalink / raw)
  To: linux-ia64

Keith Owens <kaos@sgi.com> wrote:
>
>  >So if I'm reading this right, we get a case that looks like unbounded
>  >recursion:
>  >
>  >	pdflush -> start_one_pdflush_thread -> kernel_thread -> pdflush ...
>  >

Yes.  Ow.  Thanks.

>  >Except, I don't think this is real recursion.  Instead, we effectively
>  >get a (potentially unbounded) sequence of one kernel thread creating
>  >another thread.  Each new kernel thread gets nested one deeper,
>  >eventually leading to a stack overflow...
>  >
>  >Hmmh, I think perhaps the right way to fix this is to use a separate
>  >continuation function, which will then take care of doing the
>  >child-specific actions.  Let me see if I can come up with something.
> 
>  Separate the pdflush thread creation and move it to a single master
>  thread.  This restricts the maximum stack depth already in use when
>  starting a worker pdflush thread.

We should use the new kthread infrastructure rather than open-coding it. 
It delegates thread startup to keventd and should thus avoid the stack
windup.

I'll take a look at this over the weekend unless someone else tells me
they're doing it.



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (16 preceding siblings ...)
  2004-02-28 10:23 ` Andrew Morton
@ 2004-02-28 12:00 ` Andrew Morton
  2004-02-28 14:47 ` Keith Owens
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andrew Morton @ 2004-02-28 12:00 UTC (permalink / raw)
  To: linux-ia64

Andrew Morton <akpm@osdl.org> wrote:
>
> Keith Owens <kaos@sgi.com> wrote:
> >
> >  >So if I'm reading this right, we get a case that looks like unbounded
> >  >recursion:
> >  >
> >  >	pdflush -> start_one_pdflush_thread -> kernel_thread -> pdflush ...
> >  >
> 
> Yes.  Ow.  Thanks.

Having just looked at the code, I don't understand the problem.

If kernel thread A starts kernel thread B and kernel thread B starts
kernel thread C and so on, how does that cause stack windup?



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (17 preceding siblings ...)
  2004-02-28 12:00 ` Andrew Morton
@ 2004-02-28 14:47 ` Keith Owens
  2004-02-28 14:55 ` Andreas Schwab
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-28 14:47 UTC (permalink / raw)
  To: linux-ia64

On Sat, 28 Feb 2004 04:00:34 -0800, 
Andrew Morton <akpm@osdl.org> wrote:
>Having just looked at the code, I don't understand the problem.
>
>If kernel thread A starts kernel thread B and kernel thread B starts
>kernel thread C and so on, how does that cause stack windup?

Backtrace from a pdflush task on standard 2.6.3 ia64.

schedule+0xf30
__pdflush+0x230
pdflush+0x20
kernel_thread+0x100
pdflush_init+0x30
do_initcalls+0xc0
init+0xe0
kernel_thread+0x100
rest_init+0x30
start_kernel+0x460
_start+0x280

Each use of kernel_thread results in a cloned stack which inherits the
stack usage from the previous thread.  Some architectures have a
kernel_thread helper which resets the stack, others do not.  Without a
helper, each call whittles away at the stack.
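
As a rough model of that whittling, the sketch below counts how many nested
thread creations fit before the inherited usage exhausts the space shared by
the memory stack and the rbs.  The per-generation costs are inputs; the
figures in the note after the code are assumptions chosen for illustration,
not measurements of any particular kernel, and the function name is made up.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Count how many generations of "thread creates thread" fit before the
 * memory stack and the register backing store together need more than
 * the available space.  Each generation is assumed to inherit all of its
 * parent's usage and to add a fixed amount on top.
 */
static unsigned generations_until_overflow(uint32_t avail,
					   uint32_t stack_per_gen,
					   uint32_t rbs_per_gen)
{
	uint32_t per_gen = stack_per_gen + rbs_per_gen;
	uint32_t used = 0;
	unsigned gen = 0;

	assert(per_gen > 0);	/* zero growth would never overflow */
	while (used + per_gen <= avail) {
		used += per_gen;
		gen++;
	}
	return gen;
}
```

For example, with 28944 bytes available (0x8000 - 0xef0, taken from the task
layout in the dumps) and an assumed cost of 560 bytes of stack plus 568 bytes
of rbs per generation, the model allows 25 generations.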



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (18 preceding siblings ...)
  2004-02-28 14:47 ` Keith Owens
@ 2004-02-28 14:55 ` Andreas Schwab
  2004-02-28 18:26 ` David Mosberger
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-02-28 14:55 UTC (permalink / raw)
  To: linux-ia64

David Mosberger <davidm@napali.hpl.hp.com> writes:

> Say, Andreas, did you by chance have 3 disk drives in your Tiger?

No, only one.

Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (19 preceding siblings ...)
  2004-02-28 14:55 ` Andreas Schwab
@ 2004-02-28 18:26 ` David Mosberger
  2004-02-28 23:59 ` Keith Owens
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: David Mosberger @ 2004-02-28 18:26 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Sat, 28 Feb 2004 15:55:37 +0100, Andreas Schwab <schwab@suse.de> said:

  Andreas> David Mosberger <davidm@napali.hpl.hp.com> writes:

  >> Say, Andreas, did you by chance have 3 disk drives in your Tiger?

  Andreas> No, only one.

Hmmh, kind of throws that theory out of the water...

I still don't understand why the problem triggered only on certain
machines.

	--david


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (20 preceding siblings ...)
  2004-02-28 18:26 ` David Mosberger
@ 2004-02-28 23:59 ` Keith Owens
  2004-02-29  3:44 ` Keith Owens
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-28 23:59 UTC (permalink / raw)
  To: linux-ia64

On Sat, 28 Feb 2004 10:26:18 -0800, 
David Mosberger <davidm@napali.hpl.hp.com> wrote:
>>>>>> On Sat, 28 Feb 2004 15:55:37 +0100, Andreas Schwab <schwab@suse.de> said:
>
>  Andreas> David Mosberger <davidm@napali.hpl.hp.com> writes:
>
>  >> Say, Andreas, did you by chance have 3 disk drives in your Tiger?
>
>  Andreas> No, only one.
>
>Hmmh, kind of throws that theory out of the water...
>
>I still don't understand why the problem triggered only on certain
>machines.

pdflush threads are per filesystem, independent of the number of
physical disks.  When you have concurrent heavy I/O load against more
than MIN_PDFLUSH_THREADS filesystems, pdflush will detect that all its
worker tasks are in use and will fork new tasks.
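
A simplified model of that decision, written from memory rather than copied
from mm/pdflush.c: a worker starts another worker when no worker has been
idle for more than a second and the pool is below its maximum.  DEMO_HZ,
DEMO_MAX_THREADS and the function name are assumptions made for this sketch.

```c
#include <stdbool.h>

#define DEMO_MAX_THREADS 8	/* assumed upper bound on the pool size */
#define DEMO_HZ 1000		/* assumed timer ticks per second */

/*
 * Decide whether a worker should start another worker: the idle list was
 * last seen empty more than one second ago, it is still empty now, and
 * the pool has not reached its maximum size.
 */
static bool should_start_worker(unsigned long now, unsigned long last_empty,
				unsigned idle_workers, unsigned nr_workers)
{
	if (now - last_empty <= 1UL * DEMO_HZ)
		return false;
	return idle_workers == 0 && nr_workers < DEMO_MAX_THREADS;
}
```

Because the check runs inside a worker, every thread started this way is a
child of an existing worker, which is where the nesting comes from.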



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (21 preceding siblings ...)
  2004-02-28 23:59 ` Keith Owens
@ 2004-02-29  3:44 ` Keith Owens
  2004-02-29  5:27 ` Andrew Morton
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Keith Owens @ 2004-02-29  3:44 UTC (permalink / raw)
  To: linux-ia64

On Sat, 28 Feb 2004 02:23:23 -0800, 
Andrew Morton <akpm@osdl.org> wrote:
>We should use the new kthread infrastructure rather than open-coding it. 
>It delegates thread startup to keventd and should thus avoid the stack
>windup.

Convert pdflush to kthread to avoid stack windup.

Index: 4-rc1.1/mm/pdflush.c
--- 4-rc1.1/mm/pdflush.c Thu, 18 Dec 2003 16:46:13 +1100 kaos (linux-2.6/20_pdflush.c 1.1 644)
+++ 4-rc1.1(w)/mm/pdflush.c Sun, 29 Feb 2004 14:28:02 +1100 kaos (linux-2.6/20_pdflush.c 1.1 644)
@@ -5,6 +5,9 @@
  *
  * 09Apr2002	akpm@zip.com.au
  *		Initial version
+ * 29Feb2004	kaos@sgi.com
+ *		Move worker thread creation to kthread to avoid chewing
+ *		up stack space with nested calls to kernel_thread.
  */
 
 #include <linux/sched.h>
@@ -17,6 +20,7 @@
 #include <linux/suspend.h>
 #include <linux/fs.h>		// Needed by writeback.h
 #include <linux/writeback.h>	// Prototypes pdflush_operation()
+#include <linux/kthread.h>
 
 
 /*
@@ -207,7 +211,7 @@ int pdflush_operation(void (*fn)(unsigne
 
 static void start_one_pdflush_thread(void)
 {
-	kernel_thread(pdflush, NULL, CLONE_KERNEL);
+	kthread_run(pdflush, NULL, "pdflush");
 }
 
 static int __init pdflush_init(void)


Rusty, does pdflush() still need to call daemonize() or does kthread
make that redundant?


pdflush backtrace using kthread, without ia64 kernel_thread_helper.
This has got worse: 2.6.3 using kernel_thread used 560 bytes of stack
and 744 bytes of rbs, whereas 2.6.4-rc1 with kthread uses 1120 bytes of
stack and 1312 bytes of rbs.

See http://marc.theaimsgroup.com/?l=linux-ia64&m=107796262112591&w=2
for 2.6.3 backtraces.

0xa000000100083650 schedule+0xf30
        sp 0xe00000003dabfba0 bsp 0xe00000003dab9440 cfm 0x0000000000000f26
0xa0000001000dc730 __pdflush+0x230
0xa0000001000dcac0 pdflush+0x20
0xa0000001000c0720 kthread+0x200
0xa000000100016bc0 kernel_thread+0x100
0xa0000001000c0770 keventd_create_kthread+0x30
0xa0000001000b7b10 worker_thread+0x450
0xa0000001000c0720 kthread+0x200
0xa000000100016bc0 kernel_thread+0x100
0xa0000001000c0770 keventd_create_kthread+0x30
0xa0000001000c0a00 kthread_create+0x200
0xa0000001000b80c0 create_workqueue_thread+0x100
0xa0000001000b82e0 create_workqueue+0x1a0
0xa0000001000b89a0 init_workqueues+0x20
0xa000000100645290 do_basic_setup+0x50
0xa000000100009300 init+0xe0
0xa000000100016bc0 kernel_thread+0x100
0xa000000100009090 rest_init+0x30
0xa000000100644fc0 start_kernel+0x4a0
0xa0000001000085a0 _start+0x280
        sp 0xe00000003dabfe30 bsp 0xe00000003dab8f20 cfm 0x0000000000000794

pdflush backtrace using kthread, with ia64 kernel_thread_helper.  This
has also got worse: 2.6.3 using kernel_thread and helper used 554
bytes of stack and 272 bytes of rbs, whereas 2.6.4-rc1 with kthread and
helper uses 640 bytes of stack and 376 bytes of rbs.

0xa0000001000837f0 schedule+0xf30
        sp 0xe00000003dabfd80 bsp 0xe00000003dab9098 cfm 0x0000000000000f26
0xa0000001000dc8d0 __pdflush+0x230
0xa0000001000dcc60 pdflush+0x20
0xa0000001000c08c0 kthread+0x200
0xa000000100016d70 kernel_thread_helper+0xd0
0xa000000100009040 ia64_invoke_kernel_thread_helper+0x20
        sp 0xe00000003dabfe30 bsp 0xe00000003dab8f20 cfm 0x0000000000000002

This says two things: every architecture should have a kernel_thread
helper, and using kthread adds some stack overhead, even with a helper.
The latter is acceptable if it makes the code more reliable.
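
For anyone wanting to check the 2.6.4-rc1 figures against the sp/bsp values in the two traces, the arithmetic can be sketched as below.  One value is an assumption: STACK_TOP is not printed in either trace, it is inferred as the address that reproduces the quoted stack sizes.  RBS_START is the bsp of the outermost frame, which is the same in both traces.  The helper names are made up for this example.

```c
#include <stdint.h>

/* Assumed top of the task's memory stack; inferred, not shown in the traces. */
#define STACK_TOP	UINT64_C(0xe00000003dac0000)
/* bsp of the outermost frame, identical in both traces. */
#define RBS_START	UINT64_C(0xe00000003dab8f20)

/* The memory stack grows down from STACK_TOP towards sp. */
static uint64_t stack_used(uint64_t sp)
{
	return STACK_TOP - sp;
}

/* The register backing store grows up from RBS_START towards bsp. */
static uint64_t rbs_used(uint64_t bsp)
{
	return bsp - RBS_START;
}
```

Feeding in the innermost frame of the trace without the helper (sp 0xe00000003dabfba0, bsp 0xe00000003dab9440) gives 1120 and 1312 bytes; the trace with the helper (sp 0xe00000003dabfd80, bsp 0xe00000003dab9098) gives 640 and 376 bytes.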



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (22 preceding siblings ...)
  2004-02-29  3:44 ` Keith Owens
@ 2004-02-29  5:27 ` Andrew Morton
  2004-03-01 10:34 ` Andreas Schwab
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andrew Morton @ 2004-02-29  5:27 UTC (permalink / raw)
  To: linux-ia64

Keith Owens <kaos@sgi.com> wrote:
>
> On Sat, 28 Feb 2004 02:23:23 -0800, 
> Andrew Morton <akpm@osdl.org> wrote:
> >We should use the new kthread infrastructure rather than open-coding it. 
> >It delegates thread startup to keventd and should thus avoid the stack
> >windup.
> 
> Convert pdflush to kthread to avoid stack windup.

Thanks Keith.  Tricky patch ;)

> Rusty, does pdflush() still need to call daemonize() or does kthread
> make that redundant?

It's redundant - threads launched by kthread have a genuine kernel thread
as a parent and hence do not need to perform all that "disassociate me from
my userspace parent" stuff.



* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (23 preceding siblings ...)
  2004-02-29  5:27 ` Andrew Morton
@ 2004-03-01 10:34 ` Andreas Schwab
  2004-03-01 19:46 ` David Mosberger
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 29+ messages in thread
From: Andreas Schwab @ 2004-03-01 10:34 UTC (permalink / raw)
  To: linux-ia64

David Mosberger <davidm@napali.hpl.hp.com> writes:

>>>>>> On Fri, 27 Feb 2004 22:52:46 -0800, David Mosberger <davidm@linux.hpl.hp.com> said:
>
>   David> Hmmh, I think perhaps the right way to fix this is to use a separate
>   David> continuation function, which will then take care of doing the
>   David> child-specific actions.  Let me see if I can come up with something.
>
> OK, how about the attached patch?  Does it fix the problem for you,
> Andreas?

The system has been under moderate load for a couple of days without any
problems, so it seems to be fixed.

Thanks, Andreas.

-- 
Andreas Schwab, SuSE Labs, schwab@suse.de
SuSE Linux AG, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (24 preceding siblings ...)
  2004-03-01 10:34 ` Andreas Schwab
@ 2004-03-01 19:46 ` David Mosberger
  2006-09-06 13:39 ` D.N.Jagannathan
  2006-09-06 17:44 ` Chen, Kenneth W
  27 siblings, 0 replies; 29+ messages in thread
From: David Mosberger @ 2004-03-01 19:46 UTC (permalink / raw)
  To: linux-ia64

>>>>> On Mon, 01 Mar 2004 11:34:33 +0100, Andreas Schwab <schwab@suse.de> said:

  Andreas> The system has been under moderate load for a couple of
  Andreas> days without any problems, so it seems to be fixed.

Cool!

	--david


* Re: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (25 preceding siblings ...)
  2004-03-01 19:46 ` David Mosberger
@ 2006-09-06 13:39 ` D.N.Jagannathan
  2006-09-06 17:44 ` Chen, Kenneth W
  27 siblings, 0 replies; 29+ messages in thread
From: D.N.Jagannathan @ 2006-09-06 13:39 UTC (permalink / raw)
  To: linux-ia64

Can anybody explain what the pdflush process running in Linux is, and
why it is important?


* RE: Oops in pdflush
  2004-02-20 13:34 Oops in pdflush Andreas Schwab
                   ` (26 preceding siblings ...)
  2006-09-06 13:39 ` D.N.Jagannathan
@ 2006-09-06 17:44 ` Chen, Kenneth W
  27 siblings, 0 replies; 29+ messages in thread
From: Chen, Kenneth W @ 2006-09-06 17:44 UTC (permalink / raw)
  To: linux-ia64

D.N.Jagannathan wrote on Wednesday, September 06, 2006 6:27 AM
> can anybody explain what is and the importance of pdflush process 
> running in Linux.

pdflush is a kernel thread that periodically flushes dirty pages in
the page cache in order to ensure that in-memory data is consistent
with disk storage.

- Ken

