From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 23 May 2013 00:57:20 -0400 (EDT)
From: CAI Qian
To: linux-s390, linuxppc-dev@lists.ozlabs.org
Cc: LKML, Steve Best, xfs@oss.sgi.com, stable@vger.kernel.org, Hendrik Brueckner
Message-ID: <1125086079.5019070.1369285040855.JavaMail.root@redhat.com>
In-Reply-To: <20130523034611.GX24543@dastard>
References: <40971621.4497871.1369211701112.JavaMail.root@redhat.com> <1805266998.4499261.1369211998387.JavaMail.root@redhat.com> <20130522095300.GK29466@dastard> <1483868349.4996990.1369279016162.JavaMail.root@redhat.com> <20130523034611.GX24543@dastard>
Subject: 3.9.2/3.9.3: stack overrun on s390x and ppc64 (WAS Re: 3.9.2: xfstests triggered panic)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Sender: xfs-bounces@oss.sgi.com

Original report:
http://oss.sgi.com/archives/xfs/2013-05/msg00683.html

Also seen on Power7:
http://marc.info/?l=linux-kernel&m=136927904900692&w=2

CAI Qian

----- Original Message -----
> From: "Dave Chinner"
> To: "CAI Qian"
> Cc: "LKML", stable@vger.kernel.org, xfs@oss.sgi.com
> Sent: Thursday, May 23, 2013 11:46:11 AM
> Subject: Re: 3.9.2: xfstests triggered panic
>
> On Wed, May 22, 2013 at 11:16:56PM -0400, CAI Qian wrote:
> > ----- Original Message -----
> > > From: "Dave Chinner"
> > > To: "CAI Qian"
> > > Cc: "LKML", stable@vger.kernel.org, xfs@oss.sgi.com
> > > Sent: Wednesday, May 22, 2013 5:53:00 PM
> > > Subject: Re: 3.9.2: xfstests triggered panic
> > >
> > > On Wed, May 22, 2013 at 04:39:58AM -0400, CAI Qian wrote:
> > > > Reproduced on almost all s390x guests by running xfstests.
> > > >
> > > > [14634.396658] XFS (dm-1): Mounting Filesystem
> > > > [14634.525522] XFS (dm-1): Ending clean mount
> > > > [14640.413007]  [<000000000017c6d4>] idle_balance+0x1a0/0x340
> > > > [14640.413010]  [<000000000063303e>] __schedule+0xa22/0xaf0
> > > > [14640.428279]  [<0000000000630da6>] schedule_timeout+0x186/0x2c0
> > > > [14640.428289]  [<00000000001cf864>] rcu_gp_kthread+0x1bc/0x298
> > > > [14640.428300]  [<0000000000158c5a>] kthread+0xe6/0xec
> > > > [14640.428304]  [<0000000000634de6>] kernel_thread_starter+0x6/0xc
> > > > [14640.428308]  [<0000000000634de0>] kernel_thread_starter+0x0/0xc
> > > > [14640.428311] Last Breaking-Event-Address:
> > > > [14640.428314]  [<000000000016bd76>] walk_tg_tree_from+0x3a/0xf4
> > > > [14640.428319]  list_add corruption. next->prev should be prev (0000000000000918), but was (null). (next= (null)).
> > >
> > > Where's XFS in this? walk_tg_tree_from() is part of the scheduler
> > > code. This kind of implies a stack corruption....
> > >
> > > > Sometimes, this pops up,
> > > > [16907.275002] WARNING: at kernel/rcutree.c:1960
> > > >
> > > > or this,
> > > > [15316.154171] XFS (dm-1): Mounting Filesystem
> > > > [15316.255796] XFS (dm-1): Ending clean mount
> > > > [15320.364246]            00000000006367a2: e310b0080004        lg      %r1,8(%r11)
> > > > [15320.364249]            00000000006367a8: 41101010            la      %r1,16(%r1)
> > > > [15320.364251]            00000000006367ac: e33010000004        lg      %r3,0(%r1)
> > > > [15320.364252] Call Trace:
> > > > [15320.364252] Last Breaking-Event-Address:
> > > > [15320.364253]  [<0000000000000000>] Kernel stack overflow.
> > > > [15320.364308] CPU: 0 Tainted: GF       W    3.9.2 #1
> > > > [15320.364309] Process rhts-test-runne (pid: 625, task: 000000003dccc890, ksp: 0
> > >
> > > .... and there you go - a stack overflow. Your kernel stack size is
> > > too small.
> > >
> > > I'd suggest that you need 16k stacks on s390 - IIRC every function
> > > call has a 128 byte stack frame, and there are call chains 70-80
> > > functions deep in the storage stack...
> > Hmm, I am unsure how to set a 16k stack there
>
> Are you building a 64 bit s390 kernel or a 32 bit kernel? 32 bit
> kernels only have an 8k stack size, 64 bit kernels are 16k (see
> arch/s390/Makefile).
>
> $ git grep STACK_SIZE arch/s390 |head -2
> arch/s390/Makefile:STACK_SIZE   := 8192
> arch/s390/Makefile:STACK_SIZE   := 16384
>
> As it is, the stack frame usage is worse than I thought:
>
> $ git grep STACK_FRAME_OVERHEAD arch/s390 |head -2
> arch/s390/include/uapi/asm/ptrace.h:#define STACK_FRAME_OVERHEAD 96      /* size of minimum stack frame */
> arch/s390/include/uapi/asm/ptrace.h:#define STACK_FRAME_OVERHEAD 160      /* size of minimum stack frame */
>
> Overhead is 96 bytes for 32 bit and 160 bytes for 64 bit. So 16k
> stack size is going to have big troubles with a 70-80 function deep
> call chain.
>
> As for powerpc:
>
> arch/powerpc/include/asm/ppc_asm.h:#define STACKFRAMESIZE 256
>
> Yeah, same issue.
>
> But, seriously, these stack traces are meaningless to anyone not
> familiar with s390 or power7 - they indicate a problem detected
> in the idle loop, not wherever the stack overran.
>
> Can you please work with the s390/power7 people to obtain whatever
> stack it was that overflowed, and we can go from there.
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
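
For anyone following the arithmetic in the quoted reply, here is a rough sketch of the stack budget it describes. It uses only the figures quoted above (STACK_SIZE 8192/16384, STACK_FRAME_OVERHEAD 96/160, STACKFRAMESIZE 256, call chains 70-80 deep) and treats each figure as the minimum cost of one frame. The helper names are made up for illustration, and the powerpc stack size is an assumption, since the thread does not state it.

```python
# Back-of-the-envelope stack budget from the figures quoted in the thread.
# "stack" is the kernel stack size, "min_frame" the minimum frame size.
CONFIGS = {
    "s390 32 bit": {"stack": 8192, "min_frame": 96},
    "s390 64 bit": {"stack": 16384, "min_frame": 160},
    # ASSUMPTION: the thread gives no powerpc stack size; 16384 is a guess
    # used only to make the comparison possible.
    "powerpc (assumed 16k stack)": {"stack": 16384, "min_frame": 256},
}


def overhead(depth, min_frame):
    """Bytes consumed by minimum-size frames alone at a given call depth."""
    return depth * min_frame


def spare_per_frame(stack, depth, min_frame):
    """Average bytes left per frame for locals; negative means an overrun."""
    return (stack - overhead(depth, min_frame)) // depth


if __name__ == "__main__":
    for name, cfg in CONFIGS.items():
        for depth in (70, 80):
            used = overhead(depth, cfg["min_frame"])
            spare = spare_per_frame(cfg["stack"], depth, cfg["min_frame"])
            print(f"{name}: depth {depth}: {used} of {cfg['stack']} bytes "
                  f"in frame overhead, {spare} bytes spare per frame")
```

On 64 bit s390, an 80 deep chain spends 12800 of 16384 bytes on frame overhead alone, leaving about 44 bytes per frame for locals, which is consistent with the "big troubles" remark above. This is a lower bound only; real frames are larger, and it says nothing about which call chain actually overflowed.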