From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S275384AbTHMTtm (ORCPT );
	Wed, 13 Aug 2003 15:49:42 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S275375AbTHMTtm (ORCPT );
	Wed, 13 Aug 2003 15:49:42 -0400
Received: from e2.ny.us.ibm.com ([32.97.182.102]:27074 "EHLO e2.ny.us.ibm.com")
	by vger.kernel.org with ESMTP id S275355AbTHMTtZ (ORCPT );
	Wed, 13 Aug 2003 15:49:25 -0400
Date: Wed, 13 Aug 2003 12:53:02 -0700
From: "Martin J. Bligh"
To: linux-kernel
cc: robbiew@us.ibm.com
Subject: [Bug 1097] New: NFS causing kernel BUG at lines 1701 and 1702 in slab.c
Message-ID: <941260000.1060804382@flay>
X-Mailer: Mulberry/2.1.2 (Linux/x86)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

http://bugme.osdl.org/show_bug.cgi?id=1097

Summary: NFS causing kernel BUG at lines 1701 and 1702 in slab.c
Kernel Version: 2.6.0-test3
Status: NEW
Severity: blocking
Owner: trond.myklebust@fys.uio.no
Submitter: robbiew@us.ibm.com
CC: sglass@us.ibm.com

Distribution: SuSE 8.0
Hardware Environment: Described in testplan located at http://ltp.sf.net/nfs
Software Environment: glibc 2.2.5 and nfs-utils 1.0.5 on server.

Problem Description:
One NFS client hit the 1701 bug, while the other hit the 1702. Both hit
after a few hours of performing a loop of random file I/O system calls over
NFS. Afterwards, even though the test was still running, all NFS activity
had stopped on both clients. Also, a cat /proc/slabinfo produced a
Segmentation Fault. Here's an example of both bugs:

===========================
kernel BUG at mm/slab.c:1701!
invalid operand: 0000 [#1]
CPU:    0
EIP:    0060:[cache_alloc_refill+258/720]    Not tainted
EFLAGS: 00010002
EIP is at cache_alloc_refill+0x102/0x2d0
eax: 00000000   ebx: 00000006   ecx: c74a7000   edx: c74a7018
esi: 00000007   edi: 00000004   ebp: c43dfcd8   esp: c43dfca4
ds: 007b   es: 007b   ss: 0068
Process fsstress (pid: 26441, threadinfo=c43de000 task=cf139940)
Stack: 00000000 cfdaee98 00000286 cfda6958 00000006 c74a7000 00000006 cfdaeea4
       cfdaeea4 cfdaeeac 00001000 c74a7018 cfda6948 c43dfcfc c0130ea4 cfdaee98
       000000d0 00000000 cefd8b80 cfe82fec c0426b98 c43dfd1c c43dfd0c c019719b
Call Trace:
 [kmem_cache_alloc+76/280] kmem_cache_alloc+0x4c/0x118
 [nfs_alloc_inode+19/64] nfs_alloc_inode+0x13/0x40
 [alloc_inode+22/332] alloc_inode+0x16/0x14c
 [get_new_inode+18/196] get_new_inode+0x12/0xc4
 [iget5_locked+125/136] iget5_locked+0x7d/0x88
 [nfs_find_actor+0/176] nfs_find_actor+0x0/0xb0
 [nfs_init_locked+0/60] nfs_init_locked+0x0/0x3c
 [__nfs_fhget+95/808] __nfs_fhget+0x5f/0x328
 [nfs_find_actor+0/176] nfs_find_actor+0x0/0xb0
 [nfs_init_locked+0/60] nfs_init_locked+0x0/0x3c
 [nfs_fhget+67/76] nfs_fhget+0x43/0x4c
 [nfs_instantiate+87/172] nfs_instantiate+0x57/0xac
 [nfs_symlink+325/432] nfs_symlink+0x145/0x1b0
 [call_timeout+32/264] call_timeout+0x20/0x108
 [nfs_permission+284/392] nfs_permission+0x11c/0x188
 [permission+38/60] permission+0x26/0x3c
 [vfs_symlink+95/136] vfs_symlink+0x5f/0x88
 [sys_symlink+121/204] sys_symlink+0x79/0xcc
 [syscall_call+7/11] syscall_call+0x7/0xb

Code: 0f 0b a5 06 87 8a 38 c0 89 f6 85 c0 7c 04 39 d8 72 08 0f 0b

===========================
kernel BUG at mm/slab.c:1702!
invalid operand: 0000 [#1]
CPU:    0
EIP:    0060:[cache_alloc_refill+276/720]    Not tainted
EFLAGS: 00010086
EIP is at cache_alloc_refill+0x114/0x2d0
eax: cfdb875c   ebx: 00000006   ecx: cfdb1eac   edx: cfdb1ec4
esi: 00000002   edi: 00000007   ebp: c345fcd8   esp: c345fca4
ds: 007b   es: 007b   ss: 0068
Process fsstress (pid: 15862, threadinfo=c345e000 task=c7ce59c0)
Stack: 00000000 cfdb1e98 00000286 cfdb0958 00000006 cfdb1eac 00000006 cfdb1ea4
       cfdb1ea4 cfdb1eac 00000000 cfdb1ec4 cfdb0948 c345fcfc c0130ea4 cfdb1e98
       000000d0 00000000 cf034b80 cfe8c080 c0426bb0 c345fd1c c345fd0c c019719b
Call Trace:
 [kmem_cache_alloc+76/280] kmem_cache_alloc+0x4c/0x118
 [nfs_alloc_inode+19/64] nfs_alloc_inode+0x13/0x40
 [alloc_inode+22/332] alloc_inode+0x16/0x14c
 [get_new_inode+18/196] get_new_inode+0x12/0xc4
 [iget5_locked+125/136] iget5_locked+0x7d/0x88
 [nfs_find_actor+0/176] nfs_find_actor+0x0/0xb0
 [nfs_init_locked+0/60] nfs_init_locked+0x0/0x3c
 [__nfs_fhget+95/808] __nfs_fhget+0x5f/0x328
 [nfs_find_actor+0/176] nfs_find_actor+0x0/0xb0
 [nfs_init_locked+0/60] nfs_init_locked+0x0/0x3c
 [nfs_fhget+67/76] nfs_fhget+0x43/0x4c
 [nfs_instantiate+87/172] nfs_instantiate+0x57/0xac
 [nfs_mknod+172/232] nfs_mknod+0xac/0xe8
 [parse_rock_ridge_inode_internal+696/1504] parse_rock_ridge_inode_internal+0x2b8/0x5e0
 [nfs_permission+354/392] nfs_permission+0x162/0x188
 [permission+38/60] permission+0x26/0x3c
 [vfs_mknod+166/208] vfs_mknod+0xa6/0xd0
 [sys_mknod+283/372] sys_mknod+0x11b/0x174
 [syscall_call+7/11] syscall_call+0x7/0xb

Code: 0f 0b a6 06 87 8a 38 c0 8b 55 f8 8b 04 82 83 f8 ff 75 d5 8b

===========================

Steps to reproduce:
See http://ltp.sf.net/nfs for the test plan and the location of the test,
fsstress, which produces the random file I/O system calls.

I'm thinking this may be an NFS issue, rather than slab, so I decided to
start here first.
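
A side note on reading the two "Code:" lines in the oopses above: each one
starts with 0f 0b, which is the x86 ud2 instruction. The sketch below
assumes the i386 BUG() layout of this kernel era, where ud2 is followed by a
16-bit little-endian source line number and a 32-bit pointer to the file
name string. That layout is an assumption here, not something the report
states, although the decoded values do agree with the "kernel BUG at
mm/slab.c:NNNN" banners. The helper name decode_bug_trailer is hypothetical.

```python
import struct


def decode_bug_trailer(code_line):
    """Decode the bytes that follow ud2 in an oops 'Code:' dump.

    Assumed layout (i386 BUG() with verbose reporting):
      0f 0b        ud2
      2 bytes      source line number, little-endian
      4 bytes      address of the file name string, little-endian
    Returns (line_number, file_name_address).
    """
    # Accept the line with or without its "Code:" prefix.
    tokens = code_line.replace("Code:", "").split()
    if len(tokens) < 8:
        raise ValueError("need at least 8 bytes of code dump")
    raw = bytes(int(tok, 16) for tok in tokens[:8])
    if raw[:2] != b"\x0f\x0b":
        raise ValueError("dump does not start with ud2 (0f 0b)")
    # "<HI" = little-endian, unsigned 16-bit then unsigned 32-bit, no padding.
    line_no, file_addr = struct.unpack_from("<HI", raw, 2)
    return line_no, file_addr


if __name__ == "__main__":
    first = "Code: 0f 0b a5 06 87 8a 38 c0 89 f6 85 c0 7c 04 39 d8 72 08 0f 0b"
    second = "Code: 0f 0b a6 06 87 8a 38 c0 8b 55 f8 8b 04 82 83 f8 ff 75 d5 8b"
    for dump in (first, second):
        line_no, file_addr = decode_bug_trailer(dump)
        print("line %d, file name string at 0x%08x" % (line_no, file_addr))
```

Under that assumption the first dump decodes to line 1701 and the second to
line 1702, both with the same file name address (0xc0388a87), which is what
one would expect if both clients tripped adjacent checks in the same source
file.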