From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com (8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id o348F9ZV121150 for ; Sun, 4 Apr 2010 03:15:10 -0500 Received: from peace.netnation.com (localhost [127.0.0.1]) by cuda.sgi.com (Spam Firewall) with ESMTP id 2F30F299773 for ; Sun, 4 Apr 2010 01:16:57 -0700 (PDT) Received: from peace.netnation.com (newpeace.netnation.com [204.174.223.7]) by cuda.sgi.com with ESMTP id Rny3fek9N9EVideT for ; Sun, 04 Apr 2010 01:16:57 -0700 (PDT) Received: from sim by peace.netnation.com with local (Exim 4.63) (envelope-from ) id 1NyL0r-0006iG-BT for xfs@oss.sgi.com; Sun, 04 Apr 2010 01:16:57 -0700 Date: Sun, 4 Apr 2010 01:16:57 -0700 From: Simon Kirby Subject: Oops in xfs_dir2_block_lookup_int from storage corruption [2.6.33] Message-ID: <20100404081657.GA24143@hostway.ca> MIME-Version: 1.0 Content-Disposition: inline List-Id: XFS Filesystem from SGI List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: xfs-bounces@oss.sgi.com Errors-To: xfs-bounces@oss.sgi.com To: xfs@oss.sgi.com Hi! :) We had a fun event of two HA servers crashing back and forth to each other because of corrupted storage. xfs_repair eventually corrected the issue, but in the meantime our HA NFS servers kept breaking shortly after mounting and serving the file systems (all XFS). There is an identical backtrace on the two servers. This crashed on a kernel with CONFIG_DEBUG_INFO=y, so a useful vmlinux is available. Apr 3 04:41:59 nas03 kernel: XFS mounting filesystem dm-13 Apr 3 04:42:00 nas03 kernel: Starting XFS recovery on filesystem: dm-13 (logdev: internal) Apr 3 04:42:00 nas03 kernel: Ending XFS recovery on filesystem: dm-13 (logdev: internal) Apr 3 04:42:00 nas03 kernel: NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory Apr 3 04:42:00 nas03 kernel: NFSD: starting 90-second grace period Apr 3 04:42:32 nas03 kernel: BUG: unable to handle kernel paging request at ffff8800f674bf00 Apr 3 04:42:32 nas03 kernel: IP: [] xfs_dir2_block_lookup_int+0xb1/0x1d0 Apr 3 04:42:32 nas03 kernel: PGD 182b063 PUD 0 Apr 3 04:42:32 nas03 kernel: Oops: 0000 [#1] SMP Apr 3 04:42:32 nas03 kernel: last sysfs file: /sys/devices/virtual/block/dm-13/removable Apr 3 04:42:32 nas03 kernel: CPU 6 Apr 3 04:42:32 nas03 kernel: Pid: 29724, comm: nfsd Not tainted 2.6.33-hw #1 0M788G/PowerEdge 1950 Apr 3 04:42:32 nas03 kernel: RIP: 0010:[] [] xfs_dir2_block_lookup_int+0xb1/0x1d0 Apr 3 04:42:32 nas03 kernel: RSP: 0000:ffff8803ca8738a0 EFLAGS: 00010203 Apr 3 04:42:32 nas03 kernel: RAX: 0000000015c6535e RBX: 0000000015c6535e RCX: 0000000000000000 Apr 3 04:42:32 nas03 kernel: RDX: 000000002b8ca6bd RSI: ffff8803137c7cc0 RDI: 000000000000172e Apr 3 04:42:32 nas03 kernel: RBP: ffff8803ca873910 R08: ffff880048422410 R09: 0000000000000000 Apr 3 04:42:32 nas03 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 000000003e3c623e Apr 3 04:42:32 nas03 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff8803ca873960 Apr 3 04:42:32 nas03 kernel: FS: 0000000000000000(0000) GS:ffff880028380000(0000) knlGS:0000000000000000 Apr 3 04:42:32 nas03 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Apr 3 04:42:32 nas03 kernel: CR2: ffff8800f674bf00 CR3: 000000043d1b6000 CR4: 00000000000006e0 Apr 3 04:42:32 nas03 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Apr 3 04:42:32 nas03 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Apr 3 04:42:32 nas03 kernel: Process nfsd (pid: 29724, threadinfo ffff8803ca872000, task ffff880416f46900) Apr 3 04:42:32 nas03 kernel: Stack: Apr 3 04:42:32 nas03 kernel: ffff8803ca8738e0 ffff8803ca87392c ffff8803ca873920 ffff8803010c8ff8 Apr 3 04:42:32 nas03 kernel: <0> ffff880416309400 0000000000000000 ffff8803010c8000 0000000000000000 Apr 3 04:42:32 nas03 kernel: <0> ffff8803137c7cc0 ffff8803ca8739ec ffff8803ca873960 0000000000000000 Apr 3 04:42:32 nas03 kernel: Call Trace: Apr 3 04:42:32 nas03 kernel: [] xfs_dir2_block_lookup+0x2c/0xf0 Apr 3 04:42:32 nas03 kernel: [] xfs_dir_lookup+0xdf/0x180 Apr 3 04:42:32 nas03 kernel: [] xfs_lookup+0x81/0x110 Apr 3 04:42:32 nas03 kernel: [] xfs_fs_get_parent+0x25/0x50 Apr 3 04:42:32 nas03 kernel: [] reconnect_path+0x150/0x2e0 Apr 3 04:42:32 nas03 kernel: [] exportfs_decode_fh+0xf2/0x260 Apr 3 04:42:32 nas03 kernel: [] ? nfsd_acceptable+0x0/0x100 Apr 3 04:42:32 nas03 kernel: [] ? exp_find_key+0x5d/0xc0 Apr 3 04:42:32 nas03 kernel: [] fh_verify+0x42d/0x670 Apr 3 04:42:32 nas03 kernel: [] ? default_wake_function+0xd/0x10 Apr 3 04:42:32 nas03 kernel: [] ? __wake_up_common+0x5a/0x90 Apr 3 04:42:32 nas03 kernel: [] nfsd_lookup_dentry+0x49/0x450 Apr 3 04:42:32 nas03 kernel: [] ? sunrpc_cache_lookup+0x66/0x170 Apr 3 04:42:32 nas03 kernel: [] nfsd_lookup+0x2e/0xd0 Apr 3 04:42:32 nas03 kernel: [] ? cache_check+0xf4/0x3a0 Apr 3 04:42:32 nas03 kernel: [] nfsd3_proc_lookup+0xa1/0x120 Apr 3 04:42:32 nas03 kernel: [] nfsd_dispatch+0xb5/0x240 Apr 3 04:42:32 nas03 kernel: [] svc_process+0x475/0x770 Apr 3 04:42:32 nas03 kernel: [] ? nfsd+0x0/0x130 Apr 3 04:42:32 nas03 kernel: [] nfsd+0xaf/0x130 Apr 3 04:42:32 nas03 kernel: [] kthread+0x8e/0xa0 Apr 3 04:42:32 nas03 kernel: [] kernel_thread_helper+0x4/0x10 Apr 3 04:42:32 nas03 kernel: [] ? kthread+0x0/0xa0 Apr 3 04:42:32 nas03 kernel: [] ? kernel_thread_helper+0x0/0x10 Apr 3 04:42:32 nas03 kernel: Code: f8 48 89 4d a8 8b 50 f8 49 89 c8 0f ca 89 d0 31 c9 ff ca 48 c1 e0 03 49 29 c0 eb 07 8d 4b 01 39 ca 7c 1d 8d 1c 0a d1 fb 48 63 c3 <45> 8b 24 c0 41 0f cc 44 39 e7 74 1d 77 e3 8d 53 ff 39 ca 7d e3 Apr 3 04:42:32 nas03 kernel: RIP [] xfs_dir2_block_lookup_int+0xb1/0x1d0 Apr 3 04:42:32 nas03 kernel: RSP Apr 3 04:42:32 nas03 kernel: CR2: ffff8800f674bf00 Apr 3 04:42:32 nas03 kernel: ---[ end trace e16c66e592592c64 ]--- 0xB1 == 177 decimal: Dump of assembler code for function xfs_dir2_block_lookup_int: 0xffffffff8125e3d0 : push %rbp 0xffffffff8125e3d1 : xor %r9d,%r9d 0xffffffff8125e3d4 : mov $0xffffffffffffffff,%rcx 0xffffffff8125e3db : mov %rsp,%rbp 0xffffffff8125e3de : push %r15 0xffffffff8125e3e0 : mov %rdi,%r15 0xffffffff8125e3e3 : lea -0x30(%rbp),%r8 0xffffffff8125e3e7 : push %r14 0xffffffff8125e3e9 : push %r13 0xffffffff8125e3eb : push %r12 0xffffffff8125e3ed : push %rbx 0xffffffff8125e3ee : sub $0x48,%rsp 0xffffffff8125e3f2 : mov %rsi,-0x60(%rbp) 0xffffffff8125e3f6 : mov %rdx,-0x68(%rbp) 0xffffffff8125e3fa : mov 0x30(%rdi),%rsi 0xffffffff8125e3fe : mov 0x48(%rdi),%rax 0xffffffff8125e402 : mov %rax,-0x48(%rbp) 0xffffffff8125e406 : mov (%rsi),%rdx 0xffffffff8125e409 : mov %rax,%rdi 0xffffffff8125e40c : mov %rdx,-0x50(%rbp) 0xffffffff8125e410 : mov 0x2f8(%rdx),%edx 0xffffffff8125e416 : callq 0xffffffff8125b170 0xffffffff8125e41b : test %eax,%eax 0xffffffff8125e41d : mov %eax,%r9d 0xffffffff8125e420 : je 0xffffffff8125e434 0xffffffff8125e422 : add $0x48,%rsp 0xffffffff8125e426 : mov %r9d,%eax 0xffffffff8125e429 : pop %rbx 0xffffffff8125e42a : pop %r12 0xffffffff8125e42c : pop %r13 0xffffffff8125e42e : pop %r14 0xffffffff8125e430 : pop %r15 0xffffffff8125e432 : leaveq 0xffffffff8125e433 : retq 0xffffffff8125e434 : mov -0x30(%rbp),%rsi 0xffffffff8125e438 : mov -0x50(%rbp),%rdx 0xffffffff8125e43c : mov 0x20(%r15),%edi 0xffffffff8125e440 : mov 0x8(%rsi),%rax 0xffffffff8125e444 : mov %rax,-0x40(%rbp) 0xffffffff8125e448 : movslq 0x2f0(%rdx),%rax 0xffffffff8125e44f : add -0x40(%rbp),%rax 0xffffffff8125e453 : lea -0x8(%rax),%rcx 0xffffffff8125e457 : mov %rcx,-0x58(%rbp) 0xffffffff8125e45b : mov -0x8(%rax),%edx 0xffffffff8125e45e : mov %rcx,%r8 0xffffffff8125e461 : bswap %edx 0xffffffff8125e463 : mov %edx,%eax 0xffffffff8125e465 : xor %ecx,%ecx 0xffffffff8125e467 : dec %edx 0xffffffff8125e469 : shl $0x3,%rax 0xffffffff8125e46d : sub %rax,%r8 0xffffffff8125e470 : jmp 0xffffffff8125e479 0xffffffff8125e472 : lea 0x1(%rbx),%ecx 0xffffffff8125e475 : cmp %ecx,%edx 0xffffffff8125e477 : jl 0xffffffff8125e496 0xffffffff8125e479 : lea (%rdx,%rcx,1),%ebx 0xffffffff8125e47c : sar %ebx 0xffffffff8125e47e : movslq %ebx,%rax 0xffffffff8125e481 : mov (%r8,%rax,8),%r12d <====== 0xffffffff8125e485 : bswap %r12d 0xffffffff8125e488 : cmp %r12d,%edi 0xffffffff8125e48b : je 0xffffffff8125e4aa Recompiling the function with asm("nop") inserted, I think I tracked down that instruction to this line: /* * Loop doing a binary search for our hash value. * Find our entry, ENOENT if it's not there. */ for (low = 0, high = be32_to_cpu(btp->count) - 1; ; ) { ASSERT(low <= high); mid = (low + high) >> 1; ======> if ((hash = be32_to_cpu(blp[mid].hashval)) == args->hashval) break; blp is 8 bytes which matches the multiplication of the array index in the instruction. So, this means that blp[mid] pointed to 0xffff8800f674bf00, and mid was 0x0000000015c6535e. That sounds kind of high...? xfs_repair did spit out a number of errors such as this one which may or may not explain the Oops: entry ".." at block 0 offset 32 in directory inode 1137203971 references non-existent inode 3255359147 clearing inode number in entry at offset 32... Cheers! Simon- _______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs