* Daily crash in xfs_cmn_err
From: Juerg Haefliger @ 2012-10-29 10:55 UTC
To: xfs
Hi,
I have a node that used to crash every day at 6:25am in xfs_cmn_err
(NULL pointer dereference). I'm running 2.6.38 (yes, I know it's old)
and I'm not reporting a bug as I'm sure the problem has been fixed in
newer kernels :-) The crashes occurred daily when logrotate ran;
/var/log is a separate 100GB XFS volume. I finally took the node down
and ran xfs_check on the /var/log volume and it did indeed report
some errors/warnings (see end of email). I then ran xfs_repair (see
end of email for output) and the node has been stable ever since.
So my questions are:
1) I was under the impression that during the mounting of an XFS
volume some sort of check/repair is performed. How does that differ
from running xfs_check and/or xfs_repair?
2) Any ideas how the filesystem might have gotten into this state? I
don't have the history of that node, but it's possible that it crashed
previously due to an unrelated problem. Could this have left the
filesystem in this state?
3) What exactly does the output of xfs_check mean? How serious is
it? Are those warnings or errors? Will some of them get cleaned up
during the mounting of the filesystem?
4) We have a whole bunch of production nodes running the same kernel.
I'm more than a little concerned that we might have a ticking time
bomb, with some filesystems being in a state that might trigger a
crash eventually. Is there any way to perform a live check on a
mounted filesystem so that I can get an idea of how big a problem we
have (if any)? I don't claim to know exactly what I'm doing, but I
picked a node, froze the filesystem, and then ran a modified
xfs_check (which bypasses the is_mounted check and ignores
non-committed metadata) and it did report some issues. At this point
I believe those are false positives. Do you have any suggestions
short of rebooting the nodes and running xfs_check on the unmounted
filesystem?
Thanks
....Juerg
(initramfs) xfs_check /dev/mapper/vg0-varlog
block 0/3538 expected type unknown got free2
block 0/13862 expected type unknown got free2
block 0/13863 expected type unknown got free2
<SNIP>
block 0/16983 expected type unknown got free2
block 0/16984 expected type unknown got free2
block 0/21700 expected type unknown got data
block 0/21701 expected type unknown got data
<SNIP>
block 0/21826 expected type unknown got data
block 0/21827 expected type unknown got data
block 0/21700 claimed by inode 178, previous inum 148
block 0/21701 claimed by inode 178, previous inum 148
<SNIP>
block 0/21826 claimed by inode 178, previous inum 148
block 0/21827 claimed by inode 178, previous inum 148
block 0/1250 expected type unknown got data
block 0/1251 expected type unknown got data
<SNIP>
block 0/1264 expected type unknown got data
block 0/1265 expected type unknown got data
block 0/1250 claimed by inode 1706, previous inum 148
block 0/1251 claimed by inode 1706, previous inum 148
<SNIP>
block 0/1264 claimed by inode 1706, previous inum 148
block 0/1265 claimed by inode 1706, previous inum 148
block 0/16729 expected type unknown got data
block 0/16730 expected type unknown got data
<SNIP>
block 0/16889 expected type unknown got data
block 0/16890 expected type unknown got data
block 0/16729 claimed by inode 1710, previous inum 148
block 0/16730 claimed by inode 1710, previous inum 148
<SNIP>
block 0/16889 claimed by inode 1710, previous inum 148
block 0/16890 claimed by inode 1710, previous inum 148
block 0/3523 expected type unknown got data
block 0/3524 expected type unknown got data
<SNIP>
block 0/3536 expected type unknown got data
block 0/3537 expected type unknown got data
block 0/3523 claimed by inode 1994, previous inum 148
block 0/3524 claimed by inode 1994, previous inum 148
<SNIP>
block 0/3536 claimed by inode 1994, previous inum 148
block 0/3537 claimed by inode 1994, previous inum 148
block 0/510 type unknown not expected
block 0/511 type unknown not expected
block 0/512 type unknown not expected
<SNIP>
block 0/2341 type unknown not expected
block 0/2342 type unknown not expected
(initramfs) xfs_repair /dev/mapper/vg0-varlog
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
data fork in ino 148 claims free block 3538
data fork in ino 148 claims free block 13862
data fork in ino 148 claims free block 13863
data fork in ino 148 claims free block 16891
data fork in ino 148 claims free block 16892
data fork in regular inode 178 claims used block 21700
bad data fork in inode 178
cleared inode 178
data fork in regular inode 1706 claims used block 1250
bad data fork in inode 1706
cleared inode 1706
data fork in regular inode 1710 claims used block 16729
bad data fork in inode 1710
cleared inode 1710
data fork in regular inode 1994 claims used block 3523
bad data fork in inode 1994
cleared inode 1994
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
entry "syslog" at block 0 offset 2384 in directory inode 128
references free inode 178
- agno = 3
clearing inode number in entry at offset 2384...
- agno = 2
data fork in ino 148 claims dup extent, off - 0, start - 1250, cnt 16
bad data fork in inode 148
cleared inode 148
entry "nova-api.log" at block 0 offset 552 in directory inode 165
references free inode 1994
clearing inode number in entry at offset 552...
entry "nova-network.log.6" at block 0 offset 2608 in directory inode
165 references free inode 1706
clearing inode number in entry at offset 2608...
entry "nova-api.log.6" at block 0 offset 4032 in directory inode 165
references free inode 1710
clearing inode number in entry at offset 4032...
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
entry "syslog.1" in directory inode 128 points to free inode 148
bad hash table for directory inode 128 (no data entry): rebuilding
rebuilding directory inode 128
bad hash table for directory inode 165 (no data entry): rebuilding
rebuilding directory inode 165
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
* Re: Daily crash in xfs_cmn_err
From: Dave Chinner @ 2012-10-29 12:53 UTC
To: Juerg Haefliger; +Cc: xfs
On Mon, Oct 29, 2012 at 11:55:15AM +0100, Juerg Haefliger wrote:
> Hi,
>
> I have a node that used to crash every day at 6:25am in xfs_cmn_err
> (NULL pointer dereference).
Stack trace, please.
> 1) I was under the impression that during the mounting of an XFS
> volume some sort of check/repair is performed. How does that differ
> from running xfs_check and/or xfs_repair?
Journal recovery is performed at mount time, not a consistency
check.
http://en.wikipedia.org/wiki/Filesystem_journaling
> 2) Any ideas how the filesystem might have gotten into this state? I
> don't have the history of that node, but it's possible that it crashed
> previously due to an unrelated problem. Could this have left the
> filesystem in this state?
<shrug>
How long is a piece of string?
> 3) What exactly does the output of xfs_check mean? How serious is
> it? Are those warnings or errors? Will some of them get cleaned up
> during the mounting of the filesystem?
xfs_check is deprecated. The output of xfs_repair indicates
cross-linked extent indexes. These will only get properly detected and
fixed by xfs_repair. And "fixed" may mean corrupt files are removed
from the filesystem - repair does not guarantee that your data is
preserved or consistent after it runs, just that the filesystem is
consistent and error free.
> 4) We have a whole bunch of production nodes running the same kernel.
> I'm more than a little concerned that we might have a ticking time
> bomb, with some filesystems being in a state that might trigger a
> crash eventually. Is there any way to perform a live check on a
> mounted filesystem so that I can get an idea of how big a problem we
> have (if any)?
Read the xfs_repair man page?
-n No modify mode. Specifies that xfs_repair should not
modify the filesystem but should only scan the filesystem
and indicate what repairs would have been made.
.....
-d Repair dangerously. Allow xfs_repair to repair an XFS
filesystem mounted read only. This is typically done on a
root filesystem from single user mode, immediately followed by
a reboot.
So, remount read-only and run xfs_repair -d -n; that will check the
filesystem as well as can be done online. If there are any problems,
then you can repair them and immediately reboot.
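Something like this sketch (untested, using your vg0-varlog device;
note the remount will fail with EBUSY if anything still has files on
/var/log open for writing, so syslog and friends may need to be
stopped first):

    mount -o remount,ro /var/log
    xfs_repair -n -d /dev/mapper/vg0-varlog   # report only, modify nothing
    # only if problems were reported:
    xfs_repair -d /dev/mapper/vg0-varlog
    reboot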
> I don't claim to know exactly what I'm doing, but I picked a
> node, froze the filesystem, and then ran a modified xfs_check (which
> bypasses the is_mounted check and ignores non-committed metadata) and
> it did report some issues. At this point I believe those are false
> positives. Do you have any suggestions short of rebooting the nodes and
> running xfs_check on the unmounted filesystem?
Don't bother with xfs_check. xfs_repair will detect all the same
errors (and more) and can fix them at the same time.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Daily crash in xfs_cmn_err
From: Carlos Maiolino @ 2012-10-29 14:23 UTC
To: Juerg Haefliger; +Cc: xfs
Hi,
>
> I have a node that used to crash every day at 6:25am in xfs_cmn_err
> (NULL pointer dereference). I'm running 2.6.38 (yes, I know it's old)
> and I'm not reporting a bug as I'm sure the problem has been fixed in
> newer kernels :-) The crashes occurred daily when logrotate ran;
> /var/log is a separate 100GB XFS volume. I finally took the node down
> and ran xfs_check on the /var/log volume and it did indeed report
> some errors/warnings (see end of email). I then ran xfs_repair (see
> end of email for output) and the node has been stable ever since.
>
> So my questions are:
>
> 1) I was under the impression that during the mounting of an XFS
> volume some sort of check/repair is performed. How does that differ
> from running xfs_check and/or xfs_repair?
>
AFAICT, there is no repair or check happening at mount time; only the
information needed to mount the filesystem is checked. Everything else
(like the consistency of blocks and inodes) is not checked at mount
time. Maybe you mean the journal replay that happens during mount?
> 2) Any ideas how the filesystem might have gotten into this state? I
> don't have the history of that node, but it's possible that it crashed
> previously due to an unrelated problem. Could this have left the
> filesystem in this state?
>
> 3) What exactly does the output of xfs_check mean? How serious is
> it? Are those warnings or errors? Will some of them get cleaned up
> during the mounting of the filesystem?
>
comments below
> 4) We have a whole bunch of production nodes running the same kernel.
> I'm more than a little concerned that we might have a ticking time
> bomb, with some filesystems being in a state that might trigger a
> crash eventually. Is there any way to perform a live check on a
> mounted filesystem so that I can get an idea of how big a problem we
> have (if any)? I don't claim to know exactly what I'm doing, but I
> picked a node, froze the filesystem, and then ran a modified
> xfs_check (which bypasses the is_mounted check and ignores
> non-committed metadata) and it did report some issues. At this point
> I believe those are false positives. Do you have any suggestions
> short of rebooting the nodes and running xfs_check on the unmounted
> filesystem?
>
You must unmount the filesystem before running xfs_repair/xfs_check;
even with a frozen fs you may get inconsistent data, which may lead to
more problems.
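For your volume that would be something like this sketch (stop
syslog/logrotate and anything else writing to /var/log first,
otherwise the umount will fail with EBUSY):

    umount /var/log
    xfs_repair /dev/mapper/vg0-varlog
    mount /var/log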
> Thanks
> ....Juerg
>
>
> (initramfs) xfs_check /dev/mapper/vg0-varlog
> block 0/3538 expected type unknown got free2
> block 0/13862 expected type unknown got free2
> block 0/13863 expected type unknown got free2
>
> <SNIP>
>
> block 0/16983 expected type unknown got free2
> block 0/16984 expected type unknown got free2
> block 0/21700 expected type unknown got data
> block 0/21701 expected type unknown got data
>
> <SNIP>
>
> block 0/21826 expected type unknown got data
> block 0/21827 expected type unknown got data
> block 0/21700 claimed by inode 178, previous inum 148
> block 0/21701 claimed by inode 178, previous inum 148
>
> <SNIP>
>
> block 0/21826 claimed by inode 178, previous inum 148
> block 0/21827 claimed by inode 178, previous inum 148
> block 0/1250 expected type unknown got data
> block 0/1251 expected type unknown got data
>
> <SNIP>
>
> block 0/1264 expected type unknown got data
> block 0/1265 expected type unknown got data
> block 0/1250 claimed by inode 1706, previous inum 148
> block 0/1251 claimed by inode 1706, previous inum 148
>
> <SNIP>
>
> block 0/1264 claimed by inode 1706, previous inum 148
> block 0/1265 claimed by inode 1706, previous inum 148
> block 0/16729 expected type unknown got data
> block 0/16730 expected type unknown got data
>
> <SNIP>
>
> block 0/16889 expected type unknown got data
> block 0/16890 expected type unknown got data
> block 0/16729 claimed by inode 1710, previous inum 148
> block 0/16730 claimed by inode 1710, previous inum 148
>
> <SNIP>
>
> block 0/16889 claimed by inode 1710, previous inum 148
> block 0/16890 claimed by inode 1710, previous inum 148
> block 0/3523 expected type unknown got data
> block 0/3524 expected type unknown got data
>
> <SNIP>
>
> block 0/3536 expected type unknown got data
> block 0/3537 expected type unknown got data
> block 0/3523 claimed by inode 1994, previous inum 148
> block 0/3524 claimed by inode 1994, previous inum 148
>
> <SNIP>
>
> block 0/3536 claimed by inode 1994, previous inum 148
> block 0/3537 claimed by inode 1994, previous inum 148
> block 0/510 type unknown not expected
> block 0/511 type unknown not expected
> block 0/512 type unknown not expected
>
> <SNIP>
>
> block 0/2341 type unknown not expected
> block 0/2342 type unknown not expected
>
>
>
> (initramfs) xfs_repair /dev/mapper/vg0-varlog
> Phase 1 - find and verify superblock...
> Phase 2 - using internal log
> - zero log...
> - scan filesystem freespace and inode maps...
> - found root inode chunk
> Phase 3 - for each AG...
> - scan and clear agi unlinked lists...
> - process known inodes and perform inode discovery...
> - agno = 0
> data fork in ino 148 claims free block 3538
> data fork in ino 148 claims free block 13862
> data fork in ino 148 claims free block 13863
> data fork in ino 148 claims free block 16891
> data fork in ino 148 claims free block 16892
> data fork in regular inode 178 claims used block 21700
> bad data fork in inode 178
> cleared inode 178
> data fork in regular inode 1706 claims used block 1250
> bad data fork in inode 1706
> cleared inode 1706
> data fork in regular inode 1710 claims used block 16729
> bad data fork in inode 1710
> cleared inode 1710
> data fork in regular inode 1994 claims used block 3523
> bad data fork in inode 1994
> cleared inode 1994
> - agno = 1
> - agno = 2
> - agno = 3
> - process newly discovered inodes...
> Phase 4 - check for duplicate blocks...
> - setting up duplicate extent list...
> - check for inodes claiming duplicate blocks...
> - agno = 0
> - agno = 1
> entry "syslog" at block 0 offset 2384 in directory inode 128
> references free inode 178
> - agno = 3
> clearing inode number in entry at offset 2384...
> - agno = 2
> data fork in ino 148 claims dup extent, off - 0, start - 1250, cnt 16
> bad data fork in inode 148
> cleared inode 148
> entry "nova-api.log" at block 0 offset 552 in directory inode 165
> references free inode 1994
> clearing inode number in entry at offset 552...
> entry "nova-network.log.6" at block 0 offset 2608 in directory inode
> 165 references free inode 1706
> clearing inode number in entry at offset 2608...
> entry "nova-api.log.6" at block 0 offset 4032 in directory inode 165
> references free inode 1710
> clearing inode number in entry at offset 4032...
> Phase 5 - rebuild AG headers and trees...
> - reset superblock...
> Phase 6 - check inode connectivity...
> - resetting contents of realtime bitmap and summary inodes
> - traversing filesystem ...
> entry "syslog.1" in directory inode 128 points to free inode 148
> bad hash table for directory inode 128 (no data entry): rebuilding
> rebuilding directory inode 128
> bad hash table for directory inode 165 (no data entry): rebuilding
> rebuilding directory inode 165
> - traversal finished ...
> - moving disconnected inodes to lost+found ...
> Phase 7 - verify and correct link counts...
> done
>
Looking through these logs, I'd say that the system crashed before while
using storage without a battery-backed unit and some metadata was lost
during the crash; or an xfs_repair was run; or the filesystem was mounted
after the dirty log was dropped (like running xfs_repair -L).
That's just my point of view though; other people may give you a more
detailed answer.
--
--Carlos
* Re: Daily crash in xfs_cmn_err
From: Juerg Haefliger @ 2012-10-30 8:58 UTC
To: Dave Chinner; +Cc: xfs
On Mon, Oct 29, 2012 at 1:53 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Oct 29, 2012 at 11:55:15AM +0100, Juerg Haefliger wrote:
>> Hi,
>>
>> I have a node that used to crash every day at 6:25am in xfs_cmn_err
>> (NULL pointer dereference).
>
> Stack trace, please.
[128185.204521] BUG: unable to handle kernel NULL pointer dereference
at 00000000000000f8
[128185.213436] IP: [<ffffffffa010c95f>] xfs_cmn_err+0x4f/0xc0 [xfs]
[128185.220302] PGD 17dd180067 PUD 17ddddf067 PMD 0
[128185.225612] Oops: 0000 [#1] SMP
[128185.229359] last sysfs file: /sys/module/ip_tables/initstate
[128185.235802] CPU 6
[128185.237937] Modules linked in: ipmi_devintf ipmi_si
ipmi_msghandler xt_recent xt_multiport bridge xt_conntrack iptable_nat
iptable_mangle nbd ebtable_nat ebtables kvm_intel kvm ib_iser rdma_cm
ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp
libiscsi scsi_transport_iscsi 8021q garp stp ipt_REJECT ipt_LOG
xt_limit xt_tcpudp ipt_addrtype xt_state ip6table_filter ip6_tables
nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack iptable_filter ip_tables
x_tables ghes i7core_edac lp serio_raw hed edac_core parport xfs
exportfs usbhid hid igb hpsa dca
[128185.299779]
[128185.301565] Pid: 25484, comm: logrotate Not tainted
2.6.38-15-server #62~lp1026116slubv201208301417 HP SE2170s /SE2170s
[128185.317470] RIP: 0010:[<ffffffffa010c95f>] [<ffffffffa010c95f>]
xfs_cmn_err+0x4f/0xc0 [xfs]
[128185.327065] RSP: 0018:ffff880bdd6bba78 EFLAGS: 00010246
[128185.333118] RAX: 0000000000000000 RBX: ffff8817de320dc0 RCX:
ffffffffa01152d8
[128185.341232] RDX: 0000000000000000 RSI: ffff880bdd6bbab8 RDI:
ffffffffa011b65b
[128185.349347] RBP: ffff880bdd6bbae8 R08: ffffffffa011a47a R09:
00000000000005a9
[128185.357463] R10: 0000000000000003 R11: 0000000000000000 R12:
ffff8817dde5c230
[128185.365579] R13: 0000000000000075 R14: 00000000000046c2 R15:
0000000000000080
[128185.373695] FS: 00007fb25e73c7c0(0000) GS:ffff88183fc00000(0000)
knlGS:0000000000000000
[128185.382877] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[128185.389415] CR2: 00000000000000f8 CR3: 00000017de695000 CR4:
00000000000006e0
[128185.397531] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[128185.405647] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[128185.413761] Process logrotate (pid: 25484, threadinfo
ffff880bdd6ba000, task ffff880bbe9ec590)
[128185.423524] Stack:
[128185.425891] ffffffffa00c2d72 00000000000000bd ffff880b00000020
ffff880bdd6bbaf8
[128185.434325] ffff880bdd6bbab8 ffff880bdea3b800 ffffffffa01152d8
ffff880bdd6bba88
[128185.442774] ffff880bdd6bbb74 00000000000046c2 ffff880bdd6bbb08
ffffffffa00ac33e
[128185.451212] Call Trace:
[128185.454077] [<ffffffffa00c2d72>] ?
xfs_btree_rec_addr.clone.7+0x12/0x20 [xfs]
[128185.462298] [<ffffffffa00ac33e>] ? xfs_alloc_get_rec+0x2e/0x80 [xfs]
[128185.469624] [<ffffffffa00d72e0>] xfs_error_report+0x40/0x50 [xfs]
[128185.476656] [<ffffffffa00af31b>] ? xfs_free_extent+0x9b/0xc0 [xfs]
[128185.483785] [<ffffffffa00ad550>] xfs_free_ag_extent+0x4a0/0x760 [xfs]
[128185.491203] [<ffffffffa00af31b>] xfs_free_extent+0x9b/0xc0 [xfs]
[128185.498138] [<ffffffffa00be144>] xfs_bmap_finish+0x164/0x1b0 [xfs]
[128185.505271] [<ffffffffa00de1f9>] xfs_itruncate_finish+0x159/0x360 [xfs]
[128185.512889] [<ffffffffa00faac9>] xfs_inactive+0x319/0x470 [xfs]
[128185.519730] [<ffffffffa0108cde>] xfs_fs_evict_inode+0x9e/0xf0 [xfs]
[128185.526949] [<ffffffff8117dd34>] evict+0x24/0xc0
[128185.532323] [<ffffffff8117e8ca>] iput_final+0x16a/0x250
[128185.538379] [<ffffffff8117e9eb>] iput+0x3b/0x50
[128185.543656] [<ffffffff8117acc0>] d_kill+0xe0/0x120
[128185.549224] [<ffffffff8117bc32>] dput+0xd2/0x1a0
[128185.554600] [<ffffffff8116673b>] __fput+0x13b/0x1f0
[128185.560265] [<ffffffff81166815>] fput+0x25/0x30
[128185.565543] [<ffffffff81163130>] filp_close+0x60/0x90
[128185.571402] [<ffffffff81163937>] sys_close+0xb7/0x120
[128185.577262] [<ffffffff8100bfc2>] system_call_fastpath+0x16/0x1b
[128185.584091] Code: 8d 75 10 48 8d 45 a0 c7 45 a0 20 00 00 00 48 c7
c7 5b b6 11 a0 48 89 4d c0 48 89 75 a8 48 8d 75 d0 48 89 45 c8 31 c0
48 89 75 b0 <48> 8b b2 f8 00 00 00 48 8d 55 c0 e8 07 bd 4c e1 c9 c3 48
c7 c7
[128185.606061] RIP [<ffffffffa010c95f>] xfs_cmn_err+0x4f/0xc0 [xfs]
[128185.613010] RSP <ffff880bdd6bba78>
[128185.617025] CR2: 00000000000000f8
The mp passed to xfs_cmn_err was a NULL pointer, and dereferencing
mp->m_fsname in the printk line caused the crash (the offset of
m_fsname is 0xf8, matching the faulting address).
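To double-check the offset, something like this can be used (a
sketch; pahole comes from the dwarves package and needs the xfs
module built with debug info):

    pahole -C xfs_mount xfs.ko | grep m_fsname
    # expect m_fsname at byte offset 248 (0xf8), matching CR2 above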
Error message extracted from the dump:
XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1449 of file
fs/xfs/xfs_alloc.c.
And the comments in the source:
1444     /*
1445      * If this failure happens the request to free this
1446      * space was invalid, it's (partly) already free.
1447      * Very bad.
1448      */
1449     XFS_WANT_CORRUPTED_GOTO(gtbno >= bno + len, error0);
>> 1) I was under the impression that during the mounting of an XFS
>> volume some sort of check/repair is performed. How does that differ
>> from running xfs_check and/or xfs_repair?
>
> Journal recovery is performed at mount time, not a consistency
> check.
>
> http://en.wikipedia.org/wiki/Filesystem_journaling
Ah OK. Thanks for the clarification.
>> 2) Any ideas how the filesystem might have gotten into this state? I
>> don't have the history of that node, but it's possible that it crashed
>> previously due to an unrelated problem. Could this have left the
>> filesystem in this state?
>
> <shrug>
>
> How long is a piece of string?
>
>> 3) What exactly does the output of xfs_check mean? How serious is
>> it? Are those warnings or errors? Will some of them get cleaned up
>> during the mounting of the filesystem?
>
> xfs_check is deprecated. The output of xfs_repair indicates
> cross-linked extent indexes. These will only get properly detected and
> fixed by xfs_repair. And "fixed" may mean corrupt files are removed
> from the filesystem - repair does not guarantee that your data is
> preserved or consistent after it runs, just that the filesystem is
> consistent and error free.
>
>> 4) We have a whole bunch of production nodes running the same kernel.
>> I'm more than a little concerned that we might have a ticking time
>> bomb, with some filesystems being in a state that might trigger a
>> crash eventually. Is there any way to perform a live check on a
>> mounted filesystem so that I can get an idea of how big a problem we
>> have (if any)?
>
> Read the xfs_repair man page?
>
> -n No modify mode. Specifies that xfs_repair should not
> modify the filesystem but should only scan the filesystem
> and indicate what repairs would have been made.
> .....
>
> -d Repair dangerously. Allow xfs_repair to repair an XFS
> filesystem mounted read only. This is typically done on a
> root filesystem from single user mode, immediately followed by
> a reboot.
>
> So, remount read-only and run xfs_repair -d -n; that will check the
> filesystem as well as can be done online. If there are any problems,
> then you can repair them and immediately reboot.
>
>> I don't claim to know exactly what I'm doing, but I picked a
>> node, froze the filesystem, and then ran a modified xfs_check (which
>> bypasses the is_mounted check and ignores non-committed metadata) and
>> it did report some issues. At this point I believe those are false
>> positives. Do you have any suggestions short of rebooting the nodes and
>> running xfs_check on the unmounted filesystem?
>
> Don't bother with xfs_check. xfs_repair will detect all the same
> errors (and more) and can fix them at the same time.
Thanks for the hints.
...Juerg
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
* Re: Daily crash in xfs_cmn_err
From: Juerg Haefliger @ 2012-10-30 9:07 UTC
To: Carlos Maiolino; +Cc: xfs
> Hi,
>>
>> I have a node that used to crash every day at 6:25am in xfs_cmn_err
>> (NULL pointer dereference). I'm running 2.6.38 (yes, I know it's old)
>> and I'm not reporting a bug as I'm sure the problem has been fixed in
>> newer kernels :-) The crashes occurred daily when logrotate ran;
>> /var/log is a separate 100GB XFS volume. I finally took the node down
>> and ran xfs_check on the /var/log volume and it did indeed report
>> some errors/warnings (see end of email). I then ran xfs_repair (see
>> end of email for output) and the node has been stable ever since.
>>
>> So my questions are:
>>
>> 1) I was under the impression that during the mounting of an XFS
>> volume some sort of check/repair is performed. How does that differ
>> from running xfs_check and/or xfs_repair?
>>
> AFAICT, there is no repair or check happening at mount time; only the
> information needed to mount the filesystem is checked. Everything else
> (like the consistency of blocks and inodes) is not checked at mount
> time. Maybe you mean the journal replay that happens during mount?
I knew about the journal replay but wasn't sure if anything else is
checked or fixed at mount time. Apparently not.
>> 2) Any ideas how the filesystem might have gotten into this state? I
>> don't have the history of that node, but it's possible that it crashed
>> previously due to an unrelated problem. Could this have left the
>> filesystem in this state?
>>
>> 3) What exactly does the output of xfs_check mean? How serious is
>> it? Are those warnings or errors? Will some of them get cleaned up
>> during the mounting of the filesystem?
>>
> comments below
>
>> 4) We have a whole bunch of production nodes running the same kernel.
>> I'm more than a little concerned that we might have a ticking time
>> bomb, with some filesystems being in a state that might trigger a
>> crash eventually. Is there any way to perform a live check on a
>> mounted filesystem so that I can get an idea of how big a problem we
>> have (if any)? I don't claim to know exactly what I'm doing, but I
>> picked a node, froze the filesystem, and then ran a modified
>> xfs_check (which bypasses the is_mounted check and ignores
>> non-committed metadata) and it did report some issues. At this point
>> I believe those are false positives. Do you have any suggestions
>> short of rebooting the nodes and running xfs_check on the unmounted
>> filesystem?
>>
> You must unmount the filesystem before running xfs_repair/xfs_check;
> even with a frozen fs you may get inconsistent data, which may lead to
> more problems.
And that's the problem I'm facing. I don't want to reboot production
nodes to run xfs_check/repair just to learn that the node is clean and
doesn't need fixing. I'd like to determine if a filesystem repair is
necessary while the node is still up.
>> Thanks
>> ....Juerg
>>
>>
>> (initramfs) xfs_check /dev/mapper/vg0-varlog
>> block 0/3538 expected type unknown got free2
>> block 0/13862 expected type unknown got free2
>> block 0/13863 expected type unknown got free2
>>
>> <SNIP>
>>
>> block 0/16983 expected type unknown got free2
>> block 0/16984 expected type unknown got free2
>> block 0/21700 expected type unknown got data
>> block 0/21701 expected type unknown got data
>>
>> <SNIP>
>>
>> block 0/21826 expected type unknown got data
>> block 0/21827 expected type unknown got data
>> block 0/21700 claimed by inode 178, previous inum 148
>> block 0/21701 claimed by inode 178, previous inum 148
>>
>> <SNIP>
>>
>> block 0/21826 claimed by inode 178, previous inum 148
>> block 0/21827 claimed by inode 178, previous inum 148
>> block 0/1250 expected type unknown got data
>> block 0/1251 expected type unknown got data
>>
>> <SNIP>
>>
>> block 0/1264 expected type unknown got data
>> block 0/1265 expected type unknown got data
>> block 0/1250 claimed by inode 1706, previous inum 148
>> block 0/1251 claimed by inode 1706, previous inum 148
>>
>> <SNIP>
>>
>> block 0/1264 claimed by inode 1706, previous inum 148
>> block 0/1265 claimed by inode 1706, previous inum 148
>> block 0/16729 expected type unknown got data
>> block 0/16730 expected type unknown got data
>>
>> <SNIP>
>>
>> block 0/16889 expected type unknown got data
>> block 0/16890 expected type unknown got data
>> block 0/16729 claimed by inode 1710, previous inum 148
>> block 0/16730 claimed by inode 1710, previous inum 148
>>
>> <SNIP>
>>
>> block 0/16889 claimed by inode 1710, previous inum 148
>> block 0/16890 claimed by inode 1710, previous inum 148
>> block 0/3523 expected type unknown got data
>> block 0/3524 expected type unknown got data
>>
>> <SNIP>
>>
>> block 0/3536 expected type unknown got data
>> block 0/3537 expected type unknown got data
>> block 0/3523 claimed by inode 1994, previous inum 148
>> block 0/3524 claimed by inode 1994, previous inum 148
>>
>> <SNIP>
>>
>> block 0/3536 claimed by inode 1994, previous inum 148
>> block 0/3537 claimed by inode 1994, previous inum 148
>> block 0/510 type unknown not expected
>> block 0/511 type unknown not expected
>> block 0/512 type unknown not expected
>>
>> <SNIP>
>>
>> block 0/2341 type unknown not expected
>> block 0/2342 type unknown not expected
>>
>>
>>
>> (initramfs) xfs_repair /dev/mapper/vg0-varlog
>> Phase 1 - find and verify superblock...
>> Phase 2 - using internal log
>> - zero log...
>> - scan filesystem freespace and inode maps...
>> - found root inode chunk
>> Phase 3 - for each AG...
>> - scan and clear agi unlinked lists...
>> - process known inodes and perform inode discovery...
>> - agno = 0
>> data fork in ino 148 claims free block 3538
>> data fork in ino 148 claims free block 13862
>> data fork in ino 148 claims free block 13863
>> data fork in ino 148 claims free block 16891
>> data fork in ino 148 claims free block 16892
>> data fork in regular inode 178 claims used block 21700
>> bad data fork in inode 178
>> cleared inode 178
>> data fork in regular inode 1706 claims used block 1250
>> bad data fork in inode 1706
>> cleared inode 1706
>> data fork in regular inode 1710 claims used block 16729
>> bad data fork in inode 1710
>> cleared inode 1710
>> data fork in regular inode 1994 claims used block 3523
>> bad data fork in inode 1994
>> cleared inode 1994
>> - agno = 1
>> - agno = 2
>> - agno = 3
>> - process newly discovered inodes...
>> Phase 4 - check for duplicate blocks...
>> - setting up duplicate extent list...
>> - check for inodes claiming duplicate blocks...
>> - agno = 0
>> - agno = 1
>> entry "syslog" at block 0 offset 2384 in directory inode 128
>> references free inode 178
>> - agno = 3
>> clearing inode number in entry at offset 2384...
>> - agno = 2
>> data fork in ino 148 claims dup extent, off - 0, start - 1250, cnt 16
>> bad data fork in inode 148
>> cleared inode 148
>> entry "nova-api.log" at block 0 offset 552 in directory inode 165
>> references free inode 1994
>> clearing inode number in entry at offset 552...
>> entry "nova-network.log.6" at block 0 offset 2608 in directory inode
>> 165 references free inode 1706
>> clearing inode number in entry at offset 2608...
>> entry "nova-api.log.6" at block 0 offset 4032 in directory inode 165
>> references free inode 1710
>> clearing inode number in entry at offset 4032...
>> Phase 5 - rebuild AG headers and trees...
>> - reset superblock...
>> Phase 6 - check inode connectivity...
>> - resetting contents of realtime bitmap and summary inodes
>> - traversing filesystem ...
>> entry "syslog.1" in directory inode 128 points to free inode 148
>> bad hash table for directory inode 128 (no data entry): rebuilding
>> rebuilding directory inode 128
>> bad hash table for directory inode 165 (no data entry): rebuilding
>> rebuilding directory inode 165
>> - traversal finished ...
>> - moving disconnected inodes to lost+found ...
>> Phase 7 - verify and correct link counts...
>> done
>>
>
> Looking through these logs, I'd say that the system crashed before while
> using storage without a battery-backed unit and some metadata was lost
> during the crash; or an xfs_repair was run; or the filesystem was mounted
> after the dirty log was dropped (like running xfs_repair -L).
We're using battery-backed RAID controllers. Why would that matter
though? We didn't suffer power outages, just regular kernel crashes
due to software bugs. In that case we lose the page cache and any
in-flight IOs or metadata updates, and thus could end up with
filesystem corruption, right?
> That's just my point of view though; other people may give you a more
> detailed answer.
It's appreciated, thanks!
...Juerg
>
> --
> --Carlos
* Re: Daily crash in xfs_cmn_err
From: Eric Sandeen @ 2012-10-30 19:02 UTC
To: Juerg Haefliger; +Cc: xfs
On 10/30/12 3:58 AM, Juerg Haefliger wrote:
> On Mon, Oct 29, 2012 at 1:53 PM, Dave Chinner <david@fromorbit.com> wrote:
>> On Mon, Oct 29, 2012 at 11:55:15AM +0100, Juerg Haefliger wrote:
>>> Hi,
>>>
>>> I have a node that used to crash every day at 6:25am in xfs_cmn_err
>>> (NULL pointer dereference).
>>
>> Stack trace, please.
>
>
> [128185.204521] BUG: unable to handle kernel NULL pointer dereference
> at 00000000000000f8
...
>
> The mp passed to xfs_cmn_err was a NULL pointer, and dereferencing
> mp->m_fsname in the printk line caused the crash (the offset of
> m_fsname is 0xf8, matching the faulting address).
>
> Error message extracted from the dump:
> XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1449 of file
> fs/xfs/xfs_alloc.c.
>
> And the comments in the source:
> 1444     /*
> 1445      * If this failure happens the request to free this
> 1446      * space was invalid, it's (partly) already free.
> 1447      * Very bad.
> 1448      */
> 1449     XFS_WANT_CORRUPTED_GOTO(gtbno >= bno + len, error0);
and:
#define XFS_WANT_CORRUPTED_GOTO(x,l) \
        ...
        XFS_ERROR_REPORT("XFS_WANT_CORRUPTED_GOTO", \
                         XFS_ERRLEVEL_LOW, NULL); \
so it explicitly passes a NULL to XFS_ERROR_REPORT(), which sends
it down the xfs_error_report->xfs_cmn_err path and boom.
So you have a persistent on-disk corruption, but it's causing
this to blow up due to an old bug.
I think it got fixed in 2.6.39; there it finds its way to
__xfs_printk(), which does:
        if (mp && mp->m_fsname) {
                printk("%sXFS (%s): %pV\n", level, mp->m_fsname, vaf);
                return;
        }
and so handles the NULL mp situation.
Anyway, I'd repair the fs; if you are paranoid, do:
xfs_metadump -o /dev/blah metadumpfile
xfs_mdrestore metadumpfile filesystem.img
xfs_repair filesystem.img
mount -o loop filesystem.img /some/place
first, and you can see for sure what xfs_repair will do to the real
device, and what the fs looks like when it's done (no data will be
present in the metadumped image, just metadata).
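Once the image is mounted you can poke around to see what repair
actually did, e.g. (a sketch, using the same /some/place as above):

    ls -l /some/place/lost+found   # disconnected inodes land here in Phase 6
    ls -l /some/place              # e.g. whether syslog.1 survived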
-Eric
>
>>> 1) I was under the impression that during the mounting of an XFS
>>> volume some sort of check/repair is performed. How does that differ
>>> from running xfs_check and/or xfs_repair?
>>
>> Journal recovery is performed at mount time, not a consistency
>> check.
>>
>> http://en.wikipedia.org/wiki/Filesystem_journaling
>
> Ah OK. Thanks for the clarification.
>
>
>>> 2) Any ideas how the filesystem might have gotten into this state? I
>>> don't have the history of that node, but it's possible that it crashed
>>> previously due to an unrelated problem. Could this have left the
>>> filesystem in this state?
>>
>> <shrug>
>>
>> How long is a piece of string?
>>
>>> 3) What exactly does the output of xfs_check mean? How serious is
>>> it? Are those warnings or errors? Will some of them get cleaned up
>>> during the mounting of the filesystem?
>>
>> xfs_check is deprecated. The output of xfs_repair indicates
>> cross-linked extent indexes. These will only get properly detected and
>> fixed by xfs_repair. And "fixed" may mean corrupt files are removed
>> from the filesystem - repair does not guarantee that your data is
>> preserved or consistent after it runs, just that the filesystem is
>> consistent and error free.
>>
>>> 4) We have a whole bunch of production nodes running the same kernel.
>>> I'm more than a little concerned that we might have a ticking time
>>> bomb, with some filesystems being in a state that might trigger a
>>> crash eventually. Is there any way to perform a live check on a
>>> mounted filesystem so that I can get an idea of how big a problem we
>>> have (if any)?
>>
>> Read the xfs_repair man page?
>>
>> -n No modify mode. Specifies that xfs_repair should not
>> modify the filesystem but should only scan the filesystem
>> and indicate what repairs would have been made.
>> .....
>>
>> -d Repair dangerously. Allow xfs_repair to repair an XFS
>> filesystem mounted read only. This is typically done on a
>> root filesystem from single user mode, immediately followed by
>> a reboot.
>>
>> So, remount read-only and run xfs_repair -d -n; that will check the
>> filesystem as well as can be done online. If there are any problems,
>> then you can repair them and immediately reboot.
>>
>>> I don't claim to know exactly what I'm doing, but I picked a
>>> node, froze the filesystem, and then ran a modified xfs_check (which
>>> bypasses the is_mounted check and ignores non-committed metadata) and
>>> it did report some issues. At this point I believe those are false
>>> positives. Do you have any suggestions short of rebooting the nodes and
>>> running xfs_check on the unmounted filesystem?
>>
>> Don't bother with xfs_check. xfs_repair will detect all the same
>> errors (and more) and can fix them at the same time.
>
> Thanks for the hints.
>
> ...Juerg
>
>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david@fromorbit.com