* Failing XFS filesystem underlying Ceph OSDs
@ 2015-07-03 9:07 Alex Gorbachev
2015-07-03 23:51 ` Dave Chinner
0 siblings, 1 reply; 11+ messages in thread
From: Alex Gorbachev @ 2015-07-03 9:07 UTC (permalink / raw)
To: xfs
Hello, we are seeing this and similar errors on multiple Supermicro nodes
running Ceph. OS is Ubuntu 14.04.2 with kernel 4.1
Thank you for any info and troubleshooting advice.
Alex Gorbachev
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261899] BUG: unable to handle kernel paging request at 000000190000001c
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261923] IP: [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261941] PGD 1035954067 PUD 0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261955] Oops: 0000 [#1] SMP
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261969] Modules linked in: xfs libcrc32c ipmi_ssif intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core lpc_ich joydev mei_me mei ioatdma wmi 8021q ipmi_si garp 8250_fintek mrp ipmi_msghandler stp llc bonding mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic usbhid hid igb ahci mpt2sas mlx4_core i2c_algo_bit libahci dca raid_class ptp scsi_transport_sas pps_core arcmsr
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262182] CPU: 10 PID: 8711 Comm: ceph-osd Not tainted 4.1.0-040100-generic #201506220235
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262197] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262215] task: ffff8800721f1420 ti: ffff880fbad54000 task.ti: ffff880fbad54000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262229] RIP: 0010:[<ffffffff8118e476>] [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262248] RSP: 0018:ffff880fbad571a8 EFLAGS: 00010246
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262258] RAX: ffff880004000158 RBX: 000000000000000e RCX: 0000000000000000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262303] RDX: ffff880004000158 RSI: ffff880fbad571c0 RDI: 0000001900000000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262347] RBP: ffff880fbad57208 R08: 00000000000000c0 R09: 00000000000000ff
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262391] R10: 0000000000000000 R11: 0000000000000220 R12: 00000000000000b6
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262435] R13: ffff880fbad57268 R14: 000000000000000a R15: ffff880fbad572d8
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262479] FS: 00007f98cb0e0700(0000) GS:ffff88103f480000(0000) knlGS:0000000000000000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262524] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262551] CR2: 000000190000001c CR3: 0000001034f0e000 CR4: 00000000000407e0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262596] Stack:
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262618] ffff880fbad571f8 ffff880cf6076b30 ffff880bdde05da8 00000000000000e6
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262669] 0000000000000100 ffff880cf6076b28 00000000000000b5 ffff880fbad57258
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262721] ffff880fbad57258 ffff880fbad572d8 ffffffffffffffff ffff880cf6076b28
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262772] Call Trace:
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262801] [<ffffffff8119b482>] pagevec_lookup_entries+0x22/0x30
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262831] [<ffffffff8119bd84>] truncate_inode_pages_range+0xf4/0x700
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262862] [<ffffffff8119c415>] truncate_inode_pages+0x15/0x20
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262891] [<ffffffff8119c53f>] truncate_inode_pages_final+0x5f/0xa0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262949] [<ffffffffc0431c2c>] xfs_fs_evict_inode+0x3c/0xe0 [xfs]
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262981] [<ffffffff81220558>] evict+0xb8/0x190
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263009] [<ffffffff81220671>] dispose_list+0x41/0x50
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263037] [<ffffffff8122176f>] prune_icache_sb+0x4f/0x60
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263067] [<ffffffff81208ab5>] super_cache_scan+0x155/0x1a0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263096] [<ffffffff8119d26f>] do_shrink_slab+0x13f/0x2c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263126] [<ffffffff811a22b0>] ? shrink_lruvec+0x330/0x370
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263157] [<ffffffff811b4189>] ? isolate_migratepages_block+0x299/0x5c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263188] [<ffffffff8119d558>] shrink_slab+0xd8/0x110
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263217] [<ffffffff811a25bf>] shrink_zone+0x2cf/0x300
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263246] [<ffffffff811b4d3d>] ? compact_zone+0x7d/0x4f0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263275] [<ffffffff811a2a64>] shrink_zones+0x104/0x2a0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263304] [<ffffffff811b53ad>] ? compact_zone_order+0x5d/0x70
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263336] [<ffffffff810f1666>] ? ktime_get+0x46/0xb0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263365] [<ffffffff811a2cd7>] do_try_to_free_pages+0xd7/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263396] [<ffffffff811a3017>] try_to_free_pages+0xb7/0x170
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263427] [<ffffffff8119571a>] __alloc_pages_nodemask+0x5ba/0x9c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263460] [<ffffffff811dc9bc>] alloc_pages_current+0x9c/0x110
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263492] [<ffffffff811e4f2a>] allocate_slab+0x20a/0x2e0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263522] [<ffffffff811e5031>] new_slab+0x31/0x1f0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263553] [<ffffffff817f8dd9>] __slab_alloc+0x18e/0x2a3
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263584] [<ffffffff816d7817>] ? __alloc_skb+0x87/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263614] [<ffffffff816d77e7>] ? __alloc_skb+0x57/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263643] [<ffffffff811e9b7b>] __kmalloc_node_track_caller+0xbb/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263675] [<ffffffff816d7817>] ? __alloc_skb+0x87/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263704] [<ffffffff816d737c>] __kmalloc_reserve.isra.57+0x3c/0xa0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263734] [<ffffffff816d7817>] __alloc_skb+0x87/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263766] [<ffffffff81737de1>] sk_stream_alloc_skb+0x41/0x130
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263796] [<ffffffff817388b3>] tcp_sendmsg+0x2d3/0xa90
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263827] [<ffffffff81764477>] inet_sendmsg+0x67/0xa0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263858] [<ffffffff816cea54>] ? copy_msghdr_from_user+0x154/0x1b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263891] [<ffffffff816cdcfd>] sock_sendmsg+0x4d/0x60
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263920] [<ffffffff816cef93>] ___sys_sendmsg+0x2b3/0x2c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263950] [<ffffffff810a853c>] ? ttwu_do_wakeup+0x2c/0x100
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263979] [<ffffffff810a8826>] ? ttwu_do_activate.constprop.121+0x66/0x70
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264011] [<ffffffff810abef5>] ? try_to_wake_up+0x215/0x2a0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264040] [<ffffffff810abfb0>] ? wake_up_state+0x10/0x20
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264071] [<ffffffff810fce86>] ? wake_futex+0x76/0xb0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264099] [<ffffffff810fe192>] ? futex_wake+0x72/0x140
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264127] [<ffffffff81222675>] ? __fget_light+0x25/0x70
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264155] [<ffffffff816cf9b9>] __sys_sendmsg+0x49/0x90
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264184] [<ffffffff816cfa19>] SyS_sendmsg+0x19/0x20
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264215] [<ffffffff8180d272>] system_call_fastpath+0x16/0x75
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264243] Code: 00 4c 89 65 c0 31 d2 e9 86 00 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 3a 48 85 ff 0f 84 ad 00 00 00 40 f6 c7 03 0f 85 a9 00 00 00 <8b> 4f 1c 85 c9 74 e3 8d 71 01 4c 8d 47 1c 89 c8 f0 0f b1 77 1c
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264467] RIP [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264499] RSP <ffff880fbad571a8>
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264522] CR2: 000000190000001c
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264824] ---[ end trace ae271fe24c8d817e ]---
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-03 9:07 Failing XFS filesystem underlying Ceph OSDs Alex Gorbachev
@ 2015-07-03 23:51 ` Dave Chinner
2015-07-04 14:46 ` Alex Gorbachev
0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2015-07-03 23:51 UTC (permalink / raw)
To: Alex Gorbachev; +Cc: xfs
On Fri, Jul 03, 2015 at 05:07:29AM -0400, Alex Gorbachev wrote:
> Hello, we are seeing this and similar errors on multiple Supermicro nodes
> running Ceph. OS is Ubuntu 14.04.2 with kernel 4.1
>
> Thank you for any info and troubleshooting advice.
Nothing to suggest that this is an XFS problem. Memory reclaim
triggered by network stack memory pressure is causing inode
eviction. While removing the page cache it's falling over in
the generic truncate code doing a radix tree lookup. That's all
generic code - XFS never touches the page cache radix tree directly.
I haven't seen this before - is this a new problem since you
upgraded your kernel to 4.1? Is it repeatable? If yes to both, then
a bisect may be in order to isolate the problematic commit...
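For reference, a bisect of that kind usually looks like the sketch below;
the good/bad endpoints are assumptions (the last kernel that ran clean and
the first one that crashed) and would need adjusting to your own history:

    # hedged sketch - assumes a mainline git tree and a reliable reproducer
    git bisect start
    git bisect bad v4.1         # first kernel known to crash (assumption)
    git bisect good v3.19       # last kernel believed to be good (assumption)
    # build and boot each candidate, run the workload, then mark it:
    #   git bisect good    (no oops)   or   git bisect bad    (oops reproduces)
    # repeat until git names the first bad commit, then:
    git bisect reset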
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-03 23:51 ` Dave Chinner
@ 2015-07-04 14:46 ` Alex Gorbachev
2015-07-04 23:38 ` Dave Chinner
0 siblings, 1 reply; 11+ messages in thread
From: Alex Gorbachev @ 2015-07-04 14:46 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hello Dave, thank you for the response. I got some recommendations on the
ceph-users list that essentially pointed to the problem with
vm.swappiness=0 and its new behavior - described here
https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/
Basically setting it to 0 creates these OOM conditions due to never
swapping anything out. So I changed these settings right away:
sysctl vm.swappiness=20 (can probably be 1 as per article)
sysctl vm.min_free_kbytes=262144
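To make those settings persist across reboots, a minimal sketch (the file
name is hypothetical; this assumes the stock sysctl handling on Ubuntu
14.04):

    # /etc/sysctl.d/90-ceph-osd.conf
    vm.swappiness = 20
    vm.min_free_kbytes = 262144

    # apply without rebooting:
    sudo sysctl -p /etc/sysctl.d/90-ceph-osd.conf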
So far no issues, but I need to wait a week to see if anything shows up.
Thank you for reviewing the error codes.
Alex
On Fri, Jul 3, 2015 at 7:51 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Jul 03, 2015 at 05:07:29AM -0400, Alex Gorbachev wrote:
> > Hello, we are seeing this and similar errors on multiple Supermicro nodes
> > running Ceph. OS is Ubuntu 14.04.2 with kernel 4.1
> >
> > Thank you for any info and troubleshooting advice.
>
> Nothing to suggest that this is an XFS problem. Memory reclaim
> triggered by network stack memory pressure is causing inode
> eviction. While removing the page cache it's falling over in
> the generic truncate code doing a radix tree lookup. That's all
> generic code - XFS never touches the page cache radix tree directly.
>
> I haven't seen this before - is this a new problem since you
> upgraded your kernel to 4.1? Is it repeatable? If yes to both, then
> a bisect may be in order to isolate the problematic commit...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-04 14:46 ` Alex Gorbachev
@ 2015-07-04 23:38 ` Dave Chinner
2015-07-05 4:25 ` Alex Gorbachev
0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2015-07-04 23:38 UTC (permalink / raw)
To: Alex Gorbachev; +Cc: xfs
On Sat, Jul 04, 2015 at 10:46:24AM -0400, Alex Gorbachev wrote:
> Hello Dave, thank you for the response. I got some recommendations on the
> ceph-users list that essentially pointed to the problem with
> vm.swappiness=0 and its new behavior - described here
> https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/
>
> Basically setting it to 0 creates these OOM conditions due to never
> swapping anything out. So I changed these settings right away:
>
> sysctl vm.swappiness=20 (can probably be 1 as per article)
>
> sysctl vm.min_free_kbytes=262144
That's not an explanation for what looks to be page cache radix
tree corruption. Memory reclaim still occurs with the settings you
have now and, well, those changes occurred back in 3.5 - some
3 years ago - so it's not really an explanation for a problem with a
recent 4.1 kernel...
> So far no issues, but I need to wait a week to see if anything shows up.
> Thank you for reviewing the error codes.
I expect that you'll see the problems again...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-04 23:38 ` Dave Chinner
@ 2015-07-05 4:25 ` Alex Gorbachev
2015-07-05 23:24 ` Dave Chinner
0 siblings, 1 reply; 11+ messages in thread
From: Alex Gorbachev @ 2015-07-05 4:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hi Dave,
> > sysctl vm.swappiness=20 (can probably be 1 as per article)
> >
> > sysctl vm.min_free_kbytes=262144
>
> That's not an explanation for what looks to be page cache radix
> tree corruption. Memory reclaim still occurs with the settings you
> have now and, well, those changes occurred back in 3.5 - some
> 3 years ago - so it's not really an explanation for a problem with a
> recent 4.1 kernel...
>
> > So far no issues, but I need to wait a week to see if anything shows up.
> > Thank you for reviewing the error codes.
>
> I expect that you'll see the problems again...
>
We have experienced the problem in various guises with kernels 3.14, 3.19,
4.1-rc2 and now 4.1, so it's not new to us, just a different error stack.
Below are some other stack dumps of what manifested as the same error.
Do you think we need to look at RAM handling by this Supermicro machine
type?
Best regards,
Alex
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070485.999993] [<ffffffff817cf4b9>] schedule+0x29/0x70
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000010] [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000014] [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000030] [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000047] [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000050] [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000067] [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000083] [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000086] [<ffffffff81095e29>] kthread+0xc9/0xe0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000089] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000093] [<ffffffff817d3718>] ret_from_fork+0x58/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000096] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000100] INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000113] Not tainted 3.19.4-031904-generic #201504131440
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000125] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000140] xfsaild/sdg1 D ffff8810102dfcd8 0 2606 2 0x00000000
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000143] ffff8810102dfcd8 ffff8810344c2d80 ffff8810102dffd8 0000000000013e40
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000146] ffff881033383b80 ffff8810389589d0 ffff881033491d70 ffff8810102dfcf8
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000149] ffff8810344c3100 ffff881033491d70 ffff881032acd128 ffff88107fc75100
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000152] Call Trace:
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000155] [<ffffffff817cf4b9>] schedule+0x29/0x70
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000172] [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000175] [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000192] [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000208] [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000211] [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000228] [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000244] [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000247] [<ffffffff81095e29>] kthread+0xc9/0xe0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000251] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000254] [<ffffffff817d3718>] ret_from_fork+0x58/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000257] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000261] INFO: task xfsaild/sdd1:2707 blocked for more than 120 seconds.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000273] Not tainted 3.19.4-031904-generic #201504131440
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000285] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000300] xfsaild/sdd1 D ffff88100f813cd8 0 2707 2 0x00000000
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000303] ffff88100f813cd8 ffff88102b1d9480 ffff88100f813fd8 0000000000013e40
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000306] ffff881033387700 ffff881038959d70 ffff8810351a09d0 ffff88100f813cf8
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000310] ffff88102b1d8800 ffff8810351a09d0 ffff8810329acd28 ffff88107fcb5100
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000313] Call Trace:
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414753] BUG: unable to handle kernel NULL pointer dereference at (null)
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414779] IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414823] PGD 1013574067 PUD 100ffcb067 PMD 0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414842] Oops: 0000 [#1] SMP
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414856] Modules linked in: xfs libcrc32c intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core joydev mei_me mei lpc_ich ioatdma ipmi_si 8250_fintek ipmi_msghandler 8021q garp mrp stp llc wmi bonding mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic usbhid mpt2sas ahci igb hid libahci i2c_algo_bit mlx4_core dca raid_class ptp scsi_transport_sas pps_core arcmsr
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415068] CPU: 0 PID: 687310 Comm: ceph-osd Not tainted 4.1.0-040100rc2-generic #201505032335
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415084] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415102] task: ffff88100f33eeb0 ti: ffff880065cb0000 task.ti: ffff880065cb0000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415116] RIP: 0010:[<ffffffffc04be80f>] [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415155] RSP: 0018:ffff880065cb32c8 EFLAGS: 00010286
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415165] RAX: 0000000000000000 RBX: ffff880065cb37b0 RCX: 0000000000000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415210] RDX: ffff880065cb32f4 RSI: ffff880065cb32f0 RDI: ffff880004000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415254] RBP: ffff880065cb32c8 R08: ffff880b687bf478 R09: ffff880065cb3270
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415297] R10: dead000000100100 R11: dead000000200200 R12: ffffea000ccf9840
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415341] R13: ffffea000ccf9860 R14: ffffea000ccf9840 R15: ffff880ab6ce39d0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415386] FS: 00007ff4be95c700(0000) GS:ffff88103f200000(0000) knlGS:0000000000000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415432] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415459] CR2: 0000000000000000 CR3: 000000007942d000 CR4: 00000000000407f0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415502] Stack:
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415523] ffff880065cb3328 ffffffffc04be880 0000000000000000 ffffea000ccf9840
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415574] ffff880065cb3498 0000000000000000 ffffea000ccf9840 ffff880065cb37b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415626] ffff880065cb3498 ffffea000ccf9860 ffffea000ccf9840 ffff880ab6ce3b28
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415677] Call Trace:
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415718] [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415751] [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415782] [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415812] [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415843] [<ffffffff8119bf2d>] ? get_lru_size+0x3d/0x40
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415872] [<ffffffff811a0fb4>] shrink_lruvec+0x234/0x370
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415901] [<ffffffff81198e25>] ? pagevec_lru_move_fn+0xf5/0x110
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415931] [<ffffffff811a11db>] shrink_zone+0xeb/0x300
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415961] [<ffffffff811b38fd>] ? compact_zone+0x7d/0x4f0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415990] [<ffffffff811a1864>] shrink_zones+0x104/0x2a0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416019] [<ffffffff811b3f6d>] ? compact_zone_order+0x5d/0x70
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416051] [<ffffffff810f0666>] ? ktime_get+0x46/0xb0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416079] [<ffffffff811a1ad7>] do_try_to_free_pages+0xd7/0x160
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416110] [<ffffffff811a1e17>] try_to_free_pages+0xb7/0x170
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416139] [<ffffffff8119450a>] __alloc_pages_nodemask+0x5ba/0x9c0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416172] [<ffffffff811db51c>] alloc_pages_current+0x9c/0x110
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416203] [<ffffffff811e3a8a>] allocate_slab+0x20a/0x2e0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416232] [<ffffffff811e3b91>] new_slab+0x31/0x1f0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416261] [<ffffffff817ef6a4>] __slab_alloc+0x18e/0x2a3
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416290] [<ffffffff816cea97>] ? __alloc_skb+0x87/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416318] [<ffffffff816cea67>] ? __alloc_skb+0x57/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416348] [<ffffffff811e8687>] __kmalloc_node_track_caller+0xb7/0x290
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416379] [<ffffffff816cea97>] ? __alloc_skb+0x87/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416407] [<ffffffff816ce5fc>] __kmalloc_reserve.isra.57+0x3c/0xa0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416437] [<ffffffff816cea97>] __alloc_skb+0x87/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416467] [<ffffffff8172ee91>] sk_stream_alloc_skb+0x41/0x130
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416496] [<ffffffff8172f963>] tcp_sendmsg+0x2d3/0xa90
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416526] [<ffffffff8175b467>] inet_sendmsg+0x67/0xa0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416555] [<ffffffff816c5cb4>] ? copy_msghdr_from_user+0x154/0x1b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416585] [<ffffffff816c4f5d>] sock_sendmsg+0x4d/0x60
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416613] [<ffffffff816c61f3>] ___sys_sendmsg+0x2b3/0x2c0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416644] [<ffffffff810fcf67>] ? futex_wait+0x1c7/0x2c0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416673] [<ffffffff810fd122>] ? futex_wake+0x72/0x140
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416703] [<ffffffff81221185>] ? __fget_light+0x25/0x70
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416731] [<ffffffff816c6c19>] __sys_sendmsg+0x49/0x90
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416760] [<ffffffff816c6c79>] SyS_sendmsg+0x19/0x20
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416788] [<ffffffff81803bb2>] system_call_fastpath+0x16/0x75
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416816] Code: 07 48 89 e5 f6 c4 08 74 43 48 8b 7f 30 48 89 f8 eb 19 66 2e 0f 1f 84 00 00 00 00 00 c7 02 01 00 00 00 48 8b 40 08 48 39 c7 74 1f <48> 8b 08 80 e5 10 75 e9 48 8b 08 80 e5 02 74 e7 c7 06 01 00 00
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417040] RIP [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417094] RSP <ffff880065cb32c8>
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417118] CR2: 0000000000000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417402] ---[ end trace 71135a48b0dfd313 ]---
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759401] BUG: unable to handle kernel NULL pointer dereference at (null)
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759457] IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759519] PGD 0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759543] Oops: 0000 [#2] SMP
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759571] Modules linked in: xfs libcrc32c intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core joydev mei_me mei lpc_ich ioatdma ipmi_si 8250_fintek ipmi_msghandler 8021q garp mrp stp llc wmi bonding mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic usbhid mpt2sas ahci igb hid libahci i2c_algo_bit mlx4_core dca raid_class ptp scsi_transport_sas pps_core arcmsr
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759889] CPU: 7 PID: 110 Comm: kswapd0 Tainted: G D 4.1.0-040100rc2-generic #201505032335
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759939] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759987] task: ffff88103712bc60 ti: ffff8810331bc000 task.ti: ffff8810331bc000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760032] RIP: 0010:[<ffffffffc04be80f>] [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760100] RSP: 0018:ffff8810331bf818 EFLAGS: 00010282
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760127] RAX: 0000000000000000 RBX: ffffea000cd08d80 RCX: 0000000000000000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760171] RDX: ffff8810331bf844 RSI: ffff8810331bf840 RDI: ffff880004000068
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760216] RBP: ffff8810331bf818 R08: ffff880b687bf478 R09: ffff8810331bf850
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760260] R10: ffff880b687bf518 R11: 0000000000000220 R12: ffffea000cd08d80
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760304] R13: 0000000000000000 R14: ffff880ab6ce3b28 R15: ffff880ab6ce39d0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760348] FS: 0000000000000000(0000) GS:ffff88103f3c0000(0000) knlGS:0000000000000000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760394] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760422] CR2: 0000000000000000 CR3: 0000000001e0f000 CR4: 00000000000407e0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760466] Stack:
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760487] ffff8810331bf878 ffffffffc04be880 ffff880ab6ce3b40 ffff880ab6ce3b28
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760539] ffff8810331bf888 0000000000000000 ffff880b687bf478 ffffea000cd08d80
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760591] ffff880ab6ce3b28 0000000000000000 ffff880ab6ce3b28 000000000000000b
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760643] Call Trace:
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760686] [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760720] [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760752] [<ffffffff8119b431>] invalidate_inode_page+0x71/0x90
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760783] [<ffffffff8119b528>] invalidate_mapping_pages+0xd8/0x1e0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760815] [<ffffffff8121ff22>] inode_lru_isolate+0x122/0x1b0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760846] [<ffffffff811b54b1>] __list_lru_walk_one.isra.6+0x81/0x150
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760877] [<ffffffff8121fe00>] ? insert_inode_locked+0x190/0x190
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760907] [<ffffffff8121fe00>] ? insert_inode_locked+0x190/0x190
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760938] [<ffffffff811b566a>] list_lru_walk_one+0x4a/0x60
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760966] [<ffffffff81220274>] prune_icache_sb+0x44/0x60
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760996] [<ffffffff81207505>] super_cache_scan+0x155/0x1a0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761026] [<ffffffff8119c06f>] do_shrink_slab+0x13f/0x2c0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761056] [<ffffffff811a10b0>] ? shrink_lruvec+0x330/0x370
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761086] [<ffffffff810aeb95>] ? sched_clock_cpu+0x85/0xc0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761115] [<ffffffff8119c358>] shrink_slab+0xd8/0x110
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761143] [<ffffffff811a13bf>] shrink_zone+0x2cf/0x300
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761173] [<ffffffff811aa500>] ? zone_statistics+0xa0/0xc0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761203] [<ffffffff811a16e0>] kswapd_shrink_zone+0xe0/0x160
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761232] [<ffffffff811a2323>] balance_pgdat+0x2c3/0x430
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761261] [<ffffffff811a25c3>] kswapd+0x133/0x250
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761289] [<ffffffff811a2490>] ? balance_pgdat+0x430/0x430
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761322] [<ffffffff8109cdc9>] kthread+0xc9/0xe0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761350] [<ffffffff8109cd00>] ? flush_kthread_worker+0x90/0x90
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761381] [<ffffffff81803fe2>] ret_from_fork+0x42/0x70
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761410] [<ffffffff8109cd00>] ? flush_kthread_worker+0x90/0x90
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761439] Code: 07 48 89 e5 f6 c4 08 74 43 48 8b 7f 30 48 89 f8 eb 19 66 2e 0f 1f 84 00 00 00 00 00 c7 02 01 00 00 00 48 8b 40 08 48 39 c7 74 1f <48> 8b 08 80 e5 10 75 e9 48 8b 08 80 e5 02 74 e7 c7 06 01 00 00
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761669] RIP [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761725] RSP <ffff8810331bf818>
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761749] CR2: 0000000000000000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.762039] ---[ end trace 71135a48b0dfd314 ]---
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-05 4:25 ` Alex Gorbachev
@ 2015-07-05 23:24 ` Dave Chinner
2015-07-06 19:20 ` Alex Gorbachev
0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2015-07-05 23:24 UTC (permalink / raw)
To: Alex Gorbachev; +Cc: xfs
[ Please turn off line wrap when pasting kernel traces ]
On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
> > > sysctl vm.swappiness=20 (can probably be 1 as per article)
> > >
> > > sysctl vm.min_free_kbytes=262144
> >
> > That's not an explanation for what looks to be page cache radix
> tree corruption. Memory reclaim still occurs with the settings you
> > have now and, well, those changes occurred back in 3.5 - some
> > 3 years ago - so it's not really an explanation for a problem with a
> > recent 4.1 kernel...
> >
> > > So far no issues, but I need to wait a week to see if anything shows up.
> > > Thank you for reviewing the error codes.
> >
> > I expect that you'll see the problems again...
>
> We have experienced the problem in various guises with kernels 3.14, 3.19,
> 4.1-rc2 and now 4.1, so it's not new to us, just a different error stack.
> Below are some other stack dumps of what manifested as the same error.
>
> [<ffffffff817cf4b9>] schedule+0x29/0x70
> [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
> [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
> [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
> [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
> [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
> [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
> [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
> [<ffffffff81095e29>] kthread+0xc9/0xe0
> [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> [<ffffffff817d3718>] ret_from_fork+0x58/0x90
> [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
> Not tainted 3.19.4-031904-generic #201504131440
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
That's indicative of IO completion problems, but not a crash.
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
....
> [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
> [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
> [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
> [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
....
Again, this is indicative of a page cache issue: a page without
buffers has been passed to xfs_vm_releasepage(), which implies the
page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
page->private is null...
Again, this is unlikely to be an XFS issue.
> Do you think we need to look at RAM handling by this Supermicro machine
> type?
Not sure what you mean by that. Problems like this can be caused by
bad hardware, but it's unusual for a machine using ECC memory to
have undetected RAM problems...
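If you want to rule that out, the EDAC counters are a cheap check - a rough
sketch, assuming the sb_edac driver from your module list has registered the
memory controllers:

    # corrected/uncorrected error counts per memory controller
    grep . /sys/devices/system/edac/mc/mc*/ce_count
    grep . /sys/devices/system/edac/mc/mc*/ue_count
    # or, if the edac-utils package is installed:
    edac-util -v

Non-zero counts would point at the hardware; all zeroes makes a RAM problem
less likely but doesn't fully rule it out.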
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-05 23:24 ` Dave Chinner
@ 2015-07-06 19:20 ` Alex Gorbachev
2015-07-07 0:35 ` Dave Chinner
0 siblings, 1 reply; 11+ messages in thread
From: Alex Gorbachev @ 2015-07-06 19:20 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Thank you Dave,
On Sun, Jul 5, 2015 at 7:24 PM, Dave Chinner <david@fromorbit.com> wrote:
> [ Please turn off line wrap when pasting kernel traces ]
>
Noted, sorry, I thought wrap was off, but Google must have it on by default.
>
> On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
> > > > sysctl vm.swappiness=20 (can probably be 1 as per article)
> > > >
> > > > sysctl vm.min_free_kbytes=262144
> > >
> [...]
> >
> > We have experienced the problem in various guises with kernels 3.14, 3.19,
> > 4.1-rc2 and now 4.1, so it's not new to us, just a different error stack.
> > Below are some other stack dumps of what manifested as the same error.
> >
> > [<ffffffff817cf4b9>] schedule+0x29/0x70
> > [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
> > [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
> > [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
> > [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
> > [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
> > [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
> > [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
> > [<ffffffff81095e29>] kthread+0xc9/0xe0
> > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> > [<ffffffff817d3718>] ret_from_fork+0x58/0x90
> > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> > INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
> > Not tainted 3.19.4-031904-generic #201504131440
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
>
> That's indicative of IO completion problems, but not a crash.
>
> > BUG: unable to handle kernel NULL pointer dereference at (null)
> > IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
> ....
> > [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
> > [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
> > [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
> > [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
> ....
>
> Again, this is indicative of a page cache issue: a page without
> buffers has been passed to xfs_vm_releasepage(), which implies the
> page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
> page->private is null...
>
> Again, this is unlikely to be an XFS issue.
>
Sorry for my ignorance, but would this likely come from Ceph code or a
hardware issue of some kind, such as a disk drive? I have reached out to
RedHat and the Ceph community on that as well.
Thank you,
Alex
>
> > Do you think we need to look at RAM handling by this Supermicro machine
> > type?
>
> Not sure what you mean by that. Problems like this can be caused by
> bad hardware, but it's unusual for a machine using ECC memory to
> have undetected RAM problems...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-06 19:20 ` Alex Gorbachev
@ 2015-07-07 0:35 ` Dave Chinner
2015-07-22 12:23 ` Alex Gorbachev
0 siblings, 1 reply; 11+ messages in thread
From: Dave Chinner @ 2015-07-07 0:35 UTC (permalink / raw)
To: Alex Gorbachev; +Cc: xfs
On Mon, Jul 06, 2015 at 03:20:19PM -0400, Alex Gorbachev wrote:
> On Sun, Jul 5, 2015 at 7:24 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
> > > > > sysctl vm.swappiness=20 (can probably be 1 as per article)
> > > > >
> > > > > sysctl vm.min_free_kbytes=262144
> > > >
> > [...]
> > >
> > > We have experienced the problem in various guises with kernels 3.14, 3.19,
> > > 4.1-rc2 and now 4.1, so it's not new to us, just a different error stack.
> > > Below are some other stack dumps of what manifested as the same error.
> > >
> > > [<ffffffff817cf4b9>] schedule+0x29/0x70
> > > [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
> > > [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
> > > [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
> > > [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
> > > [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
> > > [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
> > > [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
> > > [<ffffffff81095e29>] kthread+0xc9/0xe0
> > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> > > [<ffffffff817d3718>] ret_from_fork+0x58/0x90
> > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> > > INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
> > > Not tainted 3.19.4-031904-generic #201504131440
> > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> > message.
> >
> > That's indicative of IO completion problems, but not a crash.
> >
> > > BUG: unable to handle kernel NULL pointer dereference at (null)
> > > IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
> > ....
> > > [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
> > > [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
> > > [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
> > > [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
> > ....
> >
> > Again, this is indicative of a page cache issue: a page without
> > buffers has been passed to xfs_vm_releasepage(), which implies the
> > page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
> > page->private is null...
> >
> > Again, this is unlikely to be an XFS issue.
> >
>
> Sorry for my ignorance, but would this likely come from Ceph code or a
> hardware issue of some kind, such as a disk drive? I have reached out to
> RedHat and the Ceph community on that as well.
More likely a kernel bug somewhere in the page cache or memory
reclaim paths. The issue is that we only notice the problem long
after it has occurred. i.e. when XFS goes to tear down the page it has
been handed, the page is already in a bad state and so it doesn't
really tell us anything about the cause of the problem.
Realistically, we need a script that reproduces the problem (that
doesn't require a Ceph cluster) to be able to isolate the cause.
In the meantime, you can always try running CONFIG_XFS_WARN=y to
see if that catches problems earlier, and you might also want to do
things like turn on memory poisoning and other kernel debugging
options to try to isolate the cause of the issue....
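A rough sketch of what I mean (the option names are from memory and differ
a little between kernel versions, so treat them as assumptions to verify
against your tree):

    # kernel config for a self-built debug kernel:
    CONFIG_XFS_WARN=y           # turn XFS ASSERTs into warnings
    CONFIG_SLUB_DEBUG=y         # compile in slab debugging support
    CONFIG_DEBUG_PAGEALLOC=y    # catch use-after-free of whole pages

    # and on the kernel command line:
    slub_debug=FZP              # sanity checks, red zoning, poisoning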
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-07 0:35 ` Dave Chinner
@ 2015-07-22 12:23 ` Alex Gorbachev
2015-08-13 14:25 ` Alex Gorbachev
0 siblings, 1 reply; 11+ messages in thread
From: Alex Gorbachev @ 2015-07-22 12:23 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hi Dave,
On Mon, Jul 6, 2015 at 8:35 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Jul 06, 2015 at 03:20:19PM -0400, Alex Gorbachev wrote:
> > On Sun, Jul 5, 2015 at 7:24 PM, Dave Chinner <david@fromorbit.com>
> wrote:
> > > On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
> > > > > > sysctl vm.swappiness=20 (can probably be 1 as per article)
> > > > > >
> > > > > > sysctl vm.min_free_kbytes=262144
> > > > >
> > > [...]
> > > >
> > > > We have experienced the problem in various guises with kernels 3.14, 3.19,
> > > > 4.1-rc2 and now 4.1, so it's not new to us, just a different error stack.
> > > > Below are some other stack dumps of what manifested as the same error.
> > > >
> > > > [<ffffffff817cf4b9>] schedule+0x29/0x70
> > > > [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
> > > > [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
> > > > [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
> > > > [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
> > > > [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
> > > > [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
> > > > [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
> > > > [<ffffffff81095e29>] kthread+0xc9/0xe0
> > > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> > > > [<ffffffff817d3718>] ret_from_fork+0x58/0x90
> > > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> > > > INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
> > > > Not tainted 3.19.4-031904-generic #201504131440
> > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> > > message.
> > >
> > > That's indicative of IO completion problems, but not a crash.
> > >
> > > > BUG: unable to handle kernel NULL pointer dereference at (null)
> > > > IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
> > > ....
> > > > [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
> > > > [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
> > > > [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
> > > > [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
> > > ....
> > >
> > > Again, this is indicative of a page cache issue: a page without
> > > buffers has been passed to xfs_vm_releasepage(), which implies the
> > > page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
> > > page->private is null...
> > >
> > > Again, this is unlikely to be an XFS issue.
> > >
> >
> > Sorry for my ignorance, but would this likely come from Ceph code or a
> > hardware issue of some kind, such as a disk drive? I have reached out to
> > RedHat and the Ceph community on that as well.
>
> More likely a kernel bug somewhere in the page cache or memory
> reclaim paths. The issue is that we only notice the problem long
> after it has occurred. i.e. when XFS goes to tear down the page it has
> been handed, the page is already in a bad state and so it doesn't
> really tell us anything about the cause of the problem.
>
> Realistically, we need a script that reproduces the problem (that
> doesn't require a Ceph cluster) to be able to isolate the cause.
> In the meantime, you can always try running CONFIG_XFS_WARN=y to
> see if that catches problems earlier, and you might also want to do
> things like turn on memory poisoning and other kernel debugging
> options to try to isolate the cause of the issue....
>
We have been error free for almost 3 weeks now with these changes:
vm.swappiness=1
vm.min_free_kbytes=262144
I wonder if this is related to us using high-speed Areca HBAs with RAM
writeback cache and having had vm.swappiness=0 previously. Possibly the
HBA is handing down a large chunk of IO very fast and the page cache is not
able to handle it with swappiness=0. I will keep monitoring, but thank you
very much for the analysis and info.
Alex
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
>
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-07-22 12:23 ` Alex Gorbachev
@ 2015-08-13 14:25 ` Alex Gorbachev
2016-03-11 3:26 ` Alex Gorbachev
0 siblings, 1 reply; 11+ messages in thread
From: Alex Gorbachev @ 2015-08-13 14:25 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Good morning,
We have experienced one more failure like the ones originally described. I
am assuming the vm.min_free_kbytes at 256 MB helped (only one hit: the OSD
went down but the rest of the cluster stayed up, unlike the previous massive
storms). So I went ahead and increased vm.min_free_kbytes to 1 GB.
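For reference, a minimal sketch of that change (1 GB expressed in kB, since
the sysctl takes kilobytes):

    sudo sysctl vm.min_free_kbytes=1048576   # 1 GB = 1048576 kB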
I do not know of any way to reproduce the problem, or what causes it.
There is no unusual IO pattern at the time that we are aware of.
Thanks,
Alex
On Wed, Jul 22, 2015 at 8:23 AM, Alex Gorbachev <ag@iss-integration.com> wrote:
> Hi Dave,
>
> On Mon, Jul 6, 2015 at 8:35 PM, Dave Chinner <david@fromorbit.com> wrote:
>
>> On Mon, Jul 06, 2015 at 03:20:19PM -0400, Alex Gorbachev wrote:
>> > On Sun, Jul 5, 2015 at 7:24 PM, Dave Chinner <david@fromorbit.com> wrote:
>> > > On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
>> > > > > > sysctl vm.swappiness=20 (can probably be 1 as per article)
>> > > > > >
>> > > > > > sysctl vm.min_free_kbytes=262144
>> > > > >
>> > > [...]
>> > > >
>> > > > We have experienced the problem in various guises with kernels 3.14, 3.19,
>> > > > 4.1-rc2 and now 4.1, so it's not new to us, just a different error stack.
>> > > > Below are some other stack dumps of what manifested as the same error.
>> > > >
>> > > > [<ffffffff817cf4b9>] schedule+0x29/0x70
>> > > > [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
>> > > > [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
>> > > > [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
>> > > > [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
>> > > > [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
>> > > > [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
>> > > > [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
>> > > > [<ffffffff81095e29>] kthread+0xc9/0xe0
>> > > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
>> > > > [<ffffffff817d3718>] ret_from_fork+0x58/0x90
>> > > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
>> > > > INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
>> > > > Not tainted 3.19.4-031904-generic #201504131440
>> > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
>> > > message.
>> > >
>> > > That's indicative of IO completion problems, but not a crash.
>> > >
>> > > > BUG: unable to handle kernel NULL pointer dereference at (null)
>> > > > IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
>> > > ....
>> > > > [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
>> > > > [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
>> > > > [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
>> > > > [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
>> > > ....
>> > >
>> > > Again, this is indicative of a page cache issue: a page without
>> > > buffers has been passed to xfs_vm_releasepage(), which implies the
>> > > page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
>> > > page->private is null...
>> > >
>> > > Again, this is unlikely to be an XFS issue.
>> > >
>> >
>> > Sorry for my ignorance, but would this likely come from Ceph code or a
>> > hardware issue of some kind, such as a disk drive? I have reached out to
>> > RedHat and the Ceph community on that as well.
>>
>> More likely a kernel bug somewhere in the page cache or memory
>> reclaim paths. The issue is that we only notice the problem long
>> after it has occurred. i.e. when XFS goes to tear down the page it has
>> been handed, the page is already in a bad state and so it doesn't
>> really tell us anything about the cause of the problem.
>>
>> Realistically, we need a script that reproduces the problem (that
>> doesn't require a Ceph cluster) to be able to isolate the cause.
>> In the meantime, you can always try running CONFIG_XFS_WARN=y to
>> see if that catches problems earlier, and you might also want to do
>> things like turn on memory poisoning and other kernel debugging
>> options to try to isolate the cause of the issue....
>>
>
> We have been error free for almost 3 weeks now with these changes:
>
> vm.swappiness=1
> vm.min_free_kbytes=262144
>
> I wonder if this is related to us using high-speed Areca HBAs with RAM
> writeback cache and having had vm.swappiness=0 previously. Possibly the
> HBA is handing down a large chunk of IO very fast and the page cache is not
> able to handle it with swappiness=0. I will keep monitoring, but thank you very
> much for the analysis and info.
>
> Alex
>
>
>
>>
>> Cheers,
>>
>> Dave.
>> --
>> Dave Chinner
>> david@fromorbit.com
>>
>
>
* Re: Failing XFS filesystem underlying Ceph OSDs
2015-08-13 14:25 ` Alex Gorbachev
@ 2016-03-11 3:26 ` Alex Gorbachev
0 siblings, 0 replies; 11+ messages in thread
From: Alex Gorbachev @ 2016-03-11 3:26 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs@oss.sgi.com
Final update: no issues for many months when using 1 GB!
On Thursday, August 13, 2015, Alex Gorbachev <ag@iss-integration.com> wrote:
> Good morning,
>
> We have experienced one more failure like the ones originally described.
> I am assuming the vm.min_free_kbytes at 256 MB helped (only one hit: the OSD
> went down but the rest of the cluster stayed up, unlike the previous massive
> storms). So I went ahead and increased vm.min_free_kbytes to 1 GB.
>
> I do not know of any way to reproduce the problem, or what causes it.
> There is no unusual IO pattern at the time that we are aware of.
>
> Thanks,
> Alex
>
> On Wed, Jul 22, 2015 at 8:23 AM, Alex Gorbachev <ag@iss-integration.com> wrote:
>
>> Hi Dave,
>>
>> On Mon, Jul 6, 2015 at 8:35 PM, Dave Chinner <david@fromorbit.com> wrote:
>>
>>> On Mon, Jul 06, 2015 at 03:20:19PM -0400, Alex Gorbachev wrote:
>>> > On Sun, Jul 5, 2015 at 7:24 PM, Dave Chinner <david@fromorbit.com> wrote:
>>> > > On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
>>> > > > > > sysctl vm.swappiness=20 (can probably be 1 as per article)
>>> > > > > >
>>> > > > > > sysctl vm.min_free_kbytes=262144
>>> > > > >
>>> > > [...]
>>> > > >
>>> > > > We have experienced the problem in various guises with kernels 3.14, 3.19,
>>> > > > 4.1-rc2 and now 4.1, so it's not new to us, just a different error stack.
>>> > > > Below are some other stack dumps of what manifested as the same error.
>>> > > >
>>> > > > [<ffffffff817cf4b9>] schedule+0x29/0x70
>>> > > > [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
>>> > > > [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
>>> > > > [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
>>> > > > [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
>>> > > > [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
>>> > > > [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
>>> > > > [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
>>> > > > [<ffffffff81095e29>] kthread+0xc9/0xe0
>>> > > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
>>> > > > [<ffffffff817d3718>] ret_from_fork+0x58/0x90
>>> > > > [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
>>> > > > INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
>>> > > > Not tainted 3.19.4-031904-generic #201504131440
>>> > > > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
>>> > > message.
>>> > >
>>> > > That's indicative of IO completion problems, but not a crash.
>>> > >
>>> > > > BUG: unable to handle kernel NULL pointer dereference at (null)
>>> > > > IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
>>> > > ....
>>> > > > [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
>>> > > > [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
>>> > > > [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
>>> > > > [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
>>> > > ....
>>> > >
>>> > > Again, this is indicative of a page cache issue: a page without
>>> > > buffers has been passed to xfs_vm_releasepage(), which implies the
>>> > > page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
>>> > > page->private is null...
>>> > >
>>> > > Again, this is unlikely to be an XFS issue.
>>> > >
>>> >
>>> > Sorry for my ignorance, but would this likely come from Ceph code or a
>>> > hardware issue of some kind, such as a disk drive? I have reached out to
>>> > RedHat and the Ceph community on that as well.
>>>
>>> More likely a kernel bug somewhere in the page cache or memory
>>> reclaim paths. The issue is that we only notice the problem long
>>> after it has occurred. i.e. when XFS goes to tear down the page it has
>>> been handed, the page is already in a bad state and so it doesn't
>>> really tell us anything about the cause of the problem.
>>>
>>> Realistically, we need a script that reproduces the problem (that
>>> doesn't require a Ceph cluster) to be able to isolate the cause.
>>> In the meantime, you can always try running CONFIG_XFS_WARN=y to
>>> see if that catches problems earlier, and you might also want to do
>>> things like turn on memory poisoning and other kernel debugging
>>> options to try to isolate the cause of the issue....
>>>
>>
>> We have been error free for almost 3 weeks now with these changes:
>>
>> vm.swappiness=1
>> vm.min_free_kbytes=262144
>>
>> I wonder if this is related to us using high-speed Areca HBAs with RAM
>> writeback cache and having had vm.swappiness=0 previously. Possibly the
>> HBA is handing down a large chunk of IO very fast and the page cache is not
>> able to handle it with swappiness=0. I will keep monitoring, but thank you very
>> much for the analysis and info.
>>
>> Alex
>>
>>
>>
>>>
>>> Cheers,
>>>
>>> Dave.
>>> --
>>> Dave Chinner
>>> david@fromorbit.com
>>>
>>
>>
>
--
Alex Gorbachev
Storcium