* Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev @ 2015-07-03  9:07 UTC
To: xfs

Hello, we are seeing this and similar errors on multiple Supermicro nodes
running Ceph. The OS is Ubuntu 14.04.2 with kernel 4.1.

Thank you for any info and troubleshooting advice.

Alex Gorbachev

Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261899] BUG: unable to handle kernel paging request at 000000190000001c
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261923] IP: [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261941] PGD 1035954067 PUD 0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261955] Oops: 0000 [#1] SMP
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.261969] Modules linked in: xfs libcrc32c ipmi_ssif intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core lpc_ich joydev mei_me mei ioatdma wmi 8021q ipmi_si garp 8250_fintek mrp ipmi_msghandler stp llc bonding mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic usbhid hid igb ahci mpt2sas mlx4_core i2c_algo_bit libahci dca raid_class ptp scsi_transport_sas pps_core arcmsr
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262182] CPU: 10 PID: 8711 Comm: ceph-osd Not tainted 4.1.0-040100-generic #201506220235
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262197] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262215] task: ffff8800721f1420 ti: ffff880fbad54000 task.ti: ffff880fbad54000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262229] RIP: 0010:[<ffffffff8118e476>] [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262248] RSP: 0018:ffff880fbad571a8 EFLAGS: 00010246
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262258] RAX: ffff880004000158 RBX: 000000000000000e RCX: 0000000000000000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262303] RDX: ffff880004000158 RSI: ffff880fbad571c0 RDI: 0000001900000000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262347] RBP: ffff880fbad57208 R08: 00000000000000c0 R09: 00000000000000ff
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262391] R10: 0000000000000000 R11: 0000000000000220 R12: 00000000000000b6
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262435] R13: ffff880fbad57268 R14: 000000000000000a R15: ffff880fbad572d8
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262479] FS: 00007f98cb0e0700(0000) GS:ffff88103f480000(0000) knlGS:0000000000000000
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262524] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262551] CR2: 000000190000001c CR3: 0000001034f0e000 CR4: 00000000000407e0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262596] Stack:
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262618] ffff880fbad571f8 ffff880cf6076b30 ffff880bdde05da8 00000000000000e6
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262669] 0000000000000100 ffff880cf6076b28 00000000000000b5 ffff880fbad57258
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262721] ffff880fbad57258 ffff880fbad572d8 ffffffffffffffff ffff880cf6076b28
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262772] Call Trace:
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262801] [<ffffffff8119b482>] pagevec_lookup_entries+0x22/0x30
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262831] [<ffffffff8119bd84>] truncate_inode_pages_range+0xf4/0x700
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262862] [<ffffffff8119c415>] truncate_inode_pages+0x15/0x20
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262891] [<ffffffff8119c53f>] truncate_inode_pages_final+0x5f/0xa0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262949] [<ffffffffc0431c2c>] xfs_fs_evict_inode+0x3c/0xe0 [xfs]
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.262981] [<ffffffff81220558>] evict+0xb8/0x190
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263009] [<ffffffff81220671>] dispose_list+0x41/0x50
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263037] [<ffffffff8122176f>] prune_icache_sb+0x4f/0x60
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263067] [<ffffffff81208ab5>] super_cache_scan+0x155/0x1a0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263096] [<ffffffff8119d26f>] do_shrink_slab+0x13f/0x2c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263126] [<ffffffff811a22b0>] ? shrink_lruvec+0x330/0x370
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263157] [<ffffffff811b4189>] ? isolate_migratepages_block+0x299/0x5c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263188] [<ffffffff8119d558>] shrink_slab+0xd8/0x110
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263217] [<ffffffff811a25bf>] shrink_zone+0x2cf/0x300
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263246] [<ffffffff811b4d3d>] ? compact_zone+0x7d/0x4f0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263275] [<ffffffff811a2a64>] shrink_zones+0x104/0x2a0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263304] [<ffffffff811b53ad>] ? compact_zone_order+0x5d/0x70
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263336] [<ffffffff810f1666>] ? ktime_get+0x46/0xb0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263365] [<ffffffff811a2cd7>] do_try_to_free_pages+0xd7/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263396] [<ffffffff811a3017>] try_to_free_pages+0xb7/0x170
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263427] [<ffffffff8119571a>] __alloc_pages_nodemask+0x5ba/0x9c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263460] [<ffffffff811dc9bc>] alloc_pages_current+0x9c/0x110
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263492] [<ffffffff811e4f2a>] allocate_slab+0x20a/0x2e0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263522] [<ffffffff811e5031>] new_slab+0x31/0x1f0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263553] [<ffffffff817f8dd9>] __slab_alloc+0x18e/0x2a3
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263584] [<ffffffff816d7817>] ? __alloc_skb+0x87/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263614] [<ffffffff816d77e7>] ? __alloc_skb+0x57/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263643] [<ffffffff811e9b7b>] __kmalloc_node_track_caller+0xbb/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263675] [<ffffffff816d7817>] ? __alloc_skb+0x87/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263704] [<ffffffff816d737c>] __kmalloc_reserve.isra.57+0x3c/0xa0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263734] [<ffffffff816d7817>] __alloc_skb+0x87/0x2b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263766] [<ffffffff81737de1>] sk_stream_alloc_skb+0x41/0x130
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263796] [<ffffffff817388b3>] tcp_sendmsg+0x2d3/0xa90
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263827] [<ffffffff81764477>] inet_sendmsg+0x67/0xa0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263858] [<ffffffff816cea54>] ? copy_msghdr_from_user+0x154/0x1b0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263891] [<ffffffff816cdcfd>] sock_sendmsg+0x4d/0x60
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263920] [<ffffffff816cef93>] ___sys_sendmsg+0x2b3/0x2c0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263950] [<ffffffff810a853c>] ? ttwu_do_wakeup+0x2c/0x100
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.263979] [<ffffffff810a8826>] ? ttwu_do_activate.constprop.121+0x66/0x70
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264011] [<ffffffff810abef5>] ? try_to_wake_up+0x215/0x2a0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264040] [<ffffffff810abfb0>] ? wake_up_state+0x10/0x20
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264071] [<ffffffff810fce86>] ? wake_futex+0x76/0xb0
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264099] [<ffffffff810fe192>] ? futex_wake+0x72/0x140
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264127] [<ffffffff81222675>] ? __fget_light+0x25/0x70
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264155] [<ffffffff816cf9b9>] __sys_sendmsg+0x49/0x90
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264184] [<ffffffff816cfa19>] SyS_sendmsg+0x19/0x20
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264215] [<ffffffff8180d272>] system_call_fastpath+0x16/0x75
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264243] Code: 00 4c 89 65 c0 31 d2 e9 86 00 00 00 66 0f 1f 84 00 00 00 00 00 48 8b 3a 48 85 ff 0f 84 ad 00 00 00 40 f6 c7 03 0f 85 a9 00 00 00 <8b> 4f 1c 85 c9 74 e3 8d 71 01 4c 8d 47 1c 89 c8 f0 0f b1 77 1c
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264467] RIP [<ffffffff8118e476>] find_get_entries+0x66/0x160
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264499] RSP <ffff880fbad571a8>
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264522] CR2: 000000190000001c
Jul 3 03:42:06 roc-4r-sca020 kernel: [554036.264824] ---[ end trace ae271fe24c8d817e ]---
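
(For anyone triaging a raw oops like the one above: the kernel source tree
ships scripts/decode_stacktrace.sh, which resolves the bracketed addresses
to source file:line form when debug symbols are available. A minimal sketch;
the file name oops.txt, the vmlinux path, and the modules path are
illustrative placeholders, not details from this thread:)

  # usage: decode_stacktrace.sh <vmlinux> <base path> <modules path>,
  # reading the trace on stdin
  cd ~/src/linux    # kernel source matching the running build (assumed)
  ./scripts/decode_stacktrace.sh \
      /usr/lib/debug/boot/vmlinux-4.1.0-040100-generic \
      . /lib/modules/4.1.0-040100-generic/kernel < oops.txt
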
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Dave Chinner @ 2015-07-03 23:51 UTC
To: Alex Gorbachev; +Cc: xfs

On Fri, Jul 03, 2015 at 05:07:29AM -0400, Alex Gorbachev wrote:
> Hello, we are seeing this and similar errors on multiple Supermicro nodes
> running Ceph. The OS is Ubuntu 14.04.2 with kernel 4.1.
>
> Thank you for any info and troubleshooting advice.

Nothing to suggest that this is an XFS problem. Memory reclaim
triggered by network stack memory pressure is causing inode
eviction. While removing the page cache it's falling over in
the generic truncate code doing a radix tree lookup. That's all
generic code - XFS never touches the page cache radix tree directly.

I haven't seen this before - is this a new problem since you
upgraded your kernel to 4.1? Is it repeatable? If yes to both, then
a bisect may be in order to isolate the problematic commit...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
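
(The bisect workflow Dave suggests, sketched for anyone unfamiliar with it;
the good/bad version tags below are illustrative assumptions, not results
from this thread:)

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
  cd linux
  git bisect start
  git bisect bad v4.1        # first version seen crashing (assumed)
  git bisect good v3.13      # a version believed good (assumed)
  # For each commit git checks out: build, boot, and run the workload
  # until it either crashes or convincingly survives, then report:
  git bisect good            # or: git bisect bad
  # Repeat until git names the first bad commit, then: git bisect reset
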
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev @ 2015-07-04 14:46 UTC
To: Dave Chinner; +Cc: xfs

Hello Dave, thank you for the response. I got some recommendations on the
ceph-users list that essentially pointed to a problem with
vm.swappiness=0 and its new behavior, described here:

https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/

Basically, setting it to 0 creates these OOM conditions because nothing is
ever swapped out. So I changed these settings right away:

sysctl vm.swappiness=20 (can probably be 1 as per the article)

sysctl vm.min_free_kbytes=262144

So far no issues, but I need to wait a week to see if anything shows up.
Thank you for reviewing the error traces.

Alex

On Fri, Jul 3, 2015 at 7:51 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Jul 03, 2015 at 05:07:29AM -0400, Alex Gorbachev wrote:
> > Hello, we are seeing this and similar errors on multiple Supermicro
> > nodes running Ceph. The OS is Ubuntu 14.04.2 with kernel 4.1.
> >
> > Thank you for any info and troubleshooting advice.
>
> Nothing to suggest that this is an XFS problem. Memory reclaim
> triggered by network stack memory pressure is causing inode
> eviction. While removing the page cache it's falling over in
> the generic truncate code doing a radix tree lookup. That's all
> generic code - XFS never touches the page cache radix tree directly.
>
> I haven't seen this before - is this a new problem since you
> upgraded your kernel to 4.1? Is it repeatable? If yes to both, then
> a bisect may be in order to isolate the problematic commit...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
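
(A minimal way to apply and persist the two settings above; the
/etc/sysctl.d file name is an assumption for illustration, not from the
thread:)

  # apply immediately
  sysctl vm.swappiness=20
  sysctl vm.min_free_kbytes=262144

  # persist across reboots (file name is illustrative)
  printf 'vm.swappiness = 20\nvm.min_free_kbytes = 262144\n' \
      > /etc/sysctl.d/60-ceph-osd-vm.conf
  sysctl --system   # reload all sysctl configuration files
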
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Dave Chinner @ 2015-07-04 23:38 UTC
To: Alex Gorbachev; +Cc: xfs

On Sat, Jul 04, 2015 at 10:46:24AM -0400, Alex Gorbachev wrote:
> Hello Dave, thank you for the response. I got some recommendations on the
> ceph-users list that essentially pointed to a problem with
> vm.swappiness=0 and its new behavior, described here:
> https://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/
>
> Basically, setting it to 0 creates these OOM conditions because nothing
> is ever swapped out. So I changed these settings right away:
>
> sysctl vm.swappiness=20 (can probably be 1 as per the article)
>
> sysctl vm.min_free_kbytes=262144

That's not an explanation for what looks to be page cache radix
tree corruption. Memory reclaim still occurs with the settings you
have now and, well, those swappiness changes went in back in 3.5 -
some 3 years ago - so they're not really an explanation for a
problem with a recent 4.1 kernel...

> So far no issues, but I need to wait a week to see if anything shows up.
> Thank you for reviewing the error traces.

I expect that you'll see the problems again...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev @ 2015-07-05  4:25 UTC
To: Dave Chinner; +Cc: xfs

Hi Dave,

> > sysctl vm.swappiness=20 (can probably be 1 as per the article)
> >
> > sysctl vm.min_free_kbytes=262144
>
> That's not an explanation for what looks to be page cache radix
> tree corruption. Memory reclaim still occurs with the settings you
> have now and, well, those swappiness changes went in back in 3.5 -
> some 3 years ago - so they're not really an explanation for a
> problem with a recent 4.1 kernel...
>
> > So far no issues, but I need to wait a week to see if anything shows up.
> > Thank you for reviewing the error traces.
>
> I expect that you'll see the problems again...

We have experienced the problem in various guises with kernels 3.14, 3.19,
4.1-rc2 and now 4.1, so it's not new to us, just a different error stack
each time. Below are some other stack dumps of what manifested as the same
error. Do you think we need to look at RAM handling by this Supermicro
machine type?

Best regards,
Alex

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

Apr 30 15:26:04 roc-4r-sca025 kernel: [1070485.999993] [<ffffffff817cf4b9>] schedule+0x29/0x70
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000010] [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000014] [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000030] [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000047] [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000050] [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000067] [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000083] [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000086] [<ffffffff81095e29>] kthread+0xc9/0xe0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000089] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000093] [<ffffffff817d3718>] ret_from_fork+0x58/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000096] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000100] INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000113] Not tainted 3.19.4-031904-generic #201504131440
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000125] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000140] xfsaild/sdg1 D ffff8810102dfcd8 0 2606 2 0x00000000
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000143] ffff8810102dfcd8 ffff8810344c2d80 ffff8810102dffd8 0000000000013e40
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000146] ffff881033383b80 ffff8810389589d0 ffff881033491d70 ffff8810102dfcf8
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000149] ffff8810344c3100 ffff881033491d70 ffff881032acd128 ffff88107fc75100
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000152] Call Trace:
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000155] [<ffffffff817cf4b9>] schedule+0x29/0x70
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000172] [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000175] [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000192] [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000208] [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000211] [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000228] [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000244] [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000247] [<ffffffff81095e29>] kthread+0xc9/0xe0
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000251] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000254] [<ffffffff817d3718>] ret_from_fork+0x58/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000257] [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000261] INFO: task xfsaild/sdd1:2707 blocked for more than 120 seconds.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000273] Not tainted 3.19.4-031904-generic #201504131440
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000285] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000300] xfsaild/sdd1 D ffff88100f813cd8 0 2707 2 0x00000000
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000303] ffff88100f813cd8 ffff88102b1d9480 ffff88100f813fd8 0000000000013e40
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000306] ffff881033387700 ffff881038959d70 ffff8810351a09d0 ffff88100f813cf8
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000310] ffff88102b1d8800 ffff8810351a09d0 ffff8810329acd28 ffff88107fcb5100
Apr 30 15:26:04 roc-4r-sca025 kernel: [1070486.000313] Call Trace:

Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414753] BUG: unable to handle kernel NULL pointer dereference at (null)
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414779] IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414823] PGD 1013574067 PUD 100ffcb067 PMD 0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414842] Oops: 0000 [#1] SMP
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.414856] Modules linked in: xfs libcrc32c intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core joydev mei_me mei lpc_ich ioatdma ipmi_si 8250_fintek ipmi_msghandler 8021q garp mrp stp llc wmi bonding mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic usbhid mpt2sas ahci igb hid libahci i2c_algo_bit mlx4_core dca raid_class ptp scsi_transport_sas pps_core arcmsr
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415068] CPU: 0 PID: 687310 Comm: ceph-osd Not tainted 4.1.0-040100rc2-generic #201505032335
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415084] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415102] task: ffff88100f33eeb0 ti: ffff880065cb0000 task.ti: ffff880065cb0000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415116] RIP: 0010:[<ffffffffc04be80f>] [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415155] RSP: 0018:ffff880065cb32c8 EFLAGS: 00010286
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415165] RAX: 0000000000000000 RBX: ffff880065cb37b0 RCX: 0000000000000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415210] RDX: ffff880065cb32f4 RSI: ffff880065cb32f0 RDI: ffff880004000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415254] RBP: ffff880065cb32c8 R08: ffff880b687bf478 R09: ffff880065cb3270
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415297] R10: dead000000100100 R11: dead000000200200 R12: ffffea000ccf9840
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415341] R13: ffffea000ccf9860 R14: ffffea000ccf9840 R15: ffff880ab6ce39d0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415386] FS: 00007ff4be95c700(0000) GS:ffff88103f200000(0000) knlGS:0000000000000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415432] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415459] CR2: 0000000000000000 CR3: 000000007942d000 CR4: 00000000000407f0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415502] Stack:
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415523] ffff880065cb3328 ffffffffc04be880 0000000000000000 ffffea000ccf9840
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415574] ffff880065cb3498 0000000000000000 ffffea000ccf9840 ffff880065cb37b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415626] ffff880065cb3498 ffffea000ccf9860 ffffea000ccf9840 ffff880ab6ce3b28
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415677] Call Trace:
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415718] [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415751] [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415782] [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415812] [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415843] [<ffffffff8119bf2d>] ? get_lru_size+0x3d/0x40
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415872] [<ffffffff811a0fb4>] shrink_lruvec+0x234/0x370
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415901] [<ffffffff81198e25>] ? pagevec_lru_move_fn+0xf5/0x110
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415931] [<ffffffff811a11db>] shrink_zone+0xeb/0x300
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415961] [<ffffffff811b38fd>] ? compact_zone+0x7d/0x4f0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.415990] [<ffffffff811a1864>] shrink_zones+0x104/0x2a0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416019] [<ffffffff811b3f6d>] ? compact_zone_order+0x5d/0x70
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416051] [<ffffffff810f0666>] ? ktime_get+0x46/0xb0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416079] [<ffffffff811a1ad7>] do_try_to_free_pages+0xd7/0x160
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416110] [<ffffffff811a1e17>] try_to_free_pages+0xb7/0x170
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416139] [<ffffffff8119450a>] __alloc_pages_nodemask+0x5ba/0x9c0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416172] [<ffffffff811db51c>] alloc_pages_current+0x9c/0x110
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416203] [<ffffffff811e3a8a>] allocate_slab+0x20a/0x2e0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416232] [<ffffffff811e3b91>] new_slab+0x31/0x1f0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416261] [<ffffffff817ef6a4>] __slab_alloc+0x18e/0x2a3
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416290] [<ffffffff816cea97>] ? __alloc_skb+0x87/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416318] [<ffffffff816cea67>] ? __alloc_skb+0x57/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416348] [<ffffffff811e8687>] __kmalloc_node_track_caller+0xb7/0x290
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416379] [<ffffffff816cea97>] ? __alloc_skb+0x87/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416407] [<ffffffff816ce5fc>] __kmalloc_reserve.isra.57+0x3c/0xa0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416437] [<ffffffff816cea97>] __alloc_skb+0x87/0x2b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416467] [<ffffffff8172ee91>] sk_stream_alloc_skb+0x41/0x130
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416496] [<ffffffff8172f963>] tcp_sendmsg+0x2d3/0xa90
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416526] [<ffffffff8175b467>] inet_sendmsg+0x67/0xa0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416555] [<ffffffff816c5cb4>] ? copy_msghdr_from_user+0x154/0x1b0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416585] [<ffffffff816c4f5d>] sock_sendmsg+0x4d/0x60
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416613] [<ffffffff816c61f3>] ___sys_sendmsg+0x2b3/0x2c0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416644] [<ffffffff810fcf67>] ? futex_wait+0x1c7/0x2c0
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416673] [<ffffffff810fd122>] ? futex_wake+0x72/0x140
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416703] [<ffffffff81221185>] ? __fget_light+0x25/0x70
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416731] [<ffffffff816c6c19>] __sys_sendmsg+0x49/0x90
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416760] [<ffffffff816c6c79>] SyS_sendmsg+0x19/0x20
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416788] [<ffffffff81803bb2>] system_call_fastpath+0x16/0x75
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.416816] Code: 07 48 89 e5 f6 c4 08 74 43 48 8b 7f 30 48 89 f8 eb 19 66 2e 0f 1f 84 00 00 00 00 00 c7 02 01 00 00 00 48 8b 40 08 48 39 c7 74 1f <48> 8b 08 80 e5 10 75 e9 48 8b 08 80 e5 02 74 e7 c7 06 01 00 00
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417040] RIP [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417094] RSP <ffff880065cb32c8>
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417118] CR2: 0000000000000000
Jun 26 07:02:56 roc-4r-sca015 kernel: [462437.417402] ---[ end trace 71135a48b0dfd313 ]---

Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759401] BUG: unable to handle kernel NULL pointer dereference at (null)
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759457] IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759519] PGD 0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759543] Oops: 0000 [#2] SMP
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759571] Modules linked in: xfs libcrc32c intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sb_edac edac_core joydev mei_me mei lpc_ich ioatdma ipmi_si 8250_fintek ipmi_msghandler 8021q garp mrp stp llc wmi bonding mac_hid lp parport mlx4_en vxlan ip6_udp_tunnel udp_tunnel hid_generic usbhid mpt2sas ahci igb hid libahci i2c_algo_bit mlx4_core dca raid_class ptp scsi_transport_sas pps_core arcmsr
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759889] CPU: 7 PID: 110 Comm: kswapd0 Tainted: G D 4.1.0-040100rc2-generic #201505032335
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759939] Hardware name: Supermicro X9DRD-7LN4F(-JBOD)/X9DRD-EF/X9DRD-7LN4F, BIOS 3.0a 12/05/2013
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.759987] task: ffff88103712bc60 ti: ffff8810331bc000 task.ti: ffff8810331bc000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760032] RIP: 0010:[<ffffffffc04be80f>] [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760100] RSP: 0018:ffff8810331bf818 EFLAGS: 00010282
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760127] RAX: 0000000000000000 RBX: ffffea000cd08d80 RCX: 0000000000000000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760171] RDX: ffff8810331bf844 RSI: ffff8810331bf840 RDI: ffff880004000068
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760216] RBP: ffff8810331bf818 R08: ffff880b687bf478 R09: ffff8810331bf850
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760260] R10: ffff880b687bf518 R11: 0000000000000220 R12: ffffea000cd08d80
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760304] R13: 0000000000000000 R14: ffff880ab6ce3b28 R15: ffff880ab6ce39d0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760348] FS: 0000000000000000(0000) GS:ffff88103f3c0000(0000) knlGS:0000000000000000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760394] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760422] CR2: 0000000000000000 CR3: 0000000001e0f000 CR4: 00000000000407e0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760466] Stack:
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760487] ffff8810331bf878 ffffffffc04be880 ffff880ab6ce3b40 ffff880ab6ce3b28
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760539] ffff8810331bf888 0000000000000000 ffff880b687bf478 ffffea000cd08d80
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760591] ffff880ab6ce3b28 0000000000000000 ffff880ab6ce3b28 000000000000000b
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760643] Call Trace:
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760686] [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760720] [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760752] [<ffffffff8119b431>] invalidate_inode_page+0x71/0x90
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760783] [<ffffffff8119b528>] invalidate_mapping_pages+0xd8/0x1e0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760815] [<ffffffff8121ff22>] inode_lru_isolate+0x122/0x1b0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760846] [<ffffffff811b54b1>] __list_lru_walk_one.isra.6+0x81/0x150
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760877] [<ffffffff8121fe00>] ? insert_inode_locked+0x190/0x190
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760907] [<ffffffff8121fe00>] ? insert_inode_locked+0x190/0x190
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760938] [<ffffffff811b566a>] list_lru_walk_one+0x4a/0x60
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760966] [<ffffffff81220274>] prune_icache_sb+0x44/0x60
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.760996] [<ffffffff81207505>] super_cache_scan+0x155/0x1a0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761026] [<ffffffff8119c06f>] do_shrink_slab+0x13f/0x2c0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761056] [<ffffffff811a10b0>] ? shrink_lruvec+0x330/0x370
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761086] [<ffffffff810aeb95>] ? sched_clock_cpu+0x85/0xc0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761115] [<ffffffff8119c358>] shrink_slab+0xd8/0x110
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761143] [<ffffffff811a13bf>] shrink_zone+0x2cf/0x300
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761173] [<ffffffff811aa500>] ? zone_statistics+0xa0/0xc0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761203] [<ffffffff811a16e0>] kswapd_shrink_zone+0xe0/0x160
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761232] [<ffffffff811a2323>] balance_pgdat+0x2c3/0x430
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761261] [<ffffffff811a25c3>] kswapd+0x133/0x250
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761289] [<ffffffff811a2490>] ? balance_pgdat+0x430/0x430
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761322] [<ffffffff8109cdc9>] kthread+0xc9/0xe0
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761350] [<ffffffff8109cd00>] ? flush_kthread_worker+0x90/0x90
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761381] [<ffffffff81803fe2>] ret_from_fork+0x42/0x70
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761410] [<ffffffff8109cd00>] ? flush_kthread_worker+0x90/0x90
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761439] Code: 07 48 89 e5 f6 c4 08 74 43 48 8b 7f 30 48 89 f8 eb 19 66 2e 0f 1f 84 00 00 00 00 00 c7 02 01 00 00 00 48 8b 40 08 48 39 c7 74 1f <48> 8b 08 80 e5 10 75 e9 48 8b 08 80 e5 02 74 e7 c7 06 01 00 00
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761669] RIP [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761725] RSP <ffff8810331bf818>
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.761749] CR2: 0000000000000000
Jun 26 07:40:55 roc-4r-sca015 kernel: [464718.762039] ---[ end trace 71135a48b0dfd314 ]---
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Dave Chinner @ 2015-07-05 23:24 UTC
To: Alex Gorbachev; +Cc: xfs

[ Please turn off line wrap when pasting kernel traces ]

On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
> > > sysctl vm.swappiness=20 (can probably be 1 as per the article)
> > >
> > > sysctl vm.min_free_kbytes=262144
> >
> > That's not an explanation for what looks to be page cache radix
> > tree corruption. Memory reclaim still occurs with the settings you
> > have now and, well, those swappiness changes went in back in 3.5 -
> > some 3 years ago - so they're not really an explanation for a
> > problem with a recent 4.1 kernel...
> >
> > > So far no issues, but I need to wait a week to see if anything shows up.
> > > Thank you for reviewing the error traces.
> >
> > I expect that you'll see the problems again...
>
> We have experienced the problem in various guises with kernels 3.14, 3.19,
> 4.1-rc2 and now 4.1, so it's not new to us, just a different error stack
> each time. Below are some other stack dumps of what manifested as the same
> error.
>
> [<ffffffff817cf4b9>] schedule+0x29/0x70
> [<ffffffffc07caee7>] _xfs_log_force+0x187/0x280 [xfs]
> [<ffffffff810a4150>] ? try_to_wake_up+0x2a0/0x2a0
> [<ffffffffc07cb019>] xfs_log_force+0x39/0xc0 [xfs]
> [<ffffffffc07d6542>] xfsaild_push+0x552/0x5a0 [xfs]
> [<ffffffff817d2264>] ? schedule_timeout+0x124/0x210
> [<ffffffffc07d662f>] xfsaild+0x9f/0x140 [xfs]
> [<ffffffffc07d6590>] ? xfsaild_push+0x5a0/0x5a0 [xfs]
> [<ffffffff81095e29>] kthread+0xc9/0xe0
> [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> [<ffffffff817d3718>] ret_from_fork+0x58/0x90
> [<ffffffff81095d60>] ? flush_kthread_worker+0x90/0x90
> INFO: task xfsaild/sdg1:2606 blocked for more than 120 seconds.
> Not tainted 3.19.4-031904-generic #201504131440
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

That's indicative of IO completion problems, but not a crash.

> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
....
> [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
> [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
> [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
> [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
....

Again, this is indicative of a page cache issue: a page without
buffers has been passed to xfs_vm_releasepage(), which implies the
page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
page->private is null...

Again, this is unlikely to be an XFS issue.

> Do you think we need to look at RAM handling by this Supermicro machine
> type?

Not sure what you mean by that. Problems like this can be caused by
bad hardware, but it's unusual for a machine using ECC memory to
have undetected RAM problems...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev @ 2015-07-06 19:20 UTC
To: Dave Chinner; +Cc: xfs

Thank you Dave,

On Sun, Jul 5, 2015 at 7:24 PM, Dave Chinner <david@fromorbit.com> wrote:
> [ Please turn off line wrap when pasting kernel traces ]

Noted, sorry - I thought wrap was off, but Google must have it on by
default.

> On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
> [...]
> > BUG: unable to handle kernel NULL pointer dereference at (null)
> > IP: [<ffffffffc04be80f>] xfs_count_page_state+0x3f/0x70 [xfs]
> ....
> > [<ffffffffc04be880>] xfs_vm_releasepage+0x40/0x120 [xfs]
> > [<ffffffff8118a7d2>] try_to_release_page+0x32/0x50
> > [<ffffffff8119fe6d>] shrink_page_list+0x69d/0x720
> > [<ffffffff811a058d>] shrink_inactive_list+0x1dd/0x5d0
> ....
>
> Again, this is indicative of a page cache issue: a page without
> buffers has been passed to xfs_vm_releasepage(), which implies the
> page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
> page->private is null...
>
> Again, this is unlikely to be an XFS issue.

Sorry for my ignorance, but would this likely come from Ceph code or a
hardware issue of some kind, such as a disk drive? I have reached out to
Red Hat and the Ceph community on that as well.

Thank you,
Alex

> > Do you think we need to look at RAM handling by this Supermicro machine
> > type?
>
> Not sure what you mean by that. Problems like this can be caused by
> bad hardware, but it's unusual for a machine using ECC memory to
> have undetected RAM problems...
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Dave Chinner @ 2015-07-07  0:35 UTC
To: Alex Gorbachev; +Cc: xfs

On Mon, Jul 06, 2015 at 03:20:19PM -0400, Alex Gorbachev wrote:
> On Sun, Jul 5, 2015 at 7:24 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Sun, Jul 05, 2015 at 12:25:47AM -0400, Alex Gorbachev wrote:
> > [...]
> >
> > Again, this is indicative of a page cache issue: a page without
> > buffers has been passed to xfs_vm_releasepage(), which implies the
> > page flags are not correct, i.e. PAGE_FLAGS_PRIVATE is set but
> > page->private is null...
> >
> > Again, this is unlikely to be an XFS issue.
>
> Sorry for my ignorance, but would this likely come from Ceph code or a
> hardware issue of some kind, such as a disk drive? I have reached out to
> Red Hat and the Ceph community on that as well.

More likely a kernel bug somewhere in the page cache or memory
reclaim paths. The issue is that we only notice the problem long
after it has occurred, i.e. when XFS goes to tear down the page it has
been handed, the page is already in a bad state, so it doesn't
really tell us anything about the cause of the problem.

Realistically, we need a script that reproduces the problem (one that
doesn't require a Ceph cluster) to be able to isolate the cause.
In the mean time, you can always try running CONFIG_XFS_WARN=y to
see if that catches problems earlier, and you might also want to do
things like turn on memory poisoning and other kernel debugging
options to try to isolate the cause of the issue....

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com
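
(A sketch of how the debugging options Dave names could be switched on for
a test kernel. Only CONFIG_XFS_WARN comes from his message; the page/list
debugging options and the slub_debug boot parameter are common choices for
"memory poisoning and other kernel debugging options", assumed here for
illustration:)

  # in the kernel source tree, starting from the current .config
  scripts/config --enable XFS_WARN \
                 --enable DEBUG_PAGEALLOC \
                 --enable DEBUG_LIST
  make olddefconfig && make -j"$(nproc)" deb-pkg   # build a test kernel

  # SLUB poisoning can also be enabled without a config change by
  # booting with:  slub_debug=FZP
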
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev @ 2015-07-22 12:23 UTC
To: Dave Chinner; +Cc: xfs

Hi Dave,

On Mon, Jul 6, 2015 at 8:35 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Jul 06, 2015 at 03:20:19PM -0400, Alex Gorbachev wrote:
> [...]
>
> More likely a kernel bug somewhere in the page cache or memory
> reclaim paths. The issue is that we only notice the problem long
> after it has occurred, i.e. when XFS goes to tear down the page it has
> been handed, the page is already in a bad state, so it doesn't
> really tell us anything about the cause of the problem.
>
> Realistically, we need a script that reproduces the problem (one that
> doesn't require a Ceph cluster) to be able to isolate the cause.
> In the mean time, you can always try running CONFIG_XFS_WARN=y to
> see if that catches problems earlier, and you might also want to do
> things like turn on memory poisoning and other kernel debugging
> options to try to isolate the cause of the issue....

We have been error free for almost 3 weeks now with these changes:

vm.swappiness=1
vm.min_free_kbytes=262144

I wonder if this is related to us using high speed Areca HBAs with RAM
writeback cache and having had vm.swappiness=0 previously. Possibly the
HBA was handing down large chunks of IO very fast and the page cache was
not able to handle them with swappiness=0. I will keep monitoring, but
thank you very much for the analysis and info.

Alex

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
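
(A simple way to keep an eye on whether the settings hold and whether the
crashes recur; the commands are generic, and the kern.log path is an
assumption based on the Ubuntu systems described in this thread:)

  sysctl vm.swappiness vm.min_free_kbytes          # confirm active values
  grep -E 'MemFree|MemAvailable' /proc/meminfo     # reclaim headroom
  grep -cE 'BUG: unable to handle|Oops:' /var/log/kern.log
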
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev @ 2015-08-13 14:25 UTC
To: Dave Chinner; +Cc: xfs

Good morning,

We have experienced one more failure like the ones originally described.
I am assuming the vm.min_free_kbytes setting at 256 MB helped (only one
hit: the OSD went down, but the rest of the cluster stayed up, unlike the
previous massive storms). So I went ahead and increased vm.min_free_kbytes
to 1 GB.

I do not know of any way to reproduce the problem, or what causes it.
There is no unusual IO pattern at the time that we are aware of.

Thanks,
Alex

On Wed, Jul 22, 2015 at 8:23 AM, Alex Gorbachev <ag@iss-integration.com> wrote:
> Hi Dave,
>
> [...]
>
> We have been error free for almost 3 weeks now with these changes:
>
> vm.swappiness=1
> vm.min_free_kbytes=262144
>
> I wonder if this is related to us using high speed Areca HBAs with RAM
> writeback cache and having had vm.swappiness=0 previously. Possibly the
> HBA was handing down large chunks of IO very fast and the page cache was
> not able to handle them with swappiness=0. I will keep monitoring, but
> thank you very much for the analysis and info.
>
> Alex
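
(For reference, the 1 GB reserve described above expressed as the sysctl
takes it, in kilobytes:)

  sysctl vm.min_free_kbytes=1048576   # 1024 * 1024 kB = 1 GB
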
* Re: Failing XFS filesystem underlying Ceph OSDs
From: Alex Gorbachev @ 2016-03-11  3:26 UTC
To: Dave Chinner; +Cc: xfs@oss.sgi.com

Final update: no issues for many months when using 1 GB!

On Thursday, August 13, 2015, Alex Gorbachev <ag@iss-integration.com> wrote:
> Good morning,
>
> We have experienced one more failure like the ones originally described.
> I am assuming the vm.min_free_kbytes setting at 256 MB helped (only one
> hit: the OSD went down, but the rest of the cluster stayed up, unlike
> the previous massive storms). So I went ahead and increased
> vm.min_free_kbytes to 1 GB.
>
> I do not know of any way to reproduce the problem, or what causes it.
> There is no unusual IO pattern at the time that we are aware of.
>
> [...]

--
Alex Gorbachev
Storcium
Thread overview: 11+ messages

2015-07-03  9:07 Failing XFS filesystem underlying Ceph OSDs Alex Gorbachev
2015-07-03 23:51 ` Dave Chinner
2015-07-04 14:46 ` Alex Gorbachev
2015-07-04 23:38 ` Dave Chinner
2015-07-05  4:25 ` Alex Gorbachev
2015-07-05 23:24 ` Dave Chinner
2015-07-06 19:20 ` Alex Gorbachev
2015-07-07  0:35 ` Dave Chinner
2015-07-22 12:23 ` Alex Gorbachev
2015-08-13 14:25 ` Alex Gorbachev
2016-03-11  3:26 ` Alex Gorbachev