public inbox for linux-xfs@vger.kernel.org
* Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-22 14:14 Is kernel 3.6.1 or filestreams option toxic ? Yann Dupont
@ 2012-10-23  8:24 ` Yann Dupont
  2012-10-25 15:21   ` Yann Dupont
  0 siblings, 1 reply; 19+ messages in thread
From: Yann Dupont @ 2012-10-23  8:24 UTC (permalink / raw)
  To: xfs; +Cc: linux-kernel

On 22/10/2012 16:14, Yann Dupont wrote:

Hello. This mail is a follow-up to a message on the XFS mailing list. I had 
a hang with 3.6.1, and then damage on an XFS filesystem.

3.6.1 is not alone: I tried 3.6.2 and had another hang, with quite a 
different trace this time, so I'm not really sure the two problems are related.
Anyway, the problem is maybe not XFS itself, but just a consequence of what 
looks more like a general kernel problem.

cc: to linux-kernel
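For context, the warnings below come from the kernel's hung-task detector, and, as the messages themselves note, its behaviour is controlled through sysctl. A minimal sketch of the relevant knobs (root required; the threshold value shown is an example, not a recommendation):

```shell
# Raise the detector's threshold from the default 120 s, or write 0
# to silence the warnings entirely (the log messages quote this knob):
echo 240 > /proc/sys/kernel/hung_task_timeout_secs

# To gather more context while tasks are stuck in D state, SysRq 'w'
# dumps the stacks of all blocked tasks to the kernel log:
echo w > /proc/sysrq-trigger
```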


Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.991908] 
INFO: task ceph-osd:4409 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.991954] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.991999] 
ceph-osd        D ffff88084c049030     0 4409      1 0x00000000
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992003]  
ffff88084c048d60 0000000000000086 ffff880a1421de78 ffff880a17caa820
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992054]  
ffff880a1421dfd8 ffff880a1421dfd8 ffff880a1421dfd8 ffff88084c048d60
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992105]  
0000000003373001 ffff88084c048d60 ffff88051775cb20 ffffffffffffffff
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992156] 
Call Trace:
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992184]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992215]  
[<ffffffff812094a3>] ? call_rwsem_down_write_failed+0x13/0x20
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992248]  
[<ffffffff811b83e0>] ? cap_mmap_addr+0x50/0x50
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992275]  
[<ffffffff813c3cbc>] ? down_write+0x1c/0x1d
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992303]  
[<ffffffff810fcf74>] ? vm_mmap_pgoff+0x64/0xb0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992331]  
[<ffffffff8110d4cc>] ? sys_mmap_pgoff+0x5c/0x190
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992360]  
[<ffffffff811357f1>] ? do_sys_open+0x161/0x1e0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992387]  
[<ffffffff813c5ffd>] ? system_call_fastpath+0x1a/0x1f
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992423] 
INFO: task ceph-osd:25297 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992451] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992495] 
ceph-osd        D ffff8801bce7b1a0     0 25297      1 0x00000000
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992497]  
ffff8801bce7aed0 0000000000000086 ffff88025d903fd8 ffff880a17cab580
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992548]  
ffff88025d903fd8 ffff88025d903fd8 ffff88025d903fd8 ffff8801bce7aed0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992599]  
ffff8801bce7aed0 ffff8801bce7aed0 ffff88051775cb20 ffffffffffffffff
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992650] 
Call Trace:
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992673]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992702]  
[<ffffffff81209474>] ? call_rwsem_down_read_failed+0x14/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992732]  
[<ffffffff813c3c9e>] ? down_read+0xe/0x10
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992759]  
[<ffffffff8103129c>] ? do_page_fault+0x16c/0x460
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992787]  
[<ffffffff81305862>] ? release_sock+0xd2/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992815]  
[<ffffffff8137aceb>] ? inet_stream_connect+0x4b/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992844]  
[<ffffffff81302b55>] ? sys_connect+0xa5/0xe0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992871]  
[<ffffffff811343e3>] ? fd_install+0x33/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992898]  
[<ffffffff813c5a75>] ? page_fault+0x25/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992925] 
INFO: task ceph-osd:32469 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992953] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992996] 
ceph-osd        D ffff880556237b30     0 32469      1 0x00000000
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992999]  
ffff880556237860 0000000000000086 ffff88059fe5dfd8 ffff880a17c742e0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993050]  
ffff88059fe5dfd8 ffff88059fe5dfd8 ffff88059fe5dfd8 ffff880556237860
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993101]  
ffff880556237860 ffff880556237860 ffff88051775cb20 ffffffffffffffff
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993153] 
Call Trace:
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993175]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993204]  
[<ffffffff81209474>] ? call_rwsem_down_read_failed+0x14/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993233]  
[<ffffffff813c3c9e>] ? down_read+0xe/0x10
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993259]  
[<ffffffff8103129c>] ? do_page_fault+0x16c/0x460
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993286]  
[<ffffffff81305862>] ? release_sock+0xd2/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993314]  
[<ffffffff8137aceb>] ? inet_stream_connect+0x4b/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993342]  
[<ffffffff81302b55>] ? sys_connect+0xa5/0xe0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994484]  
[<ffffffff811343e3>] ? fd_install+0x33/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994510]  
[<ffffffff813c5a75>] ? page_fault+0x25/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994538] 
INFO: task ceph-osd:9660 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994566] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994609] 
ceph-osd        D ffff8801659f82d0     0 9660      1 0x00000000
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994612]  
ffff8801659f8000 0000000000000086 ffff88010f6bdfd8 ffff88084f0c9ac0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994662]  
ffff88010f6bdfd8 ffff88010f6bdfd8 ffff88010f6bdfd8 ffff8801659f8000
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994713]  
ffff8801659f8000 ffff8801659f8000 ffff88051775cb20 ffffffffffffffff
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994764] 
Call Trace:
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994786]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994815]  
[<ffffffff81209474>] ? call_rwsem_down_read_failed+0x14/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994844]  
[<ffffffff813c3c9e>] ? down_read+0xe/0x10
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994870]  
[<ffffffff8103129c>] ? do_page_fault+0x16c/0x460
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994898]  
[<ffffffff81305862>] ? release_sock+0xd2/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994925]  
[<ffffffff8137aceb>] ? inet_stream_connect+0x4b/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994953]  
[<ffffffff81302b55>] ? sys_connect+0xa5/0xe0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.994980]  
[<ffffffff811343e3>] ? fd_install+0x33/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995006]  
[<ffffffff813c5a75>] ? page_fault+0x25/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995037] 
INFO: task grep:7014 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995064] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995108] 
grep            D ffff8800c3f69030     0  7014 7011 0x00000000
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995110]  
ffff8800c3f68d60 0000000000000082 0000000000000000 ffff880a17ca9410
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995161]  
ffff88002dd2ffd8 ffff88002dd2ffd8 ffff88002dd2ffd8 ffff8800c3f68d60
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995212]  
0000000000000000 ffff8800c3f68d60 ffff88051775cb20 ffffffffffffffff
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995264] 
Call Trace:
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995286]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995428]  
[<ffffffff81191625>] ? proc_pid_cmdline+0xa5/0x130
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995456]  
[<ffffffff811922e0>] ? proc_info_read+0xb0/0x110
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.995484]  
[<ffffffff81136454>] ? vfs_read+0xa4/0x180
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.943923] 
INFO: task ceph-osd:4409 blocked for more than 120 seconds.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.943954] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.943999] 
ceph-osd        D ffff88084c049030     0 4409      1 0x00000000
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944003]  
ffff88084c048d60 0000000000000086 ffff880a1421de78 ffff880a17caa820
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944055]  
ffff880a1421dfd8 ffff880a1421dfd8 ffff880a1421dfd8 ffff88084c048d60
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944106]  
0000000003373001 ffff88084c048d60 ffff88051775cb20 ffffffffffffffff
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944157] 
Call Trace:
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944185]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944216]  
[<ffffffff812094a3>] ? call_rwsem_down_write_failed+0x13/0x20
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944248]  
[<ffffffff811b83e0>] ? cap_mmap_addr+0x50/0x50
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944275]  
[<ffffffff813c3cbc>] ? down_write+0x1c/0x1d
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944303]  
[<ffffffff810fcf74>] ? vm_mmap_pgoff+0x64/0xb0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944330]  
[<ffffffff8110d4cc>] ? sys_mmap_pgoff+0x5c/0x190
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944358]  
[<ffffffff811357f1>] ? do_sys_open+0x161/0x1e0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944386]  
[<ffffffff813c5ffd>] ? system_call_fastpath+0x1a/0x1f
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944423] 
INFO: task ceph-osd:25297 blocked for more than 120 seconds.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944451] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944494] 
ceph-osd        D ffff8801bce7b1a0     0 25297      1 0x00000000
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944496]  
ffff8801bce7aed0 0000000000000086 ffff88025d903fd8 ffff880a17cab580
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944548]  
ffff88025d903fd8 ffff88025d903fd8 ffff88025d903fd8 ffff8801bce7aed0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944599]  
ffff8801bce7aed0 ffff8801bce7aed0 ffff88051775cb20 ffffffffffffffff
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944650] 
Call Trace:
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944673]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944702]  
[<ffffffff81209474>] ? call_rwsem_down_read_failed+0x14/0x30
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944731]  
[<ffffffff813c3c9e>] ? down_read+0xe/0x10
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944758]  
[<ffffffff8103129c>] ? do_page_fault+0x16c/0x460
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944786]  
[<ffffffff81305862>] ? release_sock+0xd2/0x150
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944814]  
[<ffffffff8137aceb>] ? inet_stream_connect+0x4b/0x70
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944843]  
[<ffffffff81302b55>] ? sys_connect+0xa5/0xe0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944870]  
[<ffffffff811343e3>] ? fd_install+0x33/0x70
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944897]  
[<ffffffff813c5a75>] ? page_fault+0x25/0x30
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944923] 
INFO: task ceph-osd:12506 blocked for more than 120 seconds.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944951] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944994] 
ceph-osd        D ffff8800227f7480     0 12506      1 0x00000000
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.944996]  
ffff8800227f71b0 0000000000000086 0000000000000000 ffff880a17cab580
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945048]  
ffff880468df1fd8 ffff880468df1fd8 ffff880468df1fd8 ffff8800227f71b0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945099]  
0000000000000000 ffff8800227f71b0 ffff88051775cb20 ffffffffffffffff
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945150] 
Call Trace:
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945172]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945201]  
[<ffffffff81209474>] ? call_rwsem_down_read_failed+0x14/0x30
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945231]  
[<ffffffff813c3c9e>] ? down_read+0xe/0x10
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945257]  
[<ffffffff8103129c>] ? do_page_fault+0x16c/0x460
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945284]  
[<ffffffff81302fb7>] ? sys_recvfrom+0x107/0x150
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945311]  
[<ffffffff81302b55>] ? sys_connect+0xa5/0xe0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945339]  
[<ffffffff8100a465>] ? read_tsc+0x5/0x20
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945366]  
[<ffffffff810828cf>] ? ktime_get_ts+0x3f/0xe0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945394]  
[<ffffffff811489a4>] ? poll_select_set_timeout+0x64/0x80
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945422]  
[<ffffffff813c5a75>] ? page_fault+0x25/0x30
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945449] 
INFO: task ceph-osd:25459 blocked for more than 120 seconds.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945476] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945520] 
ceph-osd        D ffff8803fc809d90     0 25459      1 0x00000000
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945522]  
ffff8803fc809ac0 0000000000000086 0000000000000000 ffff880a17c74990
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945573]  
ffff880468e25fd8 ffff880468e25fd8 ffff880468e25fd8 ffff8803fc809ac0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945624]  
0000000000000000 ffff8803fc809ac0 ffff88051775cb20 ffffffffffffffff
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945675] 
Call Trace:
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945697]  
[<ffffffff813c52fd>] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945726]  
[<ffffffff81209474>] ? call_rwsem_down_read_failed+0x14/0x30
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945755]  
[<ffffffff813c3c9e>] ? down_read+0xe/0x10
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945781]  
[<ffffffff8103129c>] ? do_page_fault+0x16c/0x460
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945808]  
[<ffffffff81302fb7>] ? sys_recvfrom+0x107/0x150
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945835]  
[<ffffffff81082892>] ? ktime_get_ts+0x2/0xe0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945862]  
[<ffffffff8100a465>] ? read_tsc+0x5/0x20
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945888]  
[<ffffffff810828cf>] ? ktime_get_ts+0x3f/0xe0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945914]  
[<ffffffff811489a4>] ? poll_select_set_timeout+0x64/0x80
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945942]  
[<ffffffff813c5a75>] ? page_fault+0x25/0x30
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945969] 
INFO: task ceph-osd:32469 blocked for more than 120 seconds.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.945997] 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.946041] 
ceph-osd        D ffff880556237b30     0 32469      1 0x00000000
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.946043]  
ffff880556237860 0000000000000086 ffff88059fe5dfd8 ffff880a17c742e0
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.946096]  
ffff88059fe5dfd8 ffff88059fe5dfd8 ffff88059fe5dfd8 ffff880556237860
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.946146]  
ffff880556237860 ffff880556237860 ffff88051775cb20 ffffffffffffffff
Oct 22 20:56:29 braeval.u14.univ-nantes.prive kernel: [629696.946198] 
Call Trace:

Well, at least after the hard reset, the XFS volume was still good this time.

Old mail (sent to the xfs mailing list) for reference:

> Hello,
> Last week I encountered problems with XFS volumes on several machines. 
> The kernel hung under heavy load, and I had to hard reset. After 
> reboot, an XFS volume could not be mounted, and xfs_repair didn't 
> manage to recover the volume cleanly on 2 different machines.
>
> Just to relax things: it wasn't production data, so it doesn't matter 
> whether I recover the data or not. But it's more important to me to 
> understand why things went wrong...
>
> I've been using XFS for a long time, on lots of data, and this is the 
> first time I've encountered such a problem. But I was using an unusual 
> option, filestreams, and was running kernel 3.6.1, so I wonder whether 
> either has something to do with the crash.
>
> I have nothing very conclusive in the kernel logs, apart from this:
>
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569890] 
> INFO: task ceph-osd:17856 blocked for more than 120 seconds.
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569941] 
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569987] 
> ceph-osd        D ffff88056416b1a0     0 17856      1 0x00000000
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569993] 
> ffff88056416aed0 0000000000000086 ffff880590751fd8 ffff88000c67eb00
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570047] 
> ffff880590751fd8 ffff880590751fd8 ffff880590751fd8 ffff88056416aed0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570101] 
> 0000000000000001 ffff88056416aed0 ffff880a15240d00 ffff880a15240d60
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570156] 
> Call Trace:
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570187] 
> [<ffffffff81041335>] ? exit_mm+0x85/0x120
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570216] 
> [<ffffffff81042a94>] ? do_exit+0x154/0x8e0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570248] 
> [<ffffffff8114ec79>] ? file_update_time+0xa9/0x100
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570278] 
> [<ffffffff81043568>] ? do_group_exit+0x38/0xa0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570309] 
> [<ffffffff81051bc6>] ? get_signal_to_deliver+0x1a6/0x5e0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570341] 
> [<ffffffff8100223e>] ? do_signal+0x4e/0x970
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570371] 
> [<ffffffff81170e2e>] ? fsnotify+0x24e/0x340
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570402] 
> [<ffffffff8100c995>] ? fpu_finit+0x15/0x30
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570431] 
> [<ffffffff8100db34>] ? restore_i387_xstate+0x64/0x1c0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570464] 
> [<ffffffff8108e0d2>] ? sys_futex+0x92/0x1b0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570493] 
> [<ffffffff81002bf5>] ? do_notify_resume+0x75/0xc0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570525] 
> [<ffffffff813c60fa>] ? int_signal+0x12/0x17
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570553] 
> INFO: task ceph-osd:17857 blocked for more than 120 seconds.
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570583] 
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570628] 
> ceph-osd        D ffff8801161fe720     0 17857      1 0x00000000
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570632] 
> ffff8801161fe450 0000000000000086 ffffffffffffffe0 ffff880a17c73c30
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570687] 
> ffff88011347ffd8 ffff88011347ffd8 ffff88011347ffd8 ffff8801161fe450
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570740] 
> ffff8801161fe450 ffff8801161fe450 ffff880a15240d00 ffff880a15240d60
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570794] 
> Call Trace:
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570818] 
> [<ffffffff81041335>] ? exit_mm+0x85/0x120
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570846] 
> [<ffffffff81042a94>] ? do_exit+0x154/0x8e0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570875] 
> [<ffffffff81043568>] ? do_group_exit+0x38/0xa0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570905] 
> [<ffffffff81051bc6>] ? get_signal_to_deliver+0x1a6/0x5e0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570935] 
> [<ffffffff8100223e>] ? do_signal+0x4e/0x970
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570967] 
> [<ffffffff81302d24>] ? sys_sendto+0x114/0x150
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570996] 
> [<ffffffff8108e0d2>] ? sys_futex+0x92/0x1b0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571024] 
> [<ffffffff81002bf5>] ? do_notify_resume+0x75/0xc0
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571054] 
> [<ffffffff813c60fa>] ? int_signal+0x12/0x17
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571082] 
> INFO: task ceph-osd:17858 blocked for more than 120 seconds.
> Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571111] 
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-23  8:24 ` Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?) Yann Dupont
@ 2012-10-25 15:21   ` Yann Dupont
  2012-10-25 20:55     ` Yann Dupont
  2012-10-25 21:10     ` Dave Chinner
  0 siblings, 2 replies; 19+ messages in thread
From: Yann Dupont @ 2012-10-25 15:21 UTC (permalink / raw)
  To: xfs

On 23/10/2012 10:24, Yann Dupont wrote:
> On 22/10/2012 16:14, Yann Dupont wrote:
>
> Hello. This mail is a follow-up to a message on the XFS mailing list. 
> I had a hang with 3.6.1, and then damage on an XFS filesystem.
>
> 3.6.1 is not alone: I tried 3.6.2 and had another hang, with quite a 
> different trace this time, so I'm not really sure the two problems are 
> related.
> Anyway, the problem is maybe not XFS itself, but just a consequence of 
> what looks more like a general kernel problem.
>
> cc: to linux-kernel
Hello.
There is definitely something wrong in 3.6.xx with XFS, in particular 
after an abrupt stop of the machine:

I now have corruption on a third machine (one not involved with Ceph).
The machine was simply being rebooted from a 3.6.2 kernel to a 3.6.3 kernel.

This machine isn't under heavy load; it's a machine we use for tests 
and compilations, and we often crash it. For two years we had no problems: 
XFS was always reliable, even in hard conditions (hard resets, loss of 
power, etc.).

This time, after booting 3.6.3, one of my XFS volumes refuses to mount:

mount: /dev/mapper/LocalDisk-debug--git: can't read superblock

[276596.189363] XFS (dm-1): Mounting Filesystem
[276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
[276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
[276596.711329] XFS (dm-1): log mount/recovery failed: error 5
[276596.711516] XFS (dm-1): log mount failed

I'm not even sure whether the reboot followed a crash or was just a clean 
reboot (I'm not the only one using this machine). I see nothing suspect on 
my remote syslog.

Anyway, it's the third crashed XFS volume in a row with a 3.6 kernel. 
Different machines, different contexts. Looks suspicious.

This time the crashed volume was behind a PERC (mptsas) card; the two 
volumes previously reported were behind an Emulex LightPulse fibre 
channel card (lpfc). And this time the filestreams option wasn't used.


xfs_repair -n seems to show the volume is quite broken:

Phase 1 - find and verify superblock...
Phase 2 - using internal log
         - scan filesystem freespace and inode maps...
block (1,6197-6197) multiply claimed by bno space tree, state - 2
bad magic # 0x7f454c46 in btbno block 3/2320
expected level 0 got 513 in btbno block 3/2320
bad btree nrecs (256, min=255, max=510) in btbno block 3/2320
invalid start block 16793088 in record 0 of bno btree block 3/2320
invalid start block 0 in record 1 of bno btree block 3/2320
invalid start block 0 in record 2 of bno btree block 3/2320
invalid start block 2282029056 in record 3 of bno btree block 3/2320
invalid start block 0 in record 4 of bno btree block 3/2320
invalid length 218106368 in record 5 of bno btree block 3/2320
invalid start block 1684369509 in record 6 of bno btree block 3/2320
invalid start block 6909556 in record 7 of bno btree block 3/2320
invalid start block 1493202533 in record 8 of bno btree block 3/2320
invalid start block 1768111411 in record 9 of bno btree block 3/2320
invalid start block 761557865 in record 10 of bno btree block 3/2320
invalid start block 842084400 in record 11 of bno btree block 3/2320
...
bad magic # 0x41425442 in btcnt block 2/14832
bad btree nrecs (436, min=255, max=510) in btcnt block 2/14832
out-of-order cnt btree record 2 (188545 1) block 2/14832
out-of-order cnt btree record 3 (188650 1) block 2/14832
out-of-order cnt btree record 4 (188658 1) block 2/14832
out-of-order cnt btree record 8 (189021 1) block 2/14832
out-of-order cnt btree record 9 (189104 1) block 2/14832
out-of-order cnt btree record 10 (189127 2) block 2/14832
out-of-order cnt btree record 11 (189193 2) block 2/14832
out-of-order cnt btree record 12 (189259 2) block 2/14832
out-of-order cnt btree record 13 (189268 1) block 2/14832
out-of-order cnt btree record 14 (189307 1) block 2/14832
out-of-order cnt btree record 15 (189330 1) block 2/14832
out-of-order cnt btree record 16 (189379 1) block 2/14832
out-of-order cnt btree record 18 (189477 1) block 2/14832


I won't try to repair this volume right now.

This time the volume is small enough to make an image (it's a 100 GB LVM 
volume), so I'll image it before doing anything else.
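As a sketch of that imaging step (the device path is the one from this report; the destination paths are hypothetical, and the guard makes the snippet a no-op where the device doesn't exist):

```shell
SRC=/dev/mapper/LocalDisk-debug--git   # the unmountable volume
DST=/backup/debug-git.img              # hypothetical; needs >= 100 GB free

if [ -e "$SRC" ]; then
    # Raw block-for-block copy; conv=noerror,sync keeps going past bad
    # reads, padding unreadable blocks instead of aborting.
    dd if="$SRC" of="$DST" bs=4M conv=noerror,sync

    # Metadata-only dump (much smaller), which is usually what XFS
    # developers ask for; -o disables obfuscation of file names.
    xfs_metadump -o "$SRC" /backup/debug-git.metadump
fi
```

Repair experiments can then be run against a copy of the image via a loop device, leaving the original volume untouched.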

1st question: I saw that ext4 corruption has been reported with 3.6 
kernels too, but as far as I can see that problem seems to be 
jbd-related, so it shouldn't affect XFS?

2nd question: am I the only one seeing this? I saw problems reported 
with 2.6.37, but here the kernel is 3.6.xx.

3rd question: if you suspect the problem may lie in XFS, what should I 
supply to help debug it?

I'm not Cc:ing the linux-kernel list right now, as I'm really not sure 
where the problem is.

Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr



* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-25 15:21   ` Yann Dupont
@ 2012-10-25 20:55     ` Yann Dupont
  2012-10-25 21:10     ` Dave Chinner
  1 sibling, 0 replies; 19+ messages in thread
From: Yann Dupont @ 2012-10-25 20:55 UTC (permalink / raw)
  To: xfs

On 25/10/2012 17:21, Yann Dupont wrote:
> Hello.
> There is definitely something wrong in 3.6.xx with XFS, in particular 
> after an abrupt stop of the machine:
>
> I now have corruption on a third machine (one not involved with Ceph).
> The machine was simply being rebooted from a 3.6.2 kernel to a 3.6.3 
> kernel.
>
> This machine isn't under heavy load; it's a machine we use for tests 
> and compilations, and we often crash it. For two years we had no 
> problems: XFS was always reliable, even in hard conditions (hard 
> resets, loss of power, etc.).
>
> This time, after booting 3.6.3, one of my XFS volumes refuses to mount:
>
> mount: /dev/mapper/LocalDisk-debug--git: can't read superblock
>
> [276596.189363] XFS (dm-1): Mounting Filesystem
> [276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
> [276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
> [276596.711329] XFS (dm-1): log mount/recovery failed: error 5
> [276596.711516] XFS (dm-1): log mount failed
>

I just found something interesting:

I rebooted into 3.4.15 to make a backup of this volume. As I said in my 
previous message, I had not run xfs_repair on it.
Before rebooting, I forgot to edit fstab to prevent the mount.
To my surprise, under 3.4.15 the volume mounted like a charm!

[   37.958374] XFS (dm-1): Mounting Filesystem
[   38.050374] XFS (dm-1): Starting recovery (logdev: internal)
[   69.596892] XFS (dm-1): Ending recovery (logdev: internal)

As far as I can tell, there is no corruption and no problems; all my files 
are there!

So far, here is the scenario:

1) You hard reset your machine while running 3.6 (maybe the kernel version 
isn't important here). As I encountered other 3.6 bugs (exit_mm and 
rwsem_down_failed_common), I had to do that.

So the XFS log is not clean.

2) You boot with 3.6.xx.
Mounting the volume fails, because log replay fails for an unknown reason.

3) You think your FS is broken, so you run xfs_repair, which is 
somehow fooled and definitively breaks your filesystem.
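Since log replay under the older kernel recovered everything, a safer triage order than reaching straight for xfs_repair would be: image the volume, save the raw log, dry-run the repair, and only then consider anything destructive. A sketch of that order (device and backup paths are illustrative; the `run` helper only echoes the commands instead of executing them):

```shell
# Safe triage order before any destructive repair. All names are illustrative.
run() { echo "+ $*"; }   # swap the body for "$@" to actually execute

run dd if=/dev/mapper/LocalDisk-debug--git of=/backup/debug-git.img bs=1M  # image first
run xfs_logprint -C log.raw /dev/mapper/LocalDisk-debug--git               # keep the raw log
run xfs_repair -n /dev/mapper/LocalDisk-debug--git                         # dry run, no writes
# Only if replay keeps failing on every kernel: xfs_repair -L (zeroes the log; lossy)
```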

I hope it's reproducible. I will try tomorrow morning.

Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr



* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-25 15:21   ` Yann Dupont
  2012-10-25 20:55     ` Yann Dupont
@ 2012-10-25 21:10     ` Dave Chinner
  2012-10-26 10:03       ` Yann Dupont
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2012-10-25 21:10 UTC (permalink / raw)
  To: Yann Dupont; +Cc: xfs

On Thu, Oct 25, 2012 at 05:21:35PM +0200, Yann Dupont wrote:
> Le 23/10/2012 10:24, Yann Dupont a écrit :
> >Le 22/10/2012 16:14, Yann Dupont a écrit :
> >
> >Hello. This mail is a follow up of a message on XFS mailing list.
> >I had hang with 3.6.1, and then , damage on XFS filesystem.
> >
> >3.6.1 is not alone. Tried 3.6.2, and had another hang with quite a
> >different trace this time , so not really sure the 2 problems are
> >related .
> >Anyway the problem is maybe not XFS, but is just a consequence of
> >what seems more like kernel problems.
> >
> >cc: to linux-kernel
> Hello.
> There is definitively something wrong in 3.6.xx with XFS, in
> particular after an abrupt stop of the machine :
> 
> I now have corruption on a 3rd machine (not involved with ceph).
> The machine was just rebooting from 3.6.2 kernel to 3.6.3 kernel.
> 
> This machine isn't under heavy load, but it's a machine we use for
> tests & compilations. We often crash it. For 2 years, we didn't have
> problems. XFS always was reliable, even in hard conditions (hard
> reset, loss of power, etc)
> 
> This time, after 3.6.3 boot, one of my xfs volume refuse to mount :
> 
> mount: /dev/mapper/LocalDisk-debug--git: can't read superblock
> 
> 276596.189363] XFS (dm-1): Mounting Filesystem
> [276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
> [276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
> [276596.711329] XFS (dm-1): log mount/recovery failed: error 5
> [276596.711516] XFS (dm-1): log mount failed

That's an indication that zeros are being read from the journal
rather than valid transaction data. It may well be caused by an XFS
bug, but from experience it is equally likely to be a lower layer
storage problem. More information is needed.
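A crude way to see whether a raw log dump is mostly zeros is to scan it sector by sector. A generic sketch (the `log.raw` file fabricated below stands in for a real dump, and the 512-byte sector size is an assumption):

```shell
# Fabricate a stand-in dump: one sector with data, one fully zeroed.
printf 'FEEDBABE-data' > log.raw
truncate -s 512 log.raw                                # pad sector 0 to 512 bytes
dd if=/dev/zero bs=512 count=1 status=none >> log.raw  # sector 1: all zeros

nsect=$(( $(stat -c %s log.raw) / 512 ))
for i in $(seq 0 $((nsect - 1))); do
    # count non-NUL bytes in this sector; 0 means the sector is all zeros
    nz=$(dd if=log.raw bs=512 skip=$i count=1 status=none | tr -d '\0' | wc -c)
    if [ "$nz" -eq 0 ]; then
        echo "sector $i: all zeros"
    else
        echo "sector $i: contains data"
    fi
done
```

On a real dump, long runs of "all zeros" sectors would point at the storage layer rather than XFS.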

Firstly:

http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

Secondly, is the system still in this state? If so, dump the log to
a file using xfs_logprint, zip it up and send it to me so I can have
a look at whether the log is intact (i.e. likely an XFS bug) or contains
zeros (likely a storage bug).

If the system is not still in this state, then I'm afraid there's
nothing that can be done to understand the problem.

> I'm not even sure the reboot was after a crash or just a clean
> reboot. (I'm not the only one to use this machine). I have nothing
> suspect on my remote syslog.
> 
> Anyway, it's the 3rd XFS crashed volume in a row with 3.6 kernel.
> Different machines, different contexts. Looks suspicious.

You've had two machines crash with problems in the mm subsystem, and
one filesystem problem that might be hardware related. Bit early to
be blaming XFS for all your problems, I think....

> xfs_repair -n seems to show volume is quite broken :

Sure, if the log hasn't been replayed then it will be - the
filesystem will only be consistent after log recovery has been run.

> I won't try to repair this volume right now.
> 
> This time, volume is small enough to make an image (it's a 100 GB
> lvm volume). I'll try to image it before making anything else.
> 
> 1st question : I saw there is ext4 corruption reported too with 3.6
> kernel, but as far as I can see, problem seems to be jbd related, so
> it shouldn't affect xfs ?

No relationship at all.

> 2nd question : Am I the only one to see this ?? I saw problems
> reported with 2.6.37, but here, the kernel is 3.6.xx

Yes, you're the only one to report such problems on 3.6. Anything
reported on 2.6.37 is likely to be completely unrelated.

> 3rd question : If you suspect the problem may be lying in XFS , what
> should I supply to help debugging the problem ?

See above.

> Not CC:ing linux kernel list right now, as I'm really not sure where
> the problem is right now.

You should report the mm problems to linux-mm@kvack.org to make sure
the right people see them and they don't get lost in the noise of
lkml....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-25 21:10     ` Dave Chinner
@ 2012-10-26 10:03       ` Yann Dupont
  2012-10-26 22:05         ` Yann Dupont
  0 siblings, 1 reply; 19+ messages in thread
From: Yann Dupont @ 2012-10-26 10:03 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Le 25/10/2012 23:10, Dave Chinner a écrit :
>
> This time, after 3.6.3 boot, one of my xfs volume refuse to mount :
>
> mount: /dev/mapper/LocalDisk-debug--git: can't read superblock
>
> 276596.189363] XFS (dm-1): Mounting Filesystem
> [276596.270614] XFS (dm-1): Starting recovery (logdev: internal)
> [276596.711295] XFS (dm-1): xlog_recover_process_data: bad clientid 0x0
> [276596.711329] XFS (dm-1): log mount/recovery failed: error 5
> [276596.711516] XFS (dm-1): log mount failed
> That's an indication that zeros are being read from the journal
> rather than valid transaction data. It may well be caused by an XFS
> bug, but from experience it is equally likely to be a lower layer
> storage problem. More information is needed.

Hello Dave, did you see the next mail? The fact is that with 3.4.15, the 
journal is OK and the data is, in fact, intact.

> Firstly:
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F

OK, sorry, I missed it: here is the information. Not sure all of it is 
relevant; anyway, here we go.
Each time I will distinguish between the first reported crashes (the ceph 
nodes) and the last one, as the setup is quite different.

--------

kernel version (uname -a): 3.6.1 then 3.6.2, vanilla, hand-compiled, no 
proprietary modules. Not running it at the moment, so I can't give you the 
exact uname -a.

------------
xfs_repair version 3.1.7 on the third machine,
xfs_repair version 3.1.4 on the first two machines (part of ceph)
-----------
cpu: the same for the 3 machines: Dell PowerEdge M610,
2x Intel(R) Xeon(R) CPU E5649 @ 2.53GHz, hyper-threading 
enabled (12 physical cores, 24 virtual cores)

-------------
meminfo :
for example, on the 3rd machine :

MemTotal:       41198292 kB
MemFree:        28623116 kB
Buffers:            1056 kB
Cached:         10392452 kB
SwapCached:            0 kB
Active:           180528 kB
Inactive:       10227416 kB
Active(anon):      17476 kB
Inactive(anon):      180 kB
Active(file):     163052 kB
Inactive(file): 10227236 kB
Unevictable:        3744 kB
Mlocked:            3744 kB
SwapTotal:        506040 kB
SwapFree:         506040 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         18228 kB
Mapped:            12688 kB
Shmem:               300 kB
Slab:            1408204 kB
SReclaimable:    1281008 kB
SUnreclaim:       127196 kB
KernelStack:        1976 kB
PageTables:         2736 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    21105184 kB
Committed_AS:     136080 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      398608 kB
VmallocChunk:   34337979376 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        7652 kB
DirectMap2M:     2076672 kB
DirectMap1G:    39845888 kB

----
/proc/mounts:

root@label5:~# cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=20592788k,nr_inodes=5148197,mode=755 0 0
devpts /dev/pts devpts 
rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=4119832k,mode=755 0 0
/dev/mapper/LocalDisk-root / xfs rw,relatime,attr2,noquota 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,relatime,size=8239660k 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /run/shm tmpfs rw,nosuid,nodev,relatime,size=8239660k 0 0
/dev/sda1 /boot ext2 rw,relatime,errors=continue 0 0
** /dev/mapper/LocalDisk-debug--git /mnt/debug-git xfs 
rw,relatime,attr2,noquota 0 0 ** this is the one that failed on 3.6.xx
configfs /sys/kernel/config configfs rw,relatime 0 0
ocfs2_dlmfs /dlm ocfs2_dlmfs rw,relatime 0 0
rpc_pipefs /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0

This volume is on a local RAID1 disk.

on one of the first 2 nodes :

root@hanyu:~# cat /proc/mounts
rootfs / rootfs rw 0 0
none /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
none /proc proc rw,nosuid,nodev,noexec,relatime 0 0
none /dev devtmpfs rw,relatime,size=20592652k,nr_inodes=5148163,mode=755 0 0
none /dev/pts devpts 
rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
/dev/disk/by-uuid/37dd603c-168c-49de-830d-ef1b5c6982f8 / xfs 
rw,relatime,attr2,noquota 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,relatime,mode=755 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,relatime 0 0
/dev/sdk1 /boot ext2 rw,relatime,errors=continue 0 0
none /var/local/cgroup cgroup 
rw,relatime,net_cls,freezer,devices,memory,cpuacct,cpu,debug,cpuset 0 0
** /dev/mapper/xceph--hanyu-data /XCEPH-PROD/data xfs 
rw,noatime,attr2,filestreams,nobarrier,inode64,logbsize=256k,noquota 0 0 
** This one was the failed volume
fusectl /sys/fs/fuse/connections fusectl rw,relatime 0 0


Please note that on this server, nobarrier is used because the volume is 
on a battery-backed fibre channel raid array.
--------------
/proc/partitions :
quite complicated on the ceph node :

root@hanyu:~#  cat /proc/partitions
major minor  #blocks  name

   11        0    1048575 sr0
    8       32 6656000000 sdc
    8       48 5063483392 sdd
    8       64 6656000000 sde
    8       80 5063483392 sdf
    8       96 6656000000 sdg
    8      112 5063483392 sdh
    8      128 6656000000 sdi
    8      144 5063483392 sdj
    8      160  292421632 sdk
    8      161     273073 sdk1
    8      162     530145 sdk2
    8      163    2369587 sdk3
    8      164  289242292 sdk4
  254        0 6656000000 dm-0
  254        1 5063483392 dm-1
  254        2    5242880 dm-2
  254        3 11676106752 dm-3


Please note that we use multipath here: 4 paths for each LUN:

root@hanyu:~# multipath -ll
mpath2 (3600d02310006674500000001414d677d) dm-1 IFT,S16F-R1840-4
size=4.7T features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=100 status=active
| |- 0:0:1:96 sdf 8:80  active ready  running
| `- 6:0:1:96 sdj 8:144 active ready  running
`-+- policy='round-robin 0' prio=20 status=enabled
   |- 0:0:0:96 sdd 8:48  active ready  running
   `- 6:0:0:96 sdh 8:112 active ready  running
mpath1 (3600d02310006674500000000414d677d) dm-0 IFT,S16F-R1840-4
size=6.2T features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=100 status=active
| |- 0:0:1:32 sde 8:64  active ready  running
| `- 6:0:1:32 sdi 8:128 active ready  running
`-+- policy='round-robin 0' prio=20 status=enabled
   |- 0:0:0:32 sdc 8:32  active ready  running
   `- 6:0:0:32 sdg 8:96  active ready  running

On the 3rd machine, the setup is much simpler:

root@label5:~# cat /proc/partitions
major minor  #blocks  name

    8        0  292421632 sda
    8        1     257008 sda1
    8        2     506047 sda2
    8        3    1261102 sda3
    8        4  140705302 sda4
  254        0    2609152 dm-0
  254        1  104857600 dm-1
  254        2   31457280 dm-2

--------------

raid layout :

On the first 2 machines (part of the ceph cluster), the data is on RAID5 on 
a fibre channel raid array, accessed by an Emulex fibre channel HBA 
(LightPulse, lpfc driver).
On the 3rd, data is on RAID1 accessed by a Dell PERC (LSI Logic / Symbios 
Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08), mptsas driver).

--------------

LVM config :
root@hanyu:~# vgs
   VG          #PV #LV #SN Attr   VSize   VFree
   LocalDisk     1   1   0 wz--n- 275,84g 270,84g
   xceph-hanyu   2   1   0 wz--n-  10,91t  41,36g

root@hanyu:~# lvs
   LV   VG          Attr   LSize  Origin Snap%  Move Log Copy% Convert
   log  LocalDisk   -wi-a- 5,00g
   data xceph-hanyu -wi-ao 10,87t

and

root@label5:~# vgs
   VG        #PV #LV #SN Attr   VSize   VFree
   LocalDisk   1   3   0 wz--n- 134,18g 1,70g

root@label5:~# lvs
   LV        VG        Attr   LSize   Origin Snap%  Move Log Copy% Convert
   1         LocalDisk -wi-a- 30,00g
   debug-git LocalDisk -wi-ao 100,00g
   root      LocalDisk -wi-ao 2,49g
root@label5:~#

-------------------

type of disks :

on the raid arrays, I'd say not very important (SEAGATE ST32000444SS 
nearline SAS 2 TB)
on the 3rd machine: TOSHIBA MBF2300RC DA06

---------------------

write cache status :

on the raid array, the write cache is enabled globally for the array 
BUT is explicitly disabled on the drives.
on the 3rd machine, it is disabled as far as I know
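For completeness, the per-drive write cache state can be queried directly; a dry-run sketch (tool flags as I recall them, so treat them as assumptions; the `run` helper only echoes rather than touching any device):

```shell
run() { echo "+ $*"; }          # swap the body for "$@" to actually execute

run hdparm -W /dev/sda          # SATA/ATA: report the drive's write-cache flag
run sdparm --get=WCE /dev/sda   # SAS/SCSI: Write Cache Enable bit in the caching mode page
```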

-------------------

Size of BBWC : 2 or 4 GB on raid arrays. None on the 3rd.


------------------
xfs_info :


root@hanyu:~# xfs_info /dev/xceph-hanyu/data
meta-data=/dev/mapper/xceph--hanyu-data isize=256    agcount=11, 
agsize=268435455 blks
          =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=2919026688, imaxpct=5
          =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=521728, version=2
          =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


(no sunit or swidth on this one)


root@label5:~# xfs_info /dev/LocalDisk/debug-git
meta-data=/dev/mapper/LocalDisk-debug--git isize=256    agcount=4, 
agsize=6553600 blks
          =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=26214400, imaxpct=25
          =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=12800, version=2
          =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

-----

dmesg: you already have the information.

For iostat, etc, I need to try to reproduce the load.




> Secondly, is the system still in this state? If so, dump the log to

No. The first 2 nodes have been xfs_repaired. One repair completed, and 
the result was a terrible mess.
On the second, xfs_repair segfaulted. I will try with a newer xfs_repair 
on a 3.4 kernel.

The 3rd one is now ok, after booting on 3.4 kernel.

> a file using xfs_logprint, zip it up and send it to me so I can have
> a look at where the log is intact (i.e. likely xfs bug) or contains
> zero (likely storage bug).
>
> If the system is not still in this state, then I'm afraid there's
> nothing that can be done to understand the problem.

I'll try to reproduce a similar problem.




> You've had two machines crash with problems in the mm subsystem, and
> one filesystem problem that might be hardware realted. Bit early to
> be blaming XFS for all your problems, I think....

I'm not trying to blame XFS. I've been very confident in it for a long 
time. BUT I see very different behaviour in those 3 cases. Nothing 
conclusive yet. I think the problem is related to kernel 3.6, maybe in 
the dm layer.
I don't think it's hardware related: different disks, different 
controllers, different machines.

The common point is :
-XFS
-Kernel 3.6.xx
-Device Mapper + LVM

>> xfs_repair -n seems to show volume is quite broken :
> Sure, if the log hasn't been replayed then it will be - the
> filesystem will only be consistent after log recovery has been run.
>

Yes, but I have had to use xfs_repair -L in the past (power outages, hardware 
failures) and never had such disastrous repairs.

At least on the first 2 failures, I can understand: there is lots of 
data, the journal is BIG, and there are a lot of I/O transactions in flight.
On the 3rd failure I'm very skeptical: low I/O load, small volume.

> You should report the mm problems to linux-mm@kvack.org to make sure
> the right people see them and they don't get lost in the noise of
> lkml....

Yes, point taken.

I'll now try to reproduce this kind of behaviour on a very small 
volume (10 GB for example) so I can confirm or invalidate the scenario above.

Thanks for your time,


-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr



* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-26 10:03       ` Yann Dupont
@ 2012-10-26 22:05         ` Yann Dupont
  2012-10-28 23:48           ` Dave Chinner
  0 siblings, 1 reply; 19+ messages in thread
From: Yann Dupont @ 2012-10-26 22:05 UTC (permalink / raw)
  To: xfs

Le 26/10/2012 12:03, Yann Dupont a écrit :
> Le 25/10/2012 23:10, Dave Chinner a écrit :
>
> I'll try now to reproduce this kind of behaviour on a verry little 
> volume (10 GB for exemple) so I can confirm or inform the given 
> scenario .
>

This is reproducible. Here is how to do it:

- Start a 3.6.2 kernel.

- Create a fresh 20 GB LVM volume on a local disk.
- Run mkfs.xfs on it, with default options.
- Mount it with default options.
- Launch something that hammers this volume. I launched compilebench 
0.6 on it.
- Wait some time to fill memory and buffers, and be sure your disks are 
really busy. I waited some minutes after the initial 30 kernel unpackings 
in compilebench.
- Hard reset the server (I'm using the iDRAC of the server to generate a 
power cycle).
- After a few tries, I finally could not mount the xfs volume, with the 
error reported in previous mails. So far this is normal.
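The recipe above, condensed into a dry-run script (the volume group, device names, and compilebench flags are my assumptions; the `run` helper only echoes, so nothing destructive executes here):

```shell
# Dry-run sketch of the reproduction recipe. All names are illustrative.
run() { echo "+ $*"; }            # swap the body for "$@" to really execute

run lvcreate -L 20G -n crashdisk LocalDisk       # fresh 20 GB LV
run mkfs.xfs /dev/LocalDisk/crashdisk            # default mkfs options
run mount /dev/LocalDisk/crashdisk /mnt/tempo    # default mount options
run compilebench -D /mnt/tempo -i 30             # hammer the volume
# ...wait until memory/buffers fill and the disks are busy, then hard
# power-cycle the machine (e.g. via the iDRAC) and try to mount again.
```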

xfs_logprint doesn't say much:

xfs_logprint:
     data device: 0xfe02
     log device: 0xfe02 daddr: 10485792 length: 20480

Header 0x7c wanted 0xfeedbabe
**********************************************************************
* ERROR: header cycle=124         block=5414 *
**********************************************************************
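For context, 0xfeedbabe is the magic number of an XFS log record header, so the message means the header at that block held 0x7c instead. A toy sketch of the check (the `hdr.bin` file is fabricated here for illustration; the magic sits big-endian in the first 4 bytes of each 512-byte header):

```shell
# Fabricate one 512-byte "log record header" carrying the expected magic.
printf '\376\355\272\276' > hdr.bin   # 0xfe 0xed 0xba 0xbe as octal escapes
truncate -s 512 hdr.bin               # pad to one sector

magic=$(od -An -tx1 -N4 hdr.bin | tr -d ' \n')
if [ "$magic" = "feedbabe" ]; then
    echo "header magic OK"
else
    echo "Header 0x$magic wanted 0xfeedbabe"
fi
```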

I tried xfs_logprint -c; it gave a 22 MB file. You can grab it here:
http://filex.univ-nantes.fr/get?k=QnBXivz2J3LmzJ18uBV

- Rebooted into 3.4.15.
- xfs_logprint gives the exact same result as with 3.6.2 (diff shows 
no differences).

But on 3.4.15, I can mount the volume without problem; the log is replayed.


For information, here is the xfs_info output for the volume:

root@label5:/mnt/debug# xfs_info /mnt/tempo
meta-data=/dev/mapper/LocalDisk-crashdisk isize=256    agcount=8, 
agsize=655360 blks
          =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=5242880, imaxpct=25
          =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=2560, version=2
          =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0


Does this help you?
Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr



* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-26 22:05         ` Yann Dupont
@ 2012-10-28 23:48           ` Dave Chinner
  2012-10-29  1:25             ` Dave Chinner
  2012-10-29  8:07             ` Yann Dupont
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Chinner @ 2012-10-28 23:48 UTC (permalink / raw)
  To: Yann Dupont; +Cc: xfs

On Sat, Oct 27, 2012 at 12:05:34AM +0200, Yann Dupont wrote:
> Le 26/10/2012 12:03, Yann Dupont a écrit :
> >Le 25/10/2012 23:10, Dave Chinner a écrit :
> >
> >I'll try now to reproduce this kind of behaviour on a verry little
> >volume (10 GB for exemple) so I can confirm or inform the given
> >scenario .
> >
> 
> This is reproductible. Here is how to do it :
> 
> - Started a 3.6.2 kernel.
> 
> - I created a fresh lvm volume on localdisk of 20 GB.

Can you reproduce the problem without LVM?

> - mkfs.xfs on it, with default options
> - mounted with default options
> - launch something that hammers this volume. I launched compilebench
> 0.6  on it
> - wait some time to fill memory,buffers, and be sure your disks are
> really busy. I waited some minutes after the initial 30 kernel
> unpacking in compilebench
> - hard reset the server (I'm using the Idrac of the server to
> generate a power cycle)
> - After some try, I finally had the impossibility to mount the xfs
> volume, with the error reported in previous mails. So far this is
> normal .

So it doesn't happen every time, and it may be power cycle related.
What is your "local disk"?

> 
> xfs_logprint don't say much :
> 
> xfs_logprint:
>     data device: 0xfe02
>     log device: 0xfe02 daddr: 10485792 length: 20480
> 
> Header 0x7c wanted 0xfeedbabe
> **********************************************************************
> * ERROR: header cycle=124         block=5414 *
> **********************************************************************

You didn't look past the initial error, did you? The file is only
482280 lines long, and 482200 lines of that are decoded log data....
:)

> I tried xfs_logprint -c , it gaves a 22M file. You can grab it here :
> http://filex.univ-nantes.fr/get?k=QnBXivz2J3LmzJ18uBV

I really need the raw log data, not the parsed output. The logprint
command to do that is "-C <file>", not "-c".

> - Rebooted 3.4.15
> - xfs_logprint gives the exact same result that with 3.6.2 (diff
> tells no differences)

Given that it's generated by the logprint application, I'd expect it
to be identical.

> but on 3.4.15, I can mount the volume without problem, log is
> replayed.

> for information here is xfs_info of the volume :
> 
> here is xfs_info output
> 
> root@label5:/mnt/debug# xfs_info /mnt/tempo
> meta-data=/dev/mapper/LocalDisk-crashdisk isize=256    agcount=8,
> agsize=655360 blks

How did you get a default of 8 AGs? That seems wrong.  What version
of mkfs.xfs are you using?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-28 23:48           ` Dave Chinner
@ 2012-10-29  1:25             ` Dave Chinner
  2012-10-29  8:11               ` Yann Dupont
  2012-10-29 12:18               ` Dave Chinner
  2012-10-29  8:07             ` Yann Dupont
  1 sibling, 2 replies; 19+ messages in thread
From: Dave Chinner @ 2012-10-29  1:25 UTC (permalink / raw)
  To: Yann Dupont; +Cc: xfs

On Mon, Oct 29, 2012 at 10:48:02AM +1100, Dave Chinner wrote:
> On Sat, Oct 27, 2012 at 12:05:34AM +0200, Yann Dupont wrote:
> > Le 26/10/2012 12:03, Yann Dupont a écrit :
> > >Le 25/10/2012 23:10, Dave Chinner a écrit :
> > >
> > >I'll try now to reproduce this kind of behaviour on a verry little
> > >volume (10 GB for exemple) so I can confirm or inform the given
> > >scenario .
> > >
> > 
> > This is reproductible. Here is how to do it :
> > 
> > - Started a 3.6.2 kernel.
> > 
> > - I created a fresh lvm volume on localdisk of 20 GB.
> 
> Can you reproduce the problem without LVM?
> 
> > - mkfs.xfs on it, with default options
> > - mounted with default options
> > - launch something that hammers this volume. I launched compilebench
> > 0.6  on it
> > - wait some time to fill memory,buffers, and be sure your disks are
> > really busy. I waited some minutes after the initial 30 kernel
> > unpacking in compilebench
> > - hard reset the server (I'm using the Idrac of the server to
> > generate a power cycle)
> > - After some try, I finally had the impossibility to mount the xfs
> > volume, with the error reported in previous mails. So far this is
> > normal .
> 
> So it doesn't happen every time, and it may be power cycle related.
> What is your "local disk"?

I can't reproduce this with a similar setup but using KVM (i.e.
killing the VM instead of power cycling) or forcing a shutdown of
the filesystem without flushing the log. The second case is very
much the same as power cycling, but without the potential "power
failure caused partial IOs to be written" problem.

The only thing I can see in the logprint that I haven't seen so far
in my testing is that your log print indicates a checkpoint that
wraps the end of the log. I haven't yet hit that situation by
chance, so I'll keep trying to see if that's the case that is
causing the problem....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-28 23:48           ` Dave Chinner
  2012-10-29  1:25             ` Dave Chinner
@ 2012-10-29  8:07             ` Yann Dupont
  2012-10-29  8:17               ` Yann Dupont
  1 sibling, 1 reply; 19+ messages in thread
From: Yann Dupont @ 2012-10-29  8:07 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Le 29/10/2012 00:48, Dave Chinner a écrit :
>
> This is reproductible. Here is how to do it :
>
> - Started a 3.6.2 kernel.
>
> - I created a fresh lvm volume on localdisk of 20 GB.
> Can you reproduce the problem without LVM?

Hello Dave. That is THE question. My intent was to test with and without 
LVM, but right now I can't, because all my disks are consumed by LVM.
In fact my test setup doesn't even have enough space to locally clone the 
volume where I have errors. I only have 146 sas disks on this machine.

I have to set up another test platform, and as I'm currently traveling, 
it won't be easy before next week.

What I want to try this week is to:
- re-crash a volume, possibly smaller
- download the image to another machine with kvm
- see if I have the mounting problems inside this kvm
- begin to bisect the kernel
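The bisect step might look like this (dry run; the good/bad tags are assumptions based on the kernel versions discussed, and `run` only echoes):

```shell
run() { echo "+ $*"; }     # swap the body for "$@" to actually execute

run git bisect start
run git bisect bad v3.6    # first series seen failing
run git bisect good v3.4   # last series known good
# Build and boot each candidate, rerun the crash/mount test, then mark it:
run git bisect good        # or "git bisect bad", depending on the result
```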

...
>
> - After some try, I finally had the impossibility to mount the xfs
> volume, with the error reported in previous mails. So far this is
> normal .
> So it doesn't happen every time, and it may be power cycle related.

Yes, during my tests, I had to power cycle 3 or 4 times before hitting 
the actual problem.

> What is your "local disk"?

A RAID1 array (2 disks) on mptsas.
>
>> xfs_logprint don't say much :
>>
>> xfs_logprint:
>>      data device: 0xfe02
>>      log device: 0xfe02 daddr: 10485792 length: 20480
>>
>> Header 0x7c wanted 0xfeedbabe
>> **********************************************************************
>> * ERROR: header cycle=124         block=5414 *
>> **********************************************************************
> You didn't look past the initial error, did you? The file is only
> 482280 lines long, and 482200 lines of that are decoded log data....
> :)
Well, I had tried with -c, but sorry, I didn't have any experience with 
xfs_logprint so far.
>
>> I tried xfs_logprint -c , it gaves a 22M file. You can grab it here :
>> http://filex.univ-nantes.fr/get?k=QnBXivz2J3LmzJ18uBV
> I really need the raw log data, not the parsed output. The logprint
> command to do that is "-C <file>", not "-c".

OK... I should have read the man page more carefully. Time to restart a 
crash session.

>> - Rebooted 3.4.15
>> - xfs_logprint gives the exact same result that with 3.6.2 (diff
>> tells no differences)
> Given that it's generated by the logprint application, I'd expect it
> to be identical.
Me too, but I'd also expect log replay to be identical between 
the 2 kernels.

>
>> but on 3.4.15, I can mount the volume without problem, log is
>> replayed.
>> for information here is xfs_info of the volume :
>>
>> here is xfs_info output
>>
>> root@label5:/mnt/debug# xfs_info /mnt/tempo
>> meta-data=/dev/mapper/LocalDisk-crashdisk isize=256    agcount=8,
>> agsize=655360 blks
> How did you get a default of 8 AGs? That seems wrong.  What version
> of mkfs.xfs are you using?
root@label5:~# mkfs.xfs -V
mkfs.xfs version 3.1.7

The volume was freshly formatted, with default options. Absolutely 
nothing special on my side.
Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr



* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-29  1:25             ` Dave Chinner
@ 2012-10-29  8:11               ` Yann Dupont
  2012-10-29 12:21                 ` Dave Chinner
  2012-10-29 12:18               ` Dave Chinner
  1 sibling, 1 reply; 19+ messages in thread
From: Yann Dupont @ 2012-10-29  8:11 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

Le 29/10/2012 02:25, Dave Chinner a écrit :
> I can't reproduce this with a similar setup but using KVM (i.e. 
> killing the VM instead of power cycling) or forcing a shutdown of the 
> filesystem without flushing the log. The second case is very much the 
> same as power cycling, but without the potential "power failure caused 
> partial IOs to be written" problem. The only thing I can see in the 
> logprint that I haven't seen so far in my testing is that your log 
> print indicates a checkpoint that wraps the end of the log. I haven't 
> yet hit that situation by chance, so I'll keep trying to see if that's 
> the case that is causing the problem.... Cheers, Dave. 

OK, was your KVM guest LVM-enabled?

I'll try to re-crash the FS; this time I'll make an image of it on 
another machine for further testing. And I'll supply a useful logprint.

Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-29  8:07             ` Yann Dupont
@ 2012-10-29  8:17               ` Yann Dupont
  0 siblings, 0 replies; 19+ messages in thread
From: Yann Dupont @ 2012-10-29  8:17 UTC (permalink / raw)
  To: xfs

On 29/10/2012 09:07, Yann Dupont wrote:
>
> Hello Dave. That is THE question. My intent was to test with and 
> without LVM, but right now I can't, because all my disks are consumed 
> by LVM.
> In fact my test setup doesn't even have enough space to locally clone 
> the volume where I have errors. I only have 146 sas disks on this machine.
Whoops, read "146 GB SAS disks".

If I really had 146 SAS disks on that machine, I wouldn't have disk 
space problems :)
Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-29  1:25             ` Dave Chinner
  2012-10-29  8:11               ` Yann Dupont
@ 2012-10-29 12:18               ` Dave Chinner
  2012-10-29 12:43                 ` Yann Dupont
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Chinner @ 2012-10-29 12:18 UTC (permalink / raw)
  To: Yann Dupont; +Cc: xfs

On Mon, Oct 29, 2012 at 12:25:40PM +1100, Dave Chinner wrote:
> On Mon, Oct 29, 2012 at 10:48:02AM +1100, Dave Chinner wrote:
> > On Sat, Oct 27, 2012 at 12:05:34AM +0200, Yann Dupont wrote:
> > > > On 26/10/2012 12:03, Yann Dupont wrote:
> > > > >On 25/10/2012 23:10, Dave Chinner wrote:
> > > - mkfs.xfs on it, with default options
> > > - mounted with default options
> > > - launch something that hammers this volume. I launched compilebench
> > > 0.6 on it
> > > - wait some time to fill memory and buffers, and be sure your disks
> > > are really busy. I waited some minutes after the initial 30 kernel
> > > unpacks in compilebench
> > > - hard reset the server (I'm using the iDRAC of the server to
> > > generate a power cycle)
> > > - After some tries, I finally could not mount the xfs volume, with
> > > the error reported in previous mails. So far this is normal.
> > 
> > So it doesn't happen every time, and it may be power cycle related.
> > What is your "local disk"?
> 
> I can't reproduce this with a similar setup but using KVM (i.e.
> killing the VM instead of power cycling) or forcing a shutdown of
> the filesystem without flushing the log. The second case is very
> much the same as power cycling, but without the potential "power
> failure caused partial IOs to be written" problem.
> 
> The only thing I can see in the logprint that I haven't seen so far
> in my testing is that your log print indicates a checkpoint that
> wraps the end of the log. I haven't yet hit that situation by
> chance, so I'll keep trying to see if that's the case that is
> causing the problem....

Well, it's taken about 12 hours of randomly varying the parameters
in this loop:

mkfs.xfs -f /dev/vdb
mount /dev/vdb /mnt/scratch
./compilebench -D /mnt/scratch &
sleep <some period>
/home/dave/src/xfstests-dev/src/godown /mnt/scratch
sleep 5
umount /mnt/scratch
xfs_logprint -d /dev/vdb

to get a log with a wrapped checkpoint to occur. That was with <some
period> equal to 36s. In all that time I hadn't seen a single log
mount failure, and then, the moment I got a wrapped log:

1917 HEADER Cycle 10 tail 9:018456 len  32256 ops 468
1981 HEADER Cycle 10 tail 9:018456 len  32256 ops 427
            ^^^^^^^^^^^^^^^
[00000 - 02045] Cycle 0x0000000a New Cycle 0x00000009

[  368.364232] XFS (vdb): Mounting Filesystem
[  369.096144] XFS (vdb): Starting recovery (logdev: internal)
[  369.126545] XFS (vdb): xlog_recover_process_data: bad clientid 0x2c
[  369.129522] XFS (vdb): log mount/recovery failed: error 5
[  369.131884] XFS (vdb): log mount failed

Ok, so no LVM, no power failure involved, etc. Dig deeper. Let's see
if logprint can dump the transactional record of the log:

# xfs_logprint -f log.img -t
.....
LOG REC AT LSN cycle 9 block 20312 (0x9, 0x4f58)

LOG REC AT LSN cycle 9 block 20376 (0x9, 0x4f98)
xfs_logprint: failed in xfs_do_recovery_pass, error: 12288
#

Ok, xfs_logprint failed to decode the wrapped transaction at the end
of the log. I can't see anything obviously wrong with the contents
of the log off the top of my head (logprint is notoriously buggy),
but the above command can reproduce the problem (3 out of 3 so far),
so I should be able to track down the bug from this.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-29  8:11               ` Yann Dupont
@ 2012-10-29 12:21                 ` Dave Chinner
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Chinner @ 2012-10-29 12:21 UTC (permalink / raw)
  To: Yann Dupont; +Cc: xfs

On Mon, Oct 29, 2012 at 09:11:26AM +0100, Yann Dupont wrote:
> On 29/10/2012 02:25, Dave Chinner wrote:
> >I can't reproduce this with a similar setup but using KVM (i.e.
> >killing the VM instead of power cycling) or forcing a shutdown of
> >the filesystem without flushing the log. .....
> 
> OK, was your KVM guest LVM-enabled?

No. The idea being that if it is an XFS problem, then it will show
up without needing LVM. And it did.

> I'll try to crash the FS again; this time I'll make an image of it on
> another machine for further testing, and I'll supply a useful
> logprint

No need, I have a simple local reproducer now based on your example.
I should be able to find the problem from here....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-29 12:18               ` Dave Chinner
@ 2012-10-29 12:43                 ` Yann Dupont
  2012-10-30  1:33                   ` Dave Chinner
  0 siblings, 1 reply; 19+ messages in thread
From: Yann Dupont @ 2012-10-29 12:43 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 29/10/2012 13:18, Dave Chinner wrote:
> On Mon, Oct 29, 2012 at 12:25:40PM +1100, Dave Chinner wrote:
> .....
> Ok, xfs_logprint failed to decode the wrapped transaction at the end
> of the log. I can't see anything obviously wrong with the contents
> of the log off the top of my head (logprint is notoriously buggy),
> but the above command can reproduce the problem (3 out of 3 so far),
> so I should be able to track down the bug from this.
>
> Cheers,
>
> Dave.
OK, very glad to hear you were able to reproduce it.
Good luck, and now let the chase begin :)

Cheers,


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-29 12:43                 ` Yann Dupont
@ 2012-10-30  1:33                   ` Dave Chinner
  2012-10-31 11:45                     ` Gaudenz Steinlin
  2012-11-05 13:57                     ` Yann Dupont
  0 siblings, 2 replies; 19+ messages in thread
From: Dave Chinner @ 2012-10-30  1:33 UTC (permalink / raw)
  To: Yann Dupont; +Cc: xfs

On Mon, Oct 29, 2012 at 01:43:00PM +0100, Yann Dupont wrote:
> .....
> OK, very glad to hear you were able to reproduce it.
> Good luck, and now let the chase begin :)

Not really a huge chase, just a simple matter of isolation. The
patch below should fix the problem.

However, the fact that recovery succeeded on 3.4 means you may have
a corrupted filesystem. The bug has been present since 3.0-rc1
(which was a fix for vmap memory leaks), and recovery is trying to
replay stale items from the previous log buffer. As such, it is
possible that changes from a previous checkpoint have overwritten
more recent changes in the current checkpoint. So you should
probably run xfs_repair -n over the filesystems that you remounted
on 3.4 that failed on 3.6, just to make sure they are OK.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: fix reading of wrapped log data

From: Dave Chinner <dchinner@redhat.com>

Commit 4439647 ("xfs: reset buffer pointers before freeing them") in
3.0-rc1 introduced a regression when recovering log buffers that
wrapped around the end of log. The second part of the log buffer at
the start of the physical log was being read into the header buffer
rather than the data buffer, and hence recovery was seeing garbage
in the data buffer when it got to the region of the log buffer that
was incorrectly read.

Cc: <stable@vger.kernel.org> # 3.0.x, 3.2.x, 3.4.x 3.6.x
Reported-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_log_recover.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index e445550..02ff9a8 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3646,7 +3646,7 @@ xlog_do_recovery_pass(
 				 *   - order is important.
 				 */
 				error = xlog_bread_offset(log, 0,
-						bblks - split_bblks, hbp,
+						bblks - split_bblks, dbp,
 						offset + BBTOB(split_bblks));
 				if (error)
 					goto bread_err2;


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-30  1:33                   ` Dave Chinner
@ 2012-10-31 11:45                     ` Gaudenz Steinlin
  2012-11-05 13:57                     ` Yann Dupont
  1 sibling, 0 replies; 19+ messages in thread
From: Gaudenz Steinlin @ 2012-10-31 11:45 UTC (permalink / raw)
  To: linux-xfs


Hi

Dave Chinner <david <at> fromorbit.com> writes:

> 
> On Mon, Oct 29, 2012 at 01:43:00PM +0100, Yann Dupont wrote:
> > On 29/10/2012 13:18, Dave Chinner wrote:
> > OK, very glad to hear you were able to reproduce it.
> > Good luck, and now let the chase begin :)
> 
> Not really a huge chase, just a simple matter of isolation. The
> patch below should fix the problem.
> 

I ran into the same bug this morning with my home partition after a crash on
suspend to ram. I can confirm that the patch posted to the list by Dave fixes
the problem. I have a backup of the raw log and the whole filesystem if you need
it for further investigation.

BTW: has this fix already been sent upstream? I could not find it
anywhere, but then it may just not be there yet.

Gaudenz




* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-10-30  1:33                   ` Dave Chinner
  2012-10-31 11:45                     ` Gaudenz Steinlin
@ 2012-11-05 13:57                     ` Yann Dupont
  1 sibling, 0 replies; 19+ messages in thread
From: Yann Dupont @ 2012-11-05 13:57 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 30/10/2012 02:33, Dave Chinner wrote:
>
> Not really a huge chase, just a simple matter of isolation. The
> patch below should fix the problem.

Yes, it does.
Thanks a lot for the fast answer and this one-letter patch!


> However, the fact that recovery succeeded on 3.4 means you may have
> a corrupted filesystem. The bug has been present since 3.0-rc1
> (which was a fix for vmap memory leaks), and recovery is trying to
> replay stale items from the previous log buffer. As such, it is
> possible that changes from a previous checkpoint have overwritten
> more recent changes in the current checkpoint. As such, you should

Ouch.

> probably run xfs_repair -n over the filesystems that you remounted
> on 3.4 that failed on 3.6 just to make sure they are OK.
Will do.

As someone else pointed out, I think it should go to stable.
Thanks a lot for your time,
Cheers

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
@ 2012-11-28  9:39 reste donewell
  2012-11-28 20:37 ` Dave Chinner
  0 siblings, 1 reply; 19+ messages in thread
From: reste donewell @ 2012-11-28  9:39 UTC (permalink / raw)
  To: xfs


Hi, 

The proposed patch, which works, still hasn't landed in kernel version
3.6.6. I think it should go to stable.
Is anybody working on this?

Thanks in advance,

Eric Thiele


* Re: Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?)
  2012-11-28  9:39 Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?) reste donewell
@ 2012-11-28 20:37 ` Dave Chinner
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Chinner @ 2012-11-28 20:37 UTC (permalink / raw)
  To: xfs; +Cc: xfs

On Wed, Nov 28, 2012 at 10:39:38AM +0100, reste donewell wrote:
> Hi, 
> 
> The proposed patch, which works, still hasn't landed in kernel version
> 3.6.6. I think it should go to stable.
> Is anybody working on this?

It's in 3.6.7.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


end of thread, other threads:[~2012-11-28 20:35 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-11-28  9:39 Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?) reste donewell
2012-11-28 20:37 ` Dave Chinner
  -- strict thread matches above, loose matches on Subject: below --
2012-10-22 14:14 Is kernel 3.6.1 or filestreams option toxic ? Yann Dupont
2012-10-23  8:24 ` Problems with kernel 3.6.x (vm ?) (was : Is kernel 3.6.1 or filestreams option toxic ?) Yann Dupont
2012-10-25 15:21   ` Yann Dupont
2012-10-25 20:55     ` Yann Dupont
2012-10-25 21:10     ` Dave Chinner
2012-10-26 10:03       ` Yann Dupont
2012-10-26 22:05         ` Yann Dupont
2012-10-28 23:48           ` Dave Chinner
2012-10-29  1:25             ` Dave Chinner
2012-10-29  8:11               ` Yann Dupont
2012-10-29 12:21                 ` Dave Chinner
2012-10-29 12:18               ` Dave Chinner
2012-10-29 12:43                 ` Yann Dupont
2012-10-30  1:33                   ` Dave Chinner
2012-10-31 11:45                     ` Gaudenz Steinlin
2012-11-05 13:57                     ` Yann Dupont
2012-10-29  8:07             ` Yann Dupont
2012-10-29  8:17               ` Yann Dupont

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox