From: Yann Dupont
To: xfs@oss.sgi.com
Date: Mon, 22 Oct 2012 16:14:07 +0200
Subject: Is kernel 3.6.1 or filestreams option toxic ?
Message-ID: <508554AF.5050005@univ-nantes.fr>

Hello,

Last week I ran into problems with XFS volumes on several machines. The
kernel hung under heavy load and I had to hard-reset. After reboot the
XFS volume could not be mounted anymore, and xfs_repair did not manage
to recover the volume cleanly on 2 different machines.

To put things in perspective: it wasn't production data, so whether I
recover it or not doesn't really matter. What matters to me is
understanding why things went wrong...

I have been using XFS for a long time, on lots of data, and this is the
first time I have hit such a problem. But I was using an unusual option,
filestreams, and was running kernel 3.6.1, so I wonder whether either
has something to do with the crash.
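To be concrete about the filestreams part, the volumes were mounted with
the stock XFS mount option, along the lines of the sketch below (the
device name /dev/sdX is a placeholder, the mount point is the real one):

  # enable the XFS filestreams allocator, which keeps each directory's
  # stream of files in its own allocation group to limit interleaving
  # and fragmentation between concurrent writers
  mount -t xfs -o filestreams /dev/sdX /XCEPH-PROD/data/osd.8

  # or persistently, via /etc/fstab:
  /dev/sdX  /XCEPH-PROD/data/osd.8  xfs  filestreams  0  0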
I have nothing very conclusive in the kernel logs, apart from this:

Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569890] INFO: task ceph-osd:17856 blocked for more than 120 seconds.
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569941] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569987] ceph-osd D ffff88056416b1a0 0 17856 1 0x00000000
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.569993] ffff88056416aed0 0000000000000086 ffff880590751fd8 ffff88000c67eb00
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570047] ffff880590751fd8 ffff880590751fd8 ffff880590751fd8 ffff88056416aed0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570101] 0000000000000001 ffff88056416aed0 ffff880a15240d00 ffff880a15240d60
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570156] Call Trace:
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570187] [] ? exit_mm+0x85/0x120
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570216] [] ? do_exit+0x154/0x8e0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570248] [] ? file_update_time+0xa9/0x100
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570278] [] ? do_group_exit+0x38/0xa0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570309] [] ? get_signal_to_deliver+0x1a6/0x5e0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570341] [] ? do_signal+0x4e/0x970
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570371] [] ? fsnotify+0x24e/0x340
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570402] [] ? fpu_finit+0x15/0x30
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570431] [] ? restore_i387_xstate+0x64/0x1c0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570464] [] ? sys_futex+0x92/0x1b0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570493] [] ? do_notify_resume+0x75/0xc0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570525] [] ? int_signal+0x12/0x17
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570553] INFO: task ceph-osd:17857 blocked for more than 120 seconds.
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570583] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570628] ceph-osd D ffff8801161fe720 0 17857 1 0x00000000
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570632] ffff8801161fe450 0000000000000086 ffffffffffffffe0 ffff880a17c73c30
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570687] ffff88011347ffd8 ffff88011347ffd8 ffff88011347ffd8 ffff8801161fe450
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570740] ffff8801161fe450 ffff8801161fe450 ffff880a15240d00 ffff880a15240d60
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570794] Call Trace:
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570818] [] ? exit_mm+0x85/0x120
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570846] [] ? do_exit+0x154/0x8e0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570875] [] ? do_group_exit+0x38/0xa0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570905] [] ? get_signal_to_deliver+0x1a6/0x5e0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570935] [] ? do_signal+0x4e/0x970
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570967] [] ? sys_sendto+0x114/0x150
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.570996] [] ? sys_futex+0x92/0x1b0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571024] [] ? do_notify_resume+0x75/0xc0
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571054] [] ? int_signal+0x12/0x17
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571082] INFO: task ceph-osd:17858 blocked for more than 120 seconds.
Oct 14 14:37:21 hanyu.u14.univ-nantes.prive kernel: [532905.571111] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

I wasn't able to cleanly shut down the servers after that. On 2
machines, the XFS volumes (12 TB each) could not be mounted anymore
after the hard reset, and needed xfs_repair -L ...
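For completeness, the recovery attempt on each of the two broken volumes
was roughly the sequence below (again, /dev/sdX stands in for the actual
fibre channel device):

  # mounting after the hard reset fails: log replay does not succeed
  mount -t xfs /dev/sdX /XCEPH-PROD/data/osd.8

  # a plain repair refuses to run on a dirty log; it asks to either
  # mount the filesystem to replay the log, or use -L
  xfs_repair /dev/sdX

  # last resort: -L zeroes the log before repairing, which discards
  # whatever metadata updates were still sitting in the unreplayed log
  xfs_repair -L /dev/sdX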
On one machine, xfs_repair ran to completion, but with millions of
errors, and this is what was left in the end :(

344010712   /XCEPH-PROD/data/osd.8
6841649480  /XCEPH-PROD/data/lost+found/

(If those du-style counts are in KB, that is roughly 340 GB still in
place versus 6.8 TB swept into lost+found.) I understand that
xfs_repair -L always leads to some data loss, but surely not to that
extent?

On the other one, xfs_repair segfaults, after lots of messages like
these (I mean, really lots):

block (0,1008194-1008194) multiply claimed by cnt space tree, state - 2
block (0,1008200-1008200) multiply claimed by cnt space tree, state - 2
block (0,1012323-1012323) multiply claimed by cnt space tree, state - 2
...
agf_freeblks 87066179, counted 87066033 in ag 0
agi_freecount 489403, counted 488952 in ag 0
agi unlinked bucket 1 is 7681 in ag 0 (inode=7681)
agi unlinked bucket 5 is 67781 in ag 0 (inode=67781)
agi unlinked bucket 6 is 10950 in ag 0 (inode=10950)
...
block (3,30847085-30847085) multiply claimed by cnt space tree, state - 2
block (3,27384823-27384823) multiply claimed by cnt space tree, state - 2
block (3,30115747-30115747) multiply claimed by cnt space tree, state - 2
...
agf_freeblks 90336213, counted 302201427 in ag 3
agf_longest 6144, counted 167772160 in ag 3
inode chunk claims used block, inobt block - agno 3, bno 2380, inopb 16
inode chunk claims used block, inobt block - agno 3, bno 280918, inopb 16
...
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
found inodes not in the inode allocation tree
        - process known inodes and perform inode discovery...
        - agno = 0
7f1738c17700: Badness in key lookup (length)
bp=(bno 2848, len 16384 bytes) key=(bno 2848, len 8192 bytes)
7f1738c17700: Badness in key lookup (length)
bp=(bno 3840, len 16384 bytes) key=(bno 3840, len 8192 bytes)
7f1738c17700: Badness in key lookup (length)
bp=(bno 5456, len 16384 bytes) key=(bno 5456, len 8192 bytes)
...

... and in the end, xfs_repair segfaults.

Those machines are part of a 12-machine Ceph cluster (Ceph itself is
pure user-space). All nodes are independent (not in the same computer
room), but all had been running 3.6.1 for a few days, and all were using
XFS with the filestreams option (I was trying to prevent XFS
fragmentation). Could that be related, given that this is the first time
I encounter such a disastrous data loss?

I don't have many more relevant details, which makes this mail a poor
bug report...

If it matters, I can provide more details about the way those kernels
hung (Ceph node reweights, stressing the hardware, lots of I/O), about
the servers and fibre channel disks, and so on.

Cheers,

-- 
Yann Dupont - Service IRTS, DSI Université de Nantes
Tel : 02.53.48.49.20 - Mail/Jabber : Yann.Dupont@univ-nantes.fr

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs