From mboxrd@z Thu Jan 1 00:00:00 1970
From: Joseph Qi
Date: Thu, 21 Aug 2014 09:59:42 +0800
Subject: [Ocfs2-devel] Cluster blocked, so as to reboot all nodes to avoid it. Is there any patchs for it? Thanks.
In-Reply-To: <71604351584F6A4EBAE558C676F37CA42EC6360C@H3CMLB14-EX.srv.huawei-3com.com>
References: <71604351584F6A4EBAE558C676F37CA42EC6360C@H3CMLB14-EX.srv.huawei-3com.com>
Message-ID: <53F5528E.2030608@huawei.com>
List-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
To: ocfs2-devel@oss.oracle.com

>From the stack, it seems that it blocks on loading journal during mount.
Has it already been owned by another node?
Try debugfs.ocfs2 'fs_locks -B' and 'dlm_locks xxx' to find out why.

On 2014/8/21 9:07, Guozhonghua wrote:
> Hi, everyone
>
> And we have the blocked cluster several times, and the log is always, we have to reboot all the node of the cluster to avoid it.
>
> Is there any patch that had fix this bug?
>
> [] schedule_timeout+0x1e5/0x250
> [] wait_for_completion+0xa7/0x160
> [] ? try_to_wake_up+0x2c0/0x2c0
> [] __ocfs2_cluster_lock.isra.30+0x1f3/0x820 [ocfs2]
>
> As we test with a lot of node in one cluster, may be ten or twenty nodes, the cluster is always blocked, and the log is below,
>
> The kernel version is 3.13.6.
>
> Aug 20 10:05:43 server211 kernel: [82025.281828] Tainted: GF W O 3.13.6 #5
> Aug 20 10:05:43 server211 kernel: [82025.281830] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 20 10:05:43 server211 kernel: [82025.281833] mount.ocfs2 D 0000000000000000 0 57890 57889 0x00000000
> Aug 20 10:05:43 server211 kernel: [82025.281838] ffff880427e03888 0000000000000002 ffff880427e03828 ffffffff8101cba3
> Aug 20 10:05:43 server211 kernel: [82025.281842] ffff8804270a1810 0000000000014440 ffff880427e03fd8 0000000000014440
> Aug 20 10:05:43 server211 kernel: [82025.281845] ffff88042958e040 ffff8804270a1810 ffff8804270a1810 ffff880427e03a60
> Aug 20 10:05:43 server211 kernel: [82025.281849] Call Trace:
> Aug 20 10:05:43 server211 kernel: [82025.281862] [] ? native_sched_clock+0x13/0x80
> Aug 20 10:05:43 server211 kernel: [82025.281867] [] schedule+0x29/0x70
> Aug 20 10:05:43 server211 kernel: [82025.281870] [] schedule_timeout+0x1e5/0x250
> Aug 20 10:05:43 server211 kernel: [82025.281874] [] wait_for_completion+0xa7/0x160
> Aug 20 10:05:43 server211 kernel: [82025.281879] [] ? try_to_wake_up+0x2c0/0x2c0
> Aug 20 10:05:43 server211 kernel: [82025.281907] [] __ocfs2_cluster_lock.isra.30+0x1f3/0x820 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281910] [] ? out_of_line_wait_on_bit+0x7c/0x90
> Aug 20 10:05:43 server211 kernel: [82025.281922] [] ? ocfs2_inode_lock_res_init+0x73/0x160 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281934] [] ocfs2_inode_lock_full_nested+0x13a/0xb80 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281958] [] ? ocfs2_iget+0x121/0x7d0 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281971] [] ocfs2_journal_init+0x92/0x480 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281986] [] ocfs2_fill_super+0x15a1/0x25a0 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281992] [] ? vsnprintf+0x309/0x600
> Aug 20 10:05:43 server211 kernel: [82025.281998] [] mount_bdev+0x1b9/0x200
> Aug 20 10:05:43 server211 kernel: [82025.282011] [] ? ocfs2_initialize_super.isra.208+0x1470/0x1470 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.282022] [] ocfs2_mount+0x15/0x20 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.282025] [] mount_fs+0x43/0x1b0
> Aug 20 10:05:43 server211 kernel: [82025.282029] [] vfs_kern_mount+0x76/0x130
> Aug 20 10:05:43 server211 kernel: [82025.282032] [] do_mount+0x237/0xa90
> Aug 20 10:05:43 server211 kernel: [82025.282037] [] ? __get_free_pages+0xe/0x40
> Aug 20 10:05:43 server211 kernel: [82025.282040] [] ? copy_mount_options+0x3a/0x180
> Aug 20 10:05:43 server211 kernel: [82025.282043] [] SyS_mount+0x90/0xe0
> Aug 20 10:05:43 server211 kernel: [82025.282048] [] tracesys+0xe1/0xe6
> Aug 20 10:06:01 server211 CRON[803]: (root) CMD ( /opt/bin/tomcat_check.sh)