From: Joseph Qi <joseph.qi@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] Cluster blocked, so as to reboot all nodes to avoid it. Is there any patchs for it? Thanks.
Date: Thu, 21 Aug 2014 09:59:42 +0800	[thread overview]
Message-ID: <53F5528E.2030608@huawei.com> (raw)
In-Reply-To: <71604351584F6A4EBAE558C676F37CA42EC6360C@H3CMLB14-EX.srv.huawei-3com.com>

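From the stack, the mount appears to block while loading the journal:
mount.ocfs2 is waiting in __ocfs2_cluster_lock, called from ocfs2_journal_init.
Is that lock already held by another node?
Run 'fs_locks -B' and 'dlm_locks xxx' in debugfs.ocfs2 (with the lock name in place of xxx) to find out why.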
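
To make the blocking path easier to read off a hung-task report, the trace can be reduced to the frames the unwinder is sure about (those without a leading "?"). The following is only a rough sketch: the names FRAME_RE, reliable_frames and SAMPLE are made up for illustration, and the regular expression assumes the "[<address>] function+0xoff/0xsize [module]" layout seen in the quoted log.

```python
import re

# Matches one frame of a kernel call trace, for example:
#   [<ffffffffa057a9f2>] ocfs2_journal_init+0x92/0x480 [ocfs2]
# A leading "?" marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def reliable_frames(log_text):
    """Return (function, module) pairs for frames not marked with '?'.

    module is None for frames that belong to the core kernel.
    """
    frames = []
    for line in log_text.splitlines():
        m = FRAME_RE.search(line)
        if m and not m.group("unsure"):
            frames.append((m.group("func"), m.group("module")))
    return frames


# A few frames copied from the quoted trace (hypothetical sample variable).
SAMPLE = """\
 [<ffffffff81755a77>] wait_for_completion+0xa7/0x160
 [<ffffffff8109c9b0>] ? try_to_wake_up+0x2c0/0x2c0
 [<ffffffffa0564063>] __ocfs2_cluster_lock.isra.30+0x1f3/0x820 [ocfs2]
 [<ffffffffa05658ca>] ocfs2_inode_lock_full_nested+0x13a/0xb80 [ocfs2]
 [<ffffffffa057a9f2>] ocfs2_journal_init+0x92/0x480 [ocfs2]
 [<ffffffffa05bc3f1>] ocfs2_fill_super+0x15a1/0x25a0 [ocfs2]
"""

if __name__ == "__main__":
    for func, module in reliable_frames(SAMPLE):
        print(f"{func} [{module or 'kernel'}]")
```

Read top to bottom, the remaining ocfs2 frames show the mount path (ocfs2_fill_super, then ocfs2_journal_init) ending in a wait inside __ocfs2_cluster_lock.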

On 2014/8/21 9:07, Guozhonghua wrote:
> Hi, everyone
>
> Our cluster has blocked several times, and the log always shows the same trace. We have to reboot all the nodes of the cluster to clear it.
>
> Is there any patch that fixes this bug?
>
> [<ffffffff817539a5>] schedule_timeout+0x1e5/0x250
> [<ffffffff81755a77>] wait_for_completion+0xa7/0x160
> [<ffffffff8109c9b0>] ? try_to_wake_up+0x2c0/0x2c0
> [<ffffffffa0564063>] __ocfs2_cluster_lock.isra.30+0x1f3/0x820 [ocfs2]
>
> When we test with many nodes in one cluster, perhaps ten or twenty, the cluster always blocks. The log is below.
>
> The kernel version is 3.13.6.
>
> Aug 20 10:05:43 server211 kernel: [82025.281828]       Tainted: GF       W  O 3.13.6 #5
> Aug 20 10:05:43 server211 kernel: [82025.281830] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 20 10:05:43 server211 kernel: [82025.281833] mount.ocfs2     D 0000000000000000     0 57890  57889 0x00000000
> Aug 20 10:05:43 server211 kernel: [82025.281838]  ffff880427e03888 0000000000000002 ffff880427e03828 ffffffff8101cba3
> Aug 20 10:05:43 server211 kernel: [82025.281842]  ffff8804270a1810 0000000000014440 ffff880427e03fd8 0000000000014440
> Aug 20 10:05:43 server211 kernel: [82025.281845]  ffff88042958e040 ffff8804270a1810 ffff8804270a1810 ffff880427e03a60
> Aug 20 10:05:43 server211 kernel: [82025.281849] Call Trace:
> Aug 20 10:05:43 server211 kernel: [82025.281862]  [<ffffffff8101cba3>] ? native_sched_clock+0x13/0x80
> Aug 20 10:05:43 server211 kernel: [82025.281867]  [<ffffffff817547d9>] schedule+0x29/0x70
> Aug 20 10:05:43 server211 kernel: [82025.281870]  [<ffffffff817539a5>] schedule_timeout+0x1e5/0x250
> Aug 20 10:05:43 server211 kernel: [82025.281874]  [<ffffffff81755a77>] wait_for_completion+0xa7/0x160
> Aug 20 10:05:43 server211 kernel: [82025.281879]  [<ffffffff8109c9b0>] ? try_to_wake_up+0x2c0/0x2c0
> Aug 20 10:05:43 server211 kernel: [82025.281907]  [<ffffffffa0564063>] __ocfs2_cluster_lock.isra.30+0x1f3/0x820 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281910]  [<ffffffff8175501c>] ? out_of_line_wait_on_bit+0x7c/0x90
> Aug 20 10:05:43 server211 kernel: [82025.281922]  [<ffffffffa0562493>] ? ocfs2_inode_lock_res_init+0x73/0x160 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281934]  [<ffffffffa05658ca>] ocfs2_inode_lock_full_nested+0x13a/0xb80 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281958]  [<ffffffffa0576571>] ? ocfs2_iget+0x121/0x7d0 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281971]  [<ffffffffa057a9f2>] ocfs2_journal_init+0x92/0x480 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281986]  [<ffffffffa05bc3f1>] ocfs2_fill_super+0x15a1/0x25a0 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.281992]  [<ffffffff81394e49>] ? vsnprintf+0x309/0x600
> Aug 20 10:05:43 server211 kernel: [82025.281998]  [<ffffffff811c4c99>] mount_bdev+0x1b9/0x200
> Aug 20 10:05:43 server211 kernel: [82025.282011]  [<ffffffffa05bae50>] ? ocfs2_initialize_super.isra.208+0x1470/0x1470 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.282022]  [<ffffffffa05adbe5>] ocfs2_mount+0x15/0x20 [ocfs2]
> Aug 20 10:05:43 server211 kernel: [82025.282025]  [<ffffffff811c58c3>] mount_fs+0x43/0x1b0
> Aug 20 10:05:43 server211 kernel: [82025.282029]  [<ffffffff811e0ab6>] vfs_kern_mount+0x76/0x130
> Aug 20 10:05:43 server211 kernel: [82025.282032]  [<ffffffff811e2d47>] do_mount+0x237/0xa90
> Aug 20 10:05:43 server211 kernel: [82025.282037]  [<ffffffff8115800e>] ? __get_free_pages+0xe/0x40
> Aug 20 10:05:43 server211 kernel: [82025.282040]  [<ffffffff811e297a>] ? copy_mount_options+0x3a/0x180
> Aug 20 10:05:43 server211 kernel: [82025.282043]  [<ffffffff811e3920>] SyS_mount+0x90/0xe0
> Aug 20 10:05:43 server211 kernel: [82025.282048]  [<ffffffff81760fbf>] tracesys+0xe1/0xe6
> Aug 20 10:06:01 server211 CRON[803]: (root) CMD (   /opt/bin/tomcat_check.sh)
>

Thread overview:
2014-08-21  1:07 [Ocfs2-devel] Cluster blocked, so as to reboot all nodes to avoid it. Is there any patchs for it? Thanks Guozhonghua
2014-08-21  1:59 ` Joseph Qi [this message]
2014-08-21  2:31   ` [Ocfs2-devel] Reply: " Guozhonghua
