From: Jie Liu <jeff.liu@oracle.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] RFC: OCFS2 heartbeat improvements
Date: Wed, 22 Aug 2012 22:17:59 +0800
Message-ID: <5034EA17.107@oracle.com>
Hi All,
Lately I have been investigating unexpected OCFS2 node reboots seen in
some real-world deployments.
The problem occurs when the network becomes unreliable, when the disk
IO load is too high, and so on.
I suspect it is caused by ocfs2 fencing: either the heartbeat's BIO
reads/writes cannot be scheduled and processed quickly enough, or
something similar happens in the network IO heartbeat thread.
I am now trying to reproduce the problem locally. In the meantime, I'd
like to run a few rough ideas for improving the disk IO heartbeat past
you, to see whether they sound reasonable.
First, if an OCFS2 node is suffering from heavy disk IO, how about
changing the heartbeat bio read/write so that the request cannot be
preempted by other requests? For example, o2hb_issue_node_write()
currently submits its bio with plain WRITE: 'submit_bio(WRITE, bio)'.
If we change the flag to WRITE_SYNC, or even combine the request with
REQ_FUA, the heartbeat write might get the highest priority among disk
IO requests.
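To make the flag choices concrete, here is a small user-space sketch
(not kernel code). It assumes WRITE_SYNC is composed as WRITE plus
REQ_SYNC and REQ_NOIDLE, as I believe it is in current kernels; the bit
values and the o2hb_write_flags() helper are hypothetical and exist
only for illustration. REQ_FUA is a forced-unit-access hint, so whether
it buys any scheduling priority is something that would need testing.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real ones live in kernel headers. */
#define WRITE       (1u << 0)
#define REQ_SYNC    (1u << 1)
#define REQ_NOIDLE  (1u << 2)
#define REQ_FUA     (1u << 3)

/* Assumed composition of the synchronous write flag. */
#define WRITE_SYNC  (WRITE | REQ_SYNC | REQ_NOIDLE)

/*
 * Hypothetical helper: pick the rw flags a heartbeat write would be
 * submitted with. sync=false, fua=false is today's plain WRITE.
 */
static unsigned int o2hb_write_flags(bool sync, bool fua)
{
	unsigned int rw = sync ? WRITE_SYNC : WRITE;

	if (fua)
		rw |= REQ_FUA;	/* ask for write-through to stable media */
	return rw;
}
```

With such a helper, the proposed change would amount to passing
o2hb_write_flags(true, false) or o2hb_write_flags(true, true) where
plain WRITE is passed today.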
Second, the comment on bio allocation in o2hb_setup_one_bio()
indicates that we could pre-allocate the bios instead of allocating
one each time, but I have not seen any code in the kernel that does
this. :(
How about creating a private bio set for each o2hb_region, so that we
can allocate from it? That may be faster than allocating from the
global bio set. Also, would it make sense to create a memory pool for
each o2hb_region, so that we have contiguous pages to bind to those
bios?
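As a rough user-space model of the per-region pool idea (again not
kernel code): each region reserves a fixed number of buffers from one
contiguous allocation at setup time, and the heartbeat path only takes
from and returns to that private free list. All names here (hb_buf,
hb_region_pool, HB_POOL_SIZE) are hypothetical; in the kernel the
building blocks would presumably be a private bio_set and a mempool.

```c
#include <stddef.h>
#include <stdlib.h>

#define HB_POOL_SIZE 4		/* hypothetical per-region reserve */

struct hb_buf {			/* stands in for a bio plus its page */
	struct hb_buf *next;
	char data[512];
};

struct hb_region_pool {		/* stands in for per-o2hb_region state */
	struct hb_buf *slab;	/* one contiguous allocation */
	struct hb_buf *free;	/* free list threaded through the slab */
	size_t nr_free;
};

/* Reserve every buffer up front, at region setup time. */
static int hb_pool_init(struct hb_region_pool *p)
{
	size_t i;

	p->slab = calloc(HB_POOL_SIZE, sizeof(*p->slab));
	if (!p->slab)
		return -1;
	p->free = NULL;
	p->nr_free = 0;
	for (i = 0; i < HB_POOL_SIZE; i++) {
		p->slab[i].next = p->free;
		p->free = &p->slab[i];
		p->nr_free++;
	}
	return 0;
}

/* Hot path: no allocator call; NULL means the reserve is exhausted. */
static struct hb_buf *hb_pool_get(struct hb_region_pool *p)
{
	struct hb_buf *b = p->free;

	if (b) {
		p->free = b->next;
		b->next = NULL;
		p->nr_free--;
	}
	return b;
}

/* Return a buffer to the region's private free list. */
static void hb_pool_put(struct hb_region_pool *p, struct hb_buf *b)
{
	b->next = p->free;
	p->free = b;
	p->nr_free++;
}

static void hb_pool_destroy(struct hb_region_pool *p)
{
	free(p->slab);
	p->slab = NULL;
	p->free = NULL;
	p->nr_free = 0;
}
```

The point of the sketch is that the heartbeat path never competes with
other users for memory, since its buffers are set aside per region
before the node comes under load.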
Any comments are appreciated!
Thanks,
-Jeff