* [Ocfs2-devel] [SUGGESSTION 1/1] OCFS2: runtime tunable network idle timeout
From: wengang wang @ 2009-06-08 5:36 UTC
To: ocfs2-devel
background:
There is a network idle timeout after which a node is considered dead or a network partition is assumed to have occurred.
problem:
In some production environments, a backup runs over the private network at a fixed time each day. While the backup is running, the load on the network is very high. This can trigger the ocfs2 network idle timeout, and if the connection cannot be re-established in time, some nodes are fenced out of the cluster domain, which is not what we want.
There is a configuration setting, O2CB_IDLE_TIMEOUT_MS, that sets the timeout value. But it looks like it takes effect only when the o2cb service is restarted, so it cannot be changed on an already running system.
suggestion:
It would be better if the timeout value could be modified at runtime. We could add a proc file under /proc/fs/ocfs2_nodemanager, for example idle_timeout, so that a userspace application (such as debugfs.ocfs2) can read and write the timeout value. Before the customer runs the backup, they set it to a large value (or to no limit), and set it back when the backup has finished.
The content of /proc/fs/ocfs2_nodemanager/idle_timeout is the timeout value in ms. 0 means no limit.
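To make the intended usage concrete, here is a minimal sketch of how a backup wrapper might use such a file. It assumes the proposed /proc/fs/ocfs2_nodemanager/idle_timeout file exists and holds a plain integer in ms; that file is only a proposal in this mail, and the helper names below are made up for illustration.

```python
import contextlib
from pathlib import Path

# Hypothetical path: this file is only proposed in this mail, it does not exist today.
IDLE_TIMEOUT_PATH = Path("/proc/fs/ocfs2_nodemanager/idle_timeout")


def read_timeout_ms(path=IDLE_TIMEOUT_PATH):
    """Return the current idle timeout in milliseconds (0 would mean no limit)."""
    return int(Path(path).read_text().strip())


def write_timeout_ms(value_ms, path=IDLE_TIMEOUT_PATH):
    """Write a new idle timeout in milliseconds; negative values are rejected."""
    if value_ms < 0:
        raise ValueError("timeout must be >= 0 (0 means no limit)")
    Path(path).write_text(f"{value_ms}\n")


@contextlib.contextmanager
def raised_idle_timeout(value_ms, path=IDLE_TIMEOUT_PATH):
    """Set the timeout for the duration of a block, then restore the old value."""
    previous = read_timeout_ms(path)
    write_timeout_ms(value_ms, path)
    try:
        yield previous
    finally:
        # Restore the old value even if the backup fails.
        write_timeout_ms(previous, path)
```

A backup script could then wrap its work as `with raised_idle_timeout(0): run_backup()`, where run_backup stands for whatever the site uses. This only changes the value on the local node.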
If this sounds good, I'm glad to do it.
thanks,
wengang.
* [Ocfs2-devel] [SUGGESSTION 1/1] OCFS2: runtime tunable network idle timeout
From: Sunil Mushran @ 2009-06-08 18:01 UTC
To: ocfs2-devel
wengang wang wrote:
> background:
> There is a network idle timeout after which a node is considered dead or a network partition is assumed to have occurred.
>
> problem:
> In some production environments, a backup runs over the private network at a fixed time each day. While the backup is running, the load on the network is very high. This can trigger the ocfs2 network idle timeout, and if the connection cannot be re-established in time, some nodes are fenced out of the cluster domain, which is not what we want.
Bug#? SR? Have we ruled out a bug in our code? The last time I saw one of these, we determined it was because of a bug.
> There is a configuration setting, O2CB_IDLE_TIMEOUT_MS, that sets the timeout value. But it looks like it takes effect only when the o2cb service is restarted, so it cannot be changed on an already running system.
>
> suggestion:
> It would be better if the timeout value could be modified at runtime. We could add a proc file under /proc/fs/ocfs2_nodemanager, for example idle_timeout, so that a userspace application (such as debugfs.ocfs2) can read and write the timeout value. Before the customer runs the backup, they set it to a large value (or to no limit), and set it back when the backup has finished.
> The content of /proc/fs/ocfs2_nodemanager/idle_timeout is the timeout value in ms. 0 means no limit.
>
> If this sounds good, I'm glad to do it.
One cannot just set this value on one node. It would have to be set atomically on all nodes.
While that can still be done, my issue is as to why one cannot set that timeout up front. Asking clients to "set" timeout dynamically before certain fs operations is not at all friendly. Especially when the user has no idea what workload a certain operation entails.
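To illustrate why a cluster-wide change is harder than a single write, here is a rough sketch of an all-or-nothing update with rollback. It is not truly atomic: nodes briefly disagree while the loop runs, and a rollback can itself fail. The get_ms and set_ms callables are hypothetical and would have to be supplied by the caller; nothing like them exists in o2cb.

```python
def set_timeout_on_all_nodes(nodes, new_ms, get_ms, set_ms):
    """Apply new_ms on every node; undo the nodes already changed if a step fails.

    get_ms(node) and set_ms(node, value) are hypothetical callables provided by
    the caller (for example, wrappers around ssh). Returns True on success.
    """
    previous = {}
    try:
        for node in nodes:
            previous[node] = get_ms(node)  # remember the old value first
            set_ms(node, new_ms)
    except Exception:
        # Best-effort rollback, most recently touched node first.
        for node in reversed(list(previous)):
            try:
                set_ms(node, previous[node])
            except Exception:
                pass  # node unreachable; left for the operator to fix
        return False
    return True
```

Even with a helper like this, the window in which nodes hold different timeouts remains, which is the concern raised above.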
* [Ocfs2-devel] [SUGGESSTION 1/1] OCFS2: runtime tunable network idle timeout
From: Wengang Wang @ 2009-06-09 3:34 UTC
To: ocfs2-devel
Sunil,
Sunil Mushran wrote:
> wengang wang wrote:
>> background:
>> There is a network idle timeout after which a node is considered
>> dead or a network partition is assumed to have occurred.
>> problem:
>> In some production environments, a backup runs over the private
>> network at a fixed time each day. While the backup is running, the
>> load on the network is very high. This can trigger the ocfs2 network
>> idle timeout, and if the connection cannot be re-established in time,
>> some nodes are fenced out of the cluster domain, which is not what we
>> want.
>
> Bug#? SR? Have we ruled out a bug in our code? The last time I saw one
> of these, we determined it was because of a bug.
one of the bugs is:
https://bug.oraclecorp.com/pls/bug/webbug_print.show?c_rptno=8443612
Oh, sorry, I had not considered that it could be caused by a bug. I will
get tcpdumps and analyze it further.
>
>> There is a configuration setting, O2CB_IDLE_TIMEOUT_MS, that sets the
>> timeout value. But it looks like it takes effect only when the o2cb
>> service is restarted, so it cannot be changed on an already running
>> system.
>>
>> suggestion:
>> It would be better if the timeout value could be modified at runtime.
>> We could add a proc file under /proc/fs/ocfs2_nodemanager, for
>> example idle_timeout, so that a userspace application (such as
>> debugfs.ocfs2) can read and write the timeout value. Before the
>> customer runs the backup, they set it to a large value (or to no
>> limit), and set it back when the backup has finished.
>> The content of /proc/fs/ocfs2_nodemanager/idle_timeout is the timeout
>> value in ms. 0 means no limit.
>>
>> If this sounds good, I'm glad to do it.
>
> One cannot just set this value on one node. It would have to be set
> atomically on all nodes.
>
Yes, I know that.
> While that can still be done, my issue is as to why one cannot set that
> timeout up front. Asking clients to "set" timeout dynamically before
> certain fs operations is not at all friendly. Especially when the user
> has no idea what workload a certain operation entails.
If the timeout is set too large, I think it will slow the response to a
real timeout (a true node death or network partition) under normal
network load. For a production environment, that is not good.
And yes, it is difficult for clients to recognize a high network load
unless they have a very good administrator. That's a problem.
OK, then let's set this aside for now and bring it up again once we
understand the problem clearly.
thanks
wengang.
--
--just begin to learn, you are never too late...
* [Ocfs2-devel] [SUGGESSTION 1/1] OCFS2: runtime tunable network idle timeout
From: Sunil Mushran @ 2009-06-09 21:12 UTC
To: ocfs2-devel
This is the same "dlm hash too small" issue seen in 1.2. It has been addressed in 1.4.
I suggest the client upgrade to 1.4.
Wengang Wang wrote:
> Sunil,
>
> Sunil Mushran wrote:
>> wengang wang wrote:
>>> background:
>>> There is a network idle timeout after which a node is considered
>>> dead or a network partition is assumed to have occurred.
>>> problem:
>>> In some production environments, a backup runs over the private
>>> network at a fixed time each day. While the backup is running, the
>>> load on the network is very high. This can trigger the ocfs2 network
>>> idle timeout, and if the connection cannot be re-established in
>>> time, some nodes are fenced out of the cluster domain, which is not
>>> what we want.
>>
>> Bug#? SR? Have we ruled out a bug in our code? The last time I saw
>> one of these, we determined it was because of a bug.
>
> one of the bugs is:
> https://bug.oraclecorp.com/pls/bug/webbug_print.show?c_rptno=8443612
>
> Oh, sorry, I had not considered that it could be caused by a bug. I will
> get tcpdumps and analyze it further.
>
>>
>>> There is a configuration setting, O2CB_IDLE_TIMEOUT_MS, that sets
>>> the timeout value. But it looks like it takes effect only when the
>>> o2cb service is restarted, so it cannot be changed on an already
>>> running system.
>>>
>>> suggestion:
>>> It would be better if the timeout value could be modified at
>>> runtime. We could add a proc file under /proc/fs/ocfs2_nodemanager,
>>> for example idle_timeout, so that a userspace application (such as
>>> debugfs.ocfs2) can read and write the timeout value. Before the
>>> customer runs the backup, they set it to a large value (or to no
>>> limit), and set it back when the backup has finished.
>>> The content of /proc/fs/ocfs2_nodemanager/idle_timeout is the
>>> timeout value in ms. 0 means no limit.
>>>
>>> If this sounds good, I'm glad to do it.
>>
>> One cannot just set this value on one node. It would have to be set
>> atomically on all nodes.
>>
>
> Yes, I know that.
>
>> While that can still be done, my issue is as to why one cannot set
>> that timeout up front. Asking clients to "set" timeout dynamically
>> before certain fs operations is not at all friendly. Especially when
>> the user has no idea what workload a certain operation entails.
>
> If the timeout is set too large, I think it will slow the response to a
> real timeout (a true node death or network partition) under normal
> network load. For a production environment, that is not good.
>
> And yes, it is difficult for clients to recognize a high network load
> unless they have a very good administrator. That's a problem.
>
> OK, then let's set this aside for now and bring it up again once we
> understand the problem clearly.
>
> thanks
> wengang.
>