* [Cluster-devel] [RFC]Drop unused plock resource when no other plock request comes
From: Jiaju Zhang @ 2009-08-23  9:00 UTC
To: cluster-devel.redhat.com

Hello,

Currently, it seems that dlm_controld won't drop an unused plock resource
(one whose timeout has expired) unless a plock request for some other
resource arrives.

I found this issue when running the pingpong test case:
http://junkcode.samba.org/ftp/unpacked/junkcode/ping_pong.c
pingpong just uses fcntl(F_SETLKW) to lock and unlock a file repeatedly to
measure lock performance.

My test case is:
Step 1: start pingpong on node A, then stop it.
Step 2: after a while (more than 10 seconds), start pingpong on node B.

I never ran pingpong concurrently on the two nodes, but the lock performance
on node A is much higher than on node B. A user might be confused about why
the performance on the two nodes is so different.

After some investigation, I found the reason. While pingpong runs on node A,
r->owner == A, so all the plock requests are handled locally. When pingpong
stops on node A, no further plock requests come from node A, so
drop_resources is not triggered even though the timeout has expired. Then
pingpong starts on node B, finds r->owner == A, and the resource eventually
moves to the state r->owner == 0. That is the distributed mode, so the
performance is lower.

It seems drop_resources has no chance to run unless another plock request
(for a different lock resource) comes from node A.
Is there a way to improve this?

Thanks,
Jiaju
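For readers who have not seen the test, the lock/unlock cycle described above can be sketched roughly as follows. This is not ping_pong.c itself: `set_lock` and `lock_unlock_cycles` are hypothetical names, and the sketch only locks byte 0 of one file with fcntl(F_SETLKW).

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Apply one blocking byte-range operation (F_WRLCK or F_UNLCK) to byte 0.
 * Returns 0 on success, -1 on error. */
static int set_lock(int fd, short type)
{
	struct flock fl = {
		.l_type = type,
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 1,
	};

	return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical helper: run up to `iterations` lock/unlock cycles on `path`.
 * Returns the number of completed cycles, or -1 if the file cannot be
 * opened. */
int lock_unlock_cycles(const char *path, int iterations)
{
	int fd, i;

	assert(iterations >= 0);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;

	for (i = 0; i < iterations; i++) {
		if (set_lock(fd, F_WRLCK) < 0 || set_lock(fd, F_UNLCK) < 0)
			break;
	}

	close(fd);
	return i;
}
```

Timing a fixed number of cycles on each node, one node at a time, would be one way to reproduce the comparison from the two steps above.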
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
From: David Teigland @ 2009-08-24 14:31 UTC
To: cluster-devel.redhat.com

On Sun, Aug 23, 2009 at 05:00:44PM +0800, Jiaju Zhang wrote:
> It seems no chance to trigger the drop_resources to work if there is no
> other plock requests (which are requesting for another lock resource) come
> from node A.
> But I still wonder if there is a way to improve this?

You're correct; yes the daemon could easily be changed to drop resources when
there's no locking activity.  We'd just set a poll timeout when
plock_resources is non-empty and call drop_resources() if it times out.

(If you simply want to work around this, you can write a little program to
lock a file on the fs to trigger the drop.)

Dave
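The poll-timeout idea in the reply above could look something like the sketch below. It is only an illustration under stated assumptions, not dlm_controld code: `struct sketch_state`, `sketch_poll_timeout`, `sketch_drop` and `sketch_loop_once` are all hypothetical, with `resource_count` standing in for a non-empty plock_resources list and `drop_time_ms` for cfgd_drop_resources_time.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical stand-in for the daemon state that matters here. */
struct sketch_state {
	int resource_count;	/* > 0 models a non-empty plock_resources */
	int drop_time_ms;	/* models cfgd_drop_resources_time */
	int drops;		/* how many times the drop step ran */
};

/* Stand-in for drop_resources(): here it simply forgets everything. */
static void sketch_drop(struct sketch_state *st)
{
	st->drops++;
	st->resource_count = 0;
}

/* Block indefinitely (-1) when there is nothing to drop; otherwise ask
 * poll() to wake up after the drop interval. */
int sketch_poll_timeout(const struct sketch_state *st)
{
	assert(st != NULL);
	return st->resource_count > 0 ? st->drop_time_ms : -1;
}

/* One main-loop iteration: if poll() times out with no fd activity,
 * run the drop step. Returns poll()'s return value. */
int sketch_loop_once(struct sketch_state *st, struct pollfd *fds, nfds_t nfds)
{
	int timeout = sketch_poll_timeout(st);
	int rv = poll(fds, nfds, timeout);

	if (rv == 0 && timeout >= 0)
		sketch_drop(st);
	return rv;
}
```

The point of the design is that an idle daemon still wakes up once per interval while resources exist, and goes back to blocking indefinitely once they are gone.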
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
From: Jiaju Zhang @ 2009-08-25  3:10 UTC
To: cluster-devel.redhat.com

On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
>
> You're correct; yes the daemon could easily be changed to drop resources
> when there's no locking activity.  We'd just set a poll timeout when
> plock_resources is non-empty and call drop_resources() if it times out.
>
> (If you simply want to work around this, you can write a little program to
> lock a file on the fs to trigger the drop.)
>
> Dave

Many thanks for your guidance :-)
So I wrote a patch to try to fix this. Review and comments are welcome.

Thanks,
Jiaju

diff -Nupr a/group/dlm_controld/dlm_daemon.h b/group/dlm_controld/dlm_daemon.h
--- a/group/dlm_controld/dlm_daemon.h	2009-07-03 14:53:42.000000000 +0800
+++ b/group/dlm_controld/dlm_daemon.h	2009-08-25 10:38:17.000000000 +0800
@@ -300,6 +300,8 @@ void store_plocks(struct lockspace *ls);
 void retrieve_plocks(struct lockspace *ls);
 void purge_plocks(struct lockspace *ls, int nodeid, int unmount);
 int fill_plock_dump_buf(struct lockspace *ls);
+unsigned long time_diff_ms(struct timeval *begin, struct timeval *end);
+int drop_resources(struct lockspace *ls);
 
 /* logging.c */
 
diff -Nupr a/group/dlm_controld/main.c b/group/dlm_controld/main.c
--- a/group/dlm_controld/main.c	2009-07-03 14:53:42.000000000 +0800
+++ b/group/dlm_controld/main.c	2009-08-25 07:59:53.000000000 +0800
@@ -842,6 +842,8 @@ static void loop(void)
 	struct lockspace *ls;
 	int poll_timeout = -1;
 	int rv, i;
+	int need_to_drop = 0;
+	struct timeval now, last_access;
 	void (*workfn) (int ci);
 	void (*deadfn) (int ci);
 
@@ -963,6 +965,30 @@ static void loop(void)
 			}
 			poll_timeout = 1000;
 		}
+
+		if (cfgd_plock_ownership) {
+			gettimeofday(&now, NULL);
+			if (need_to_drop && time_diff_ms(&last_access, &now) >= cfgd_drop_resources_time) {
+				list_for_each_entry(ls, &lockspaces, list) {
+					if (!list_empty(&ls->plock_resources)) {
+						poll_timeout = cfgd_drop_resources_time;
+						ls->drop_resources_last = now;
+						drop_resources(ls);
+					}
+				}
+				need_to_drop = 0;
+				last_access = now;
+			} else {
+				list_for_each_entry(ls, &lockspaces, list) {
+					if (!list_empty(&ls->plock_resources)) {
+						poll_timeout = cfgd_drop_resources_time;
+						need_to_drop = 1;
+						last_access = now;
+						break;
+					}
+				}
+			}
+		}
 		query_unlock();
 	}
  out:
diff -Nupr a/group/dlm_controld/plock.c b/group/dlm_controld/plock.c
--- a/group/dlm_controld/plock.c	2009-07-27 12:04:07.000000000 +0800
+++ b/group/dlm_controld/plock.c	2009-08-25 10:37:43.000000000 +0800
@@ -216,7 +216,7 @@ static uint32_t mg_to_ls_id(uint32_t fsi
 
 /* FIXME: unify these two */
 
-static unsigned long time_diff_ms(struct timeval *begin, struct timeval *end)
+unsigned long time_diff_ms(struct timeval *begin, struct timeval *end)
 {
 	struct timeval result;
 	timersub(end, begin, &result);
@@ -1344,7 +1344,7 @@ void receive_drop(struct lockspace *ls,
 /* FIXME: in the transition from owner = us, to owner = 0, to drop;
    we want the second period to be shorter than the first */
 
-static int drop_resources(struct lockspace *ls)
+int drop_resources(struct lockspace *ls)
 {
 	struct resource *r;
 	struct timeval now;
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
From: David Teigland @ 2009-08-25 15:35 UTC
To: cluster-devel.redhat.com

On Tue, Aug 25, 2009 at 11:10:03AM +0800, Jiaju Zhang wrote:
> On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
> >
> > You're correct; yes the daemon could easily be changed to drop resources
> > when there's no locking activity.  We'd just set a poll timeout when
> > plock_resources is non-empty and call drop_resources() if it times out.
> >
> > (If you simply want to work around this, you can write a little program to
> > lock a file on the fs to trigger the drop.)
> >
> > Dave
>
> Many thanks for your guidance :-)
> So I write a patch to try to fix this. Review and comments are welcome.

Thanks, that looks like it would do the job, but the code is a little
complicated.  Here's a simpler patch, I've not tried it so it may not work :)

Dave

diff --git a/group/dlm_controld/dlm_daemon.h b/group/dlm_controld/dlm_daemon.h
index df2e148..18479d7 100644
--- a/group/dlm_controld/dlm_daemon.h
+++ b/group/dlm_controld/dlm_daemon.h
@@ -73,6 +73,7 @@ extern int poll_fencing;
 extern int poll_quorum;
 extern int poll_fs;
 extern int poll_ignore_plock;
+extern int poll_drop_plock;
 extern int plock_fd;
 extern int plock_ci;
 extern struct list_head lockspaces;
@@ -296,6 +297,7 @@ void process_netlink(int ci);
 int setup_plocks(void);
 void close_plocks(void);
 void process_plocks(int ci);
+void drop_resources_all(void);
 int limit_plocks(void);
 void receive_plock(struct lockspace *ls, struct dlm_header *hd, int len);
 void receive_own(struct lockspace *ls, struct dlm_header *hd, int len);
diff --git a/group/dlm_controld/main.c b/group/dlm_controld/main.c
index 93b40f8..75ee55d 100644
--- a/group/dlm_controld/main.c
+++ b/group/dlm_controld/main.c
@@ -1011,6 +1011,13 @@ static void loop(void)
 			}
 			poll_timeout = 1000;
 		}
+
+		if (poll_drop_plock) {
+			drop_resources_all();
+			if (poll_drop_plock)
+				poll_timeout = 1000;
+		}
+
 		query_unlock();
 	}
  out:
@@ -1310,6 +1317,7 @@ int poll_fencing;
 int poll_quorum;
 int poll_fs;
 int poll_ignore_plock;
+int poll_drop_plock;
 int plock_fd;
 int plock_ci;
 struct list_head lockspaces;
diff --git a/group/dlm_controld/plock.c b/group/dlm_controld/plock.c
index 3d4431e..197b15c 100644
--- a/group/dlm_controld/plock.c
+++ b/group/dlm_controld/plock.c
@@ -1351,8 +1351,20 @@ static int drop_resources(struct lockspace *ls)
 	struct timeval now;
 	int count = 0;
 
+	if (!cfgd_plock_ownership)
+		return 0;
+
+	if (list_empty(&ls->plock_resources))
+		return 0;
+
 	gettimeofday(&now, NULL);
 
+	if (time_diff_ms(&ls->drop_resources_last, &now) <
+	    cfgd_drop_resources_time)
+		return 1;
+
+	ls->drop_resources_last = now;
+
 	/* try to drop the oldest, unused resources */
 
 	list_for_each_entry_reverse(r, &ls->plock_resources, list) {
@@ -1376,7 +1388,21 @@ static int drop_resources(struct lockspace *ls)
 		}
 	}
 
-	return 0;
+	return 1;
+}
+
+void drop_resources_all(void)
+{
+	struct lockspace *ls;
+	int rv = 0;
+
+	poll_drop_plock = 0;
+
+	list_for_each_entry(ls, &lockspaces, list) {
+		rv = drop_resources(ls);
+		if (rv)
+			poll_drop_plock = 1;
+	}
 }
 
 int limit_plocks(void)
@@ -1495,13 +1521,8 @@ void process_plocks(int ci)
 		save_pending_plock(ls, r, &info);
 	}
 
-	if (cfgd_plock_ownership &&
-	    time_diff_ms(&ls->drop_resources_last, &now) >=
-	    cfgd_drop_resources_time) {
-		ls->drop_resources_last = now;
-		drop_resources(ls);
-	}
-
+	if (cfgd_plock_ownership && !list_empty(&ls->plock_resources))
+		poll_drop_plock = 1;
 	return;
 
  fail:
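The return convention that the patch above gives drop_resources() (0 when there is nothing to watch, 1 when the main loop should keep polling) can be tried in isolation with a sketch like the one below. Everything here is hypothetical and simplified: `struct sketch_ls` replaces struct lockspace, the list walk is reduced to a counter, `now` is passed in rather than read with gettimeofday() so the logic is easy to test, and `sketch_time_diff_ms` does the subtraction by hand instead of using the timersub() macro seen in the real time_diff_ms().

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hand-written millisecond difference; assumes *end is not before *begin. */
unsigned long sketch_time_diff_ms(const struct timeval *begin,
				  const struct timeval *end)
{
	long sec = (long)(end->tv_sec - begin->tv_sec);
	long usec = (long)(end->tv_usec - begin->tv_usec);

	if (usec < 0) {		/* borrow one second */
		sec--;
		usec += 1000000L;
	}
	return (unsigned long)sec * 1000UL + (unsigned long)usec / 1000UL;
}

/* Hypothetical, reduced stand-in for struct lockspace. */
struct sketch_ls {
	int nresources;			/* > 0 models a non-empty plock_resources */
	struct timeval drop_last;	/* models ls->drop_resources_last */
	int drops;			/* how many drop passes actually ran */
};

/* Returns 0 when the lockspace has nothing to drop (no need to keep
 * polling for it), 1 when it still has resources, whether or not the
 * interval has elapsed yet. */
int sketch_drop_resources(struct sketch_ls *ls, const struct timeval *now,
			  unsigned long interval_ms)
{
	assert(ls != NULL && now != NULL);

	if (ls->nresources == 0)
		return 0;

	if (sketch_time_diff_ms(&ls->drop_last, now) < interval_ms)
		return 1;	/* too early, ask to be called again */

	ls->drop_last = *now;
	ls->drops++;		/* the real code walks the list here */
	return 1;
}
```

A caller in the style of drop_resources_all() would OR these return values across all lockspaces to decide whether poll_drop_plock stays set.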
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
From: Jiaju Zhang @ 2009-08-26  6:54 UTC
To: cluster-devel.redhat.com

On Tue, Aug 25, 2009 at 11:35 PM, David Teigland <teigland@redhat.com> wrote:
> On Tue, Aug 25, 2009 at 11:10:03AM +0800, Jiaju Zhang wrote:
> > On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
> > >
> > > You're correct; yes the daemon could easily be changed to drop resources
> > > when there's no locking activity.  We'd just set a poll timeout when
> > > plock_resources is non-empty and call drop_resources() if it times out.
> > >
> > > (If you simply want to work around this, you can write a little program
> > > to lock a file on the fs to trigger the drop.)
> > >
> > > Dave
> >
> > Many thanks for your guidance :-)
> > So I write a patch to try to fix this. Review and comments are welcome.
>
> Thanks, that looks like it would do the job, but the code is a little
> complicated.  Here's a simpler patch, I've not tried it so it may not work :)

The code is so elegant and it works fine :)

Thanks,
Jiaju