* [Cluster-devel] [RFC]Drop unused plock resource when no other plock request comes
@ 2009-08-23 9:00 Jiaju Zhang
2009-08-24 14:31 ` [Cluster-devel] " David Teigland
0 siblings, 1 reply; 5+ messages in thread
From: Jiaju Zhang @ 2009-08-23 9:00 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hello,
Currently, it seems that dlm_controld won't drop an unused plock resource
(one that has exceeded the timeout) unless another plock request for some
other resource arrives. I found this issue when running the pingpong test case:
http://junkcode.samba.org/ftp/unpacked/junkcode/ping_pong.c
Pingpong just uses fcntl(F_SETLKW) to lock/unlock a file repeatedly to
measure lock performance. My test case is:
Step 1: start pingpong on node A, then stop it.
Step 2: after a while (more than 10 seconds), start pingpong on node B.
I never ran pingpong concurrently on the two nodes, but the lock
performance on node A is much higher than on node B.
A user might be confused about why the performance on the two nodes is
so different.
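
For readers who have not looked at ping_pong.c, here is a minimal sketch of the kind of loop it is built around: take a blocking byte-range lock with fcntl(F_SETLKW), release it, and repeat. This is not the real program, which is more elaborate; the function names and the single-byte range are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Set (F_WRLCK) or clear (F_UNLCK) a one-byte POSIX lock at `offset`,
 * blocking until the request is granted. Returns 0 on success, -1 on error. */
static int set_byte_lock(int fd, short type, off_t offset)
{
	struct flock fl = {0};

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = offset;
	fl.l_len = 1;
	return fcntl(fd, F_SETLKW, &fl);
}

/* Hypothetical helper: lock and unlock byte 0 of `path` `iterations` times.
 * Returns the number of completed cycles, or -1 on any failure. */
long lock_unlock_cycles(const char *path, long iterations)
{
	long done;
	int fd;

	assert(path != NULL && iterations >= 0);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;

	for (done = 0; done < iterations; done++) {
		if (set_byte_lock(fd, F_WRLCK, 0) != 0 ||
		    set_byte_lock(fd, F_UNLCK, 0) != 0) {
			close(fd);
			return -1;
		}
	}

	close(fd);
	return done;
}
```

Timing a call such as lock_unlock_cycles("/mnt/gfs/testfile", 100000) on each node (the path is only an example) gives a rough locks-per-second figure to compare.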
After some investigation, I found the reason: while pingpong runs on
node A, r->owner == A, so all the plock requests are handled locally. When
pingpong stops on node A, no further plock requests come from node A,
so drop_resources is never triggered even though the timeout has been exceeded.
Then pingpong starts on node B, finds r->owner == A, and the resource
eventually ends up in the state r->owner == 0. This is the distributed mode,
so the performance is lower.
There seems to be no way to trigger drop_resources unless another plock
request (for a different lock resource) comes from node A.
Is there a way to improve this?
Thanks,
Jiaju
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
2009-08-23 9:00 [Cluster-devel] [RFC]Drop unused plock resource when no other plock request comes Jiaju Zhang
@ 2009-08-24 14:31 ` David Teigland
2009-08-25 3:10 ` Jiaju Zhang
0 siblings, 1 reply; 5+ messages in thread
From: David Teigland @ 2009-08-24 14:31 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Sun, Aug 23, 2009 at 05:00:44PM +0800, Jiaju Zhang wrote:
> There seems to be no way to trigger drop_resources unless another plock
> request (for a different lock resource) comes from node A.
> Is there a way to improve this?
You're correct; yes the daemon could easily be changed to drop resources when
there's no locking activity. We'd just set a poll timeout when
plock_resources is non-empty and call drop_resources() if it times out.
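
A minimal sketch of that idea, assuming a poll()-based main loop: use a finite timeout only while there is something that might need dropping, and treat a timeout with no activity as the cue to call the drop routine. The function names and parameters here are hypothetical and are not part of dlm_controld.

```c
#include <assert.h>
#include <poll.h>

/* Pick the poll() timeout: block indefinitely (-1) when there are no
 * plock resources, otherwise wake up after `drop_interval_ms`. */
int choose_poll_timeout(int have_plock_resources, int drop_interval_ms)
{
	assert(drop_interval_ms >= 0);
	return have_plock_resources ? drop_interval_ms : -1;
}

/* Wait for activity on `fds`.
 * Returns 1 if poll() timed out with no activity (the caller would run
 * its drop routine), 0 if a descriptor is ready, -1 on error. */
int wait_for_activity(struct pollfd *fds, nfds_t nfds,
		      int have_plock_resources, int drop_interval_ms)
{
	int rv = poll(fds, nfds,
		      choose_poll_timeout(have_plock_resources,
					  drop_interval_ms));

	if (rv < 0)
		return -1;
	return rv == 0 ? 1 : 0;
}
```

The point of the -1 case is that an idle daemon with nothing to drop does not wake up at all.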
(If you simply want to work around this, you can write a little program to
lock a file on the fs to trigger the drop.)
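
Such a workaround program might look like the sketch below. It assumes that any plock request on the same filesystem is enough to trigger the drop, and presumably it has to run on the node that still owns the stale resource; the function name poke_plock and the choice of a non-blocking one-byte lock are illustrative only.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: issue one plock request against `path`.
 * Returns 0 if the lock was taken (and then released), -1 otherwise. */
int poke_plock(const char *path)
{
	struct flock fl;
	int fd, rv;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;

	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 1;

	/* F_SETLK does not block; it fails if another process holds the lock. */
	rv = fcntl(fd, F_SETLK, &fl);

	/* Closing the descriptor releases this process's locks on the file. */
	close(fd);
	return rv == 0 ? 0 : -1;
}
```

Calling it periodically (from cron, say) on a scratch file that nothing else uses keeps the workaround away from real application locks.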
Dave
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
2009-08-24 14:31 ` [Cluster-devel] " David Teigland
@ 2009-08-25 3:10 ` Jiaju Zhang
2009-08-25 15:35 ` David Teigland
0 siblings, 1 reply; 5+ messages in thread
From: Jiaju Zhang @ 2009-08-25 3:10 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
>
> You're correct; yes the daemon could easily be changed to drop resources
> when there's no locking activity. We'd just set a poll timeout when
> plock_resources is non-empty and call drop_resources() if it times out.
>
> (If you simply want to work around this, you can write a little program to
> lock a file on the fs to trigger the drop.)
>
> Dave
Many thanks for your guidance :-)
So I wrote a patch to try to fix this. Review and comments are welcome.
Thanks,
Jiaju
diff -Nupr a/group/dlm_controld/dlm_daemon.h b/group/dlm_controld/dlm_daemon.h
--- a/group/dlm_controld/dlm_daemon.h 2009-07-03 14:53:42.000000000 +0800
+++ b/group/dlm_controld/dlm_daemon.h 2009-08-25 10:38:17.000000000 +0800
@@ -300,6 +300,8 @@ void store_plocks(struct lockspace *ls);
void retrieve_plocks(struct lockspace *ls);
void purge_plocks(struct lockspace *ls, int nodeid, int unmount);
int fill_plock_dump_buf(struct lockspace *ls);
+unsigned long time_diff_ms(struct timeval *begin, struct timeval *end);
+int drop_resources(struct lockspace *ls);
/* logging.c */
diff -Nupr a/group/dlm_controld/main.c b/group/dlm_controld/main.c
--- a/group/dlm_controld/main.c 2009-07-03 14:53:42.000000000 +0800
+++ b/group/dlm_controld/main.c 2009-08-25 07:59:53.000000000 +0800
@@ -842,6 +842,8 @@ static void loop(void)
struct lockspace *ls;
int poll_timeout = -1;
int rv, i;
+ int need_to_drop = 0;
+ struct timeval now, last_access;
void (*workfn) (int ci);
void (*deadfn) (int ci);
@@ -963,6 +965,30 @@ static void loop(void)
 			}
 			poll_timeout = 1000;
 		}
+
+		if (cfgd_plock_ownership) {
+			gettimeofday(&now, NULL);
+			if (need_to_drop && time_diff_ms(&last_access, &now) >= cfgd_drop_resources_time) {
+				list_for_each_entry(ls, &lockspaces, list) {
+					if (!list_empty(&ls->plock_resources)) {
+						poll_timeout = cfgd_drop_resources_time;
+						ls->drop_resources_last = now;
+						drop_resources(ls);
+					}
+				}
+				need_to_drop = 0;
+				last_access = now;
+			} else {
+				list_for_each_entry(ls, &lockspaces, list) {
+					if (!list_empty(&ls->plock_resources)) {
+						poll_timeout = cfgd_drop_resources_time;
+						need_to_drop = 1;
+						last_access = now;
+						break;
+					}
+				}
+			}
+		}
 		query_unlock();
 	}
 out:
diff -Nupr a/group/dlm_controld/plock.c b/group/dlm_controld/plock.c
--- a/group/dlm_controld/plock.c 2009-07-27 12:04:07.000000000 +0800
+++ b/group/dlm_controld/plock.c 2009-08-25 10:37:43.000000000 +0800
@@ -216,7 +216,7 @@ static uint32_t mg_to_ls_id(uint32_t fsi
/* FIXME: unify these two */
-static unsigned long time_diff_ms(struct timeval *begin, struct timeval *end)
+unsigned long time_diff_ms(struct timeval *begin, struct timeval *end)
{
struct timeval result;
timersub(end, begin, &result);
@@ -1344,7 +1344,7 @@ void receive_drop(struct lockspace *ls,
/* FIXME: in the transition from owner = us, to owner = 0, to drop;
we want the second period to be shorter than the first */
-static int drop_resources(struct lockspace *ls)
+int drop_resources(struct lockspace *ls)
{
struct resource *r;
struct timeval now;
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
2009-08-25 3:10 ` Jiaju Zhang
@ 2009-08-25 15:35 ` David Teigland
2009-08-26 6:54 ` Jiaju Zhang
0 siblings, 1 reply; 5+ messages in thread
From: David Teigland @ 2009-08-25 15:35 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Tue, Aug 25, 2009 at 11:10:03AM +0800, Jiaju Zhang wrote:
> On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
> >
> > You're correct; yes the daemon could easily be changed to drop resources
> > when there's no locking activity. We'd just set a poll timeout when
> > plock_resources is non-empty and call drop_resources() if it times out.
> >
> > (If you simply want to work around this, you can write a little program to
> > lock a file on the fs to trigger the drop.)
> >
> > Dave
>
> Many thanks for your guidance :-)
> So I wrote a patch to try to fix this. Review and comments are welcome.
Thanks, that looks like it would do the job, but the code is a little
complicated. Here's a simpler patch; I've not tried it, so it may not work :)
Dave
diff --git a/group/dlm_controld/dlm_daemon.h b/group/dlm_controld/dlm_daemon.h
index df2e148..18479d7 100644
--- a/group/dlm_controld/dlm_daemon.h
+++ b/group/dlm_controld/dlm_daemon.h
@@ -73,6 +73,7 @@ extern int poll_fencing;
extern int poll_quorum;
extern int poll_fs;
extern int poll_ignore_plock;
+extern int poll_drop_plock;
extern int plock_fd;
extern int plock_ci;
extern struct list_head lockspaces;
@@ -296,6 +297,7 @@ void process_netlink(int ci);
int setup_plocks(void);
void close_plocks(void);
void process_plocks(int ci);
+void drop_resources_all(void);
int limit_plocks(void);
void receive_plock(struct lockspace *ls, struct dlm_header *hd, int len);
void receive_own(struct lockspace *ls, struct dlm_header *hd, int len);
diff --git a/group/dlm_controld/main.c b/group/dlm_controld/main.c
index 93b40f8..75ee55d 100644
--- a/group/dlm_controld/main.c
+++ b/group/dlm_controld/main.c
@@ -1011,6 +1011,13 @@ static void loop(void)
 			}
 			poll_timeout = 1000;
 		}
+
+		if (poll_drop_plock) {
+			drop_resources_all();
+			if (poll_drop_plock)
+				poll_timeout = 1000;
+		}
+
 		query_unlock();
 	}
 out:
@@ -1310,6 +1317,7 @@ int poll_fencing;
int poll_quorum;
int poll_fs;
int poll_ignore_plock;
+int poll_drop_plock;
int plock_fd;
int plock_ci;
struct list_head lockspaces;
diff --git a/group/dlm_controld/plock.c b/group/dlm_controld/plock.c
index 3d4431e..197b15c 100644
--- a/group/dlm_controld/plock.c
+++ b/group/dlm_controld/plock.c
@@ -1351,8 +1351,20 @@ static int drop_resources(struct lockspace *ls)
 	struct timeval now;
 	int count = 0;
+	if (!cfgd_plock_ownership)
+		return 0;
+
+	if (list_empty(&ls->plock_resources))
+		return 0;
+
 	gettimeofday(&now, NULL);
+	if (time_diff_ms(&ls->drop_resources_last, &now) <
+	    cfgd_drop_resources_time)
+		return 1;
+
+	ls->drop_resources_last = now;
+
 	/* try to drop the oldest, unused resources */
 	list_for_each_entry_reverse(r, &ls->plock_resources, list) {
@@ -1376,7 +1388,21 @@ static int drop_resources(struct lockspace *ls)
 		}
 	}
-	return 0;
+	return 1;
+}
+
+void drop_resources_all(void)
+{
+	struct lockspace *ls;
+	int rv = 0;
+
+	poll_drop_plock = 0;
+
+	list_for_each_entry(ls, &lockspaces, list) {
+		rv = drop_resources(ls);
+		if (rv)
+			poll_drop_plock = 1;
+	}
 }
 int limit_plocks(void)
@@ -1495,13 +1521,8 @@ void process_plocks(int ci)
 		save_pending_plock(ls, r, &info);
 	}
-	if (cfgd_plock_ownership &&
-	    time_diff_ms(&ls->drop_resources_last, &now) >=
-	    cfgd_drop_resources_time) {
-		ls->drop_resources_last = now;
-		drop_resources(ls);
-	}
-
+	if (cfgd_plock_ownership && !list_empty(&ls->plock_resources))
+		poll_drop_plock = 1;
 	return;
 fail:
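
The control flow of this patch can be tried out in isolation with a toy model. Everything below is hypothetical (struct toy_lockspace, integer millisecond clocks, a resource counter in place of the plock_resources list), and the toy drop simply forgets every resource, whereas the real drop_resources() only drops old, unused ones and does so via messages to the other nodes.

```c
#include <assert.h>
#include <stddef.h>

struct toy_lockspace {
	int resources;			/* stand-in for the plock_resources list */
	unsigned long last_drop_ms;	/* stand-in for drop_resources_last */
};

/* Returns 1 if this lockspace still needs periodic checks, 0 if it has
 * nothing left to drop. Drops at most once per `interval_ms`. */
static int toy_drop_resources(struct toy_lockspace *ls,
			      unsigned long now_ms, unsigned long interval_ms)
{
	if (ls->resources == 0)
		return 0;
	if (now_ms - ls->last_drop_ms < interval_ms)
		return 1;		/* too soon, check again later */

	ls->last_drop_ms = now_ms;
	ls->resources = 0;		/* toy: pretend everything was droppable */
	return 1;
}

/* Returns the new value of the "keep polling" flag (poll_drop_plock in
 * the patch): 1 while any lockspace still has resources to look at. */
int toy_drop_resources_all(struct toy_lockspace *spaces, size_t n,
			   unsigned long now_ms, unsigned long interval_ms)
{
	int keep_polling = 0;
	size_t i;

	assert(spaces != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (toy_drop_resources(&spaces[i], now_ms, interval_ms))
			keep_polling = 1;
	}
	return keep_polling;
}
```

With a 10000 ms interval, a lockspace holding resources keeps the flag set through the pass that performs the drop; the following pass finds nothing left and clears it, at which point the main loop can go back to an untimed poll.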
* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
2009-08-25 15:35 ` David Teigland
@ 2009-08-26 6:54 ` Jiaju Zhang
0 siblings, 0 replies; 5+ messages in thread
From: Jiaju Zhang @ 2009-08-26 6:54 UTC (permalink / raw)
To: cluster-devel.redhat.com
On Tue, Aug 25, 2009 at 11:35 PM, David Teigland <teigland@redhat.com> wrote:
> On Tue, Aug 25, 2009 at 11:10:03AM +0800, Jiaju Zhang wrote:
> > On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
> > >
> > > You're correct; yes the daemon could easily be changed to drop resources
> > > when there's no locking activity. We'd just set a poll timeout when
> > > plock_resources is non-empty and call drop_resources() if it times out.
> > >
> > > (If you simply want to work around this, you can write a little program
> > > to lock a file on the fs to trigger the drop.)
> > >
> > > Dave
> >
> > Many thanks for your guidance :-)
> > So I wrote a patch to try to fix this. Review and comments are welcome.
>
> Thanks, that looks like it would do the job, but the code is a little
> complicated. Here's a simpler patch; I've not tried it, so it may not work :)
>
The code is so elegant and it works fine :)
Thanks,
Jiaju