cluster-devel.redhat.com archive mirror
* [Cluster-devel] [RFC]Drop unused plock resource when no other plock request comes
@ 2009-08-23  9:00 Jiaju Zhang
  2009-08-24 14:31 ` [Cluster-devel] " David Teigland
  0 siblings, 1 reply; 5+ messages in thread
From: Jiaju Zhang @ 2009-08-23  9:00 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hello,

Currently, it seems that dlm_controld won't drop an unused plock resource
(one that has exceeded the timeout) as long as no plock request for another
resource comes in. I found this issue while running the ping_pong test case
http://junkcode.samba.org/ftp/unpacked/junkcode/ping_pong.c
ping_pong simply uses fcntl(F_SETLKW) to lock and unlock a file repeatedly
to measure lock performance (a simplified sketch of such a loop is included
below). My test case is:
Step 1: start ping_pong on node A, then stop it.
Step 2: after a while (more than 10 seconds), start ping_pong on node B.
I did not run ping_pong concurrently on the two nodes, yet the lock
performance on node A is much higher than on node B. A user might well be
confused about why the performance on the two nodes differs so much.
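
For reference, here is a minimal, untested sketch of the kind of loop
ping_pong runs (the real test is more involved and bounces byte-range locks
between processes on different nodes; the file argument and iteration count
here are arbitrary placeholders):

/* Repeatedly take and release an exclusive fcntl lock on the first byte of
 * a file with F_SETLKW; on a cluster fs this generates the plock traffic
 * that dlm_controld handles. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct flock fl;
    int fd, i;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <file on cluster fs>\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_CREAT | O_RDWR, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    for (i = 0; i < 100000; i++) {
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;            /* lock byte 0 */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 1;
        if (fcntl(fd, F_SETLKW, &fl) < 0) {
            perror("lock");
            return 1;
        }
        fl.l_type = F_UNLCK;            /* unlock byte 0 */
        if (fcntl(fd, F_SETLKW, &fl) < 0) {
            perror("unlock");
            return 1;
        }
    }
    close(fd);
    return 0;
}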

After some investigation, I found the reason: while ping_pong runs on
node A, r->owner == A, so all the plock requests are handled locally. Once
ping_pong stops on node A, no further plock requests come from node A, so
drop_resources won't be triggered even though the resource has exceeded the
timeout. When ping_pong is then run on node B, it finds r->owner == A, and
the resource eventually ends up in the r->owner == 0 state. That is the
distributed mode, so the performance drops.

So there seems to be no way to trigger drop_resources if no other plock
requests (for a different lock resource) come from node A.
Is there a way to improve this?

Thanks,
Jiaju


* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
  2009-08-23  9:00 [Cluster-devel] [RFC]Drop unused plock resource when no other plock request comes Jiaju Zhang
@ 2009-08-24 14:31 ` David Teigland
  2009-08-25  3:10   ` Jiaju Zhang
  0 siblings, 1 reply; 5+ messages in thread
From: David Teigland @ 2009-08-24 14:31 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Sun, Aug 23, 2009 at 05:00:44PM +0800, Jiaju Zhang wrote:
> So there seems to be no way to trigger drop_resources if no other plock
> requests (for a different lock resource) come from node A.
> Is there a way to improve this?

You're correct; yes the daemon could easily be changed to drop resources when
there's no locking activity.  We'd just set a poll timeout when
plock_resources is non-empty and call drop_resources() if it times out.

(If you simply want to work around this, you can write a little program to
lock a file on the fs to trigger the drop.)
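
A minimal, untested sketch of such a trigger program (run it on the node that
still holds the cached plock resources; the file name is arbitrary, it just
needs to live on the mounted cluster fs):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct flock fl;
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <file in cluster fs>\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_CREAT | O_RDWR, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;     /* l_start = 0, l_len = 0: whole file */
    fcntl(fd, F_SETLKW, &fl);   /* take one plock... */
    fl.l_type = F_UNLCK;
    fcntl(fd, F_SETLK, &fl);    /* ...and release it again */
    close(fd);
    return 0;
}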

Dave




* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
  2009-08-24 14:31 ` [Cluster-devel] " David Teigland
@ 2009-08-25  3:10   ` Jiaju Zhang
  2009-08-25 15:35     ` David Teigland
  0 siblings, 1 reply; 5+ messages in thread
From: Jiaju Zhang @ 2009-08-25  3:10 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:

>
> You're correct; yes the daemon could easily be changed to drop resources
> when there's no locking activity.  We'd just set a poll timeout when
> plock_resources is non-empty and call drop_resources() if it times out.
>
> (If you simply want to work around this, you can write a little program to
> lock a file on the fs to trigger the drop.)
>
> Dave


Many thanks for your guidance :-)
So I wrote a patch to try to fix this. Review and comments are welcome.

Thanks,
Jiaju


diff -Nupr a/group/dlm_controld/dlm_daemon.h b/group/dlm_controld/dlm_daemon.h
--- a/group/dlm_controld/dlm_daemon.h    2009-07-03 14:53:42.000000000 +0800
+++ b/group/dlm_controld/dlm_daemon.h    2009-08-25 10:38:17.000000000 +0800
@@ -300,6 +300,8 @@ void store_plocks(struct lockspace *ls);
 void retrieve_plocks(struct lockspace *ls);
 void purge_plocks(struct lockspace *ls, int nodeid, int unmount);
 int fill_plock_dump_buf(struct lockspace *ls);
+unsigned long time_diff_ms(struct timeval *begin, struct timeval *end);
+int drop_resources(struct lockspace *ls);

 /* logging.c */

diff -Nupr a/group/dlm_controld/main.c b/group/dlm_controld/main.c
--- a/group/dlm_controld/main.c    2009-07-03 14:53:42.000000000 +0800
+++ b/group/dlm_controld/main.c    2009-08-25 07:59:53.000000000 +0800
@@ -842,6 +842,8 @@ static void loop(void)
     struct lockspace *ls;
     int poll_timeout = -1;
     int rv, i;
+    int need_to_drop = 0;
+    struct timeval now, last_access;
     void (*workfn) (int ci);
     void (*deadfn) (int ci);

@@ -963,6 +965,30 @@ static void loop(void)
             }
             poll_timeout = 1000;
         }
+
+        if (cfgd_plock_ownership) {
+            gettimeofday(&now, NULL);
+            if (need_to_drop && time_diff_ms(&last_access, &now) >= cfgd_drop_resources_time) {
+                list_for_each_entry(ls, &lockspaces, list) {
+                    if (!list_empty(&ls->plock_resources)) {
+                        poll_timeout = cfgd_drop_resources_time;
+                        ls->drop_resources_last = now;
+                        drop_resources(ls);
+                    }
+                }
+                need_to_drop = 0;
+                last_access = now;
+            } else {
+                list_for_each_entry(ls, &lockspaces, list) {
+                    if (!list_empty(&ls->plock_resources)) {
+                        poll_timeout = cfgd_drop_resources_time;
+                        need_to_drop = 1;
+                        last_access = now;
+                        break;
+                    }
+                }
+            }
+        }
         query_unlock();
     }
  out:
diff -Nupr a/group/dlm_controld/plock.c b/group/dlm_controld/plock.c
--- a/group/dlm_controld/plock.c    2009-07-27 12:04:07.000000000 +0800
+++ b/group/dlm_controld/plock.c    2009-08-25 10:37:43.000000000 +0800
@@ -216,7 +216,7 @@ static uint32_t mg_to_ls_id(uint32_t fsi

 /* FIXME: unify these two */

-static unsigned long time_diff_ms(struct timeval *begin, struct timeval *end)
+unsigned long time_diff_ms(struct timeval *begin, struct timeval *end)
 {
     struct timeval result;
     timersub(end, begin, &result);
@@ -1344,7 +1344,7 @@ void receive_drop(struct lockspace *ls,
 /* FIXME: in the transition from owner = us, to owner = 0, to drop;
    we want the second period to be shorter than the first */

-static int drop_resources(struct lockspace *ls)
+int drop_resources(struct lockspace *ls)
 {
     struct resource *r;
     struct timeval now;


* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
  2009-08-25  3:10   ` Jiaju Zhang
@ 2009-08-25 15:35     ` David Teigland
  2009-08-26  6:54       ` Jiaju Zhang
  0 siblings, 1 reply; 5+ messages in thread
From: David Teigland @ 2009-08-25 15:35 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Aug 25, 2009 at 11:10:03AM +0800, Jiaju Zhang wrote:
> On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
> 
> >
> > You're correct; yes the daemon could easily be changed to drop resources
> > when there's no locking activity.  We'd just set a poll timeout when
> > plock_resources is non-empty and call drop_resources() if it times out.
> >
> > (If you simply want to work around this, you can write a little program to
> > lock a file on the fs to trigger the drop.)
> >
> > Dave
> 
> 
> Many thanks for your guidance :-)
> So I wrote a patch to try to fix this. Review and comments are welcome.

Thanks, that looks like it would do the job, but the code is a little
complicated.  Here's a simpler patch, I've not tried it so it may not work :)

Dave


diff --git a/group/dlm_controld/dlm_daemon.h b/group/dlm_controld/dlm_daemon.h
index df2e148..18479d7 100644
--- a/group/dlm_controld/dlm_daemon.h
+++ b/group/dlm_controld/dlm_daemon.h
@@ -73,6 +73,7 @@ extern int poll_fencing;
 extern int poll_quorum;
 extern int poll_fs;
 extern int poll_ignore_plock;
+extern int poll_drop_plock;
 extern int plock_fd;
 extern int plock_ci;
 extern struct list_head lockspaces;
@@ -296,6 +297,7 @@ void process_netlink(int ci);
 int setup_plocks(void);
 void close_plocks(void);
 void process_plocks(int ci);
+void drop_resources_all(void);
 int limit_plocks(void);
 void receive_plock(struct lockspace *ls, struct dlm_header *hd, int len);
 void receive_own(struct lockspace *ls, struct dlm_header *hd, int len);
diff --git a/group/dlm_controld/main.c b/group/dlm_controld/main.c
index 93b40f8..75ee55d 100644
--- a/group/dlm_controld/main.c
+++ b/group/dlm_controld/main.c
@@ -1011,6 +1011,13 @@ static void loop(void)
 			}
 			poll_timeout = 1000;
 		}
+
+		if (poll_drop_plock) {
+			drop_resources_all();
+			if (poll_drop_plock)
+				poll_timeout = 1000;
+		}
+
 		query_unlock();
 	}
  out:
@@ -1310,6 +1317,7 @@ int poll_fencing;
 int poll_quorum;
 int poll_fs;
 int poll_ignore_plock;
+int poll_drop_plock;
 int plock_fd;
 int plock_ci;
 struct list_head lockspaces;
diff --git a/group/dlm_controld/plock.c b/group/dlm_controld/plock.c
index 3d4431e..197b15c 100644
--- a/group/dlm_controld/plock.c
+++ b/group/dlm_controld/plock.c
@@ -1351,8 +1351,20 @@ static int drop_resources(struct lockspace *ls)
 	struct timeval now;
 	int count = 0;
 
+	if (!cfgd_plock_ownership)
+		return 0;
+
+	if (list_empty(&ls->plock_resources))
+		return 0;
+
 	gettimeofday(&now, NULL);
 
+	if (time_diff_ms(&ls->drop_resources_last, &now) <
+	    		 cfgd_drop_resources_time)
+		return 1;
+
+	ls->drop_resources_last = now;
+
 	/* try to drop the oldest, unused resources */
 
 	list_for_each_entry_reverse(r, &ls->plock_resources, list) {
@@ -1376,7 +1388,21 @@ static int drop_resources(struct lockspace *ls)
 		}
 	}
 
-	return 0;
+	return 1;
+}
+
+void drop_resources_all(void)
+{
+	struct lockspace *ls;
+	int rv = 0;
+
+	poll_drop_plock = 0;
+
+	list_for_each_entry(ls, &lockspaces, list) {
+		rv = drop_resources(ls);
+		if (rv)
+			poll_drop_plock = 1;
+	}
 }
 
 int limit_plocks(void)
@@ -1495,13 +1521,8 @@ void process_plocks(int ci)
 		save_pending_plock(ls, r, &info);
 	}
 
-	if (cfgd_plock_ownership &&
-	    time_diff_ms(&ls->drop_resources_last, &now) >=
-	    		 cfgd_drop_resources_time) {
-		ls->drop_resources_last = now;
-		drop_resources(ls);
-	}
-
+	if (cfgd_plock_ownership && !list_empty(&ls->plock_resources))
+		poll_drop_plock = 1;
 	return;
 
  fail:




* [Cluster-devel] Re: [RFC]Drop unused plock resource when no other plock request comes
  2009-08-25 15:35     ` David Teigland
@ 2009-08-26  6:54       ` Jiaju Zhang
  0 siblings, 0 replies; 5+ messages in thread
From: Jiaju Zhang @ 2009-08-26  6:54 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Aug 25, 2009 at 11:35 PM, David Teigland <teigland@redhat.com> wrote:

> On Tue, Aug 25, 2009 at 11:10:03AM +0800, Jiaju Zhang wrote:
> > On Mon, Aug 24, 2009 at 10:31 PM, David Teigland <teigland@redhat.com> wrote:
> >
> > >
> > > You're correct; yes the daemon could easily be changed to drop resources
> > > when there's no locking activity.  We'd just set a poll timeout when
> > > plock_resources is non-empty and call drop_resources() if it times out.
> > >
> > > (If you simply want to work around this, you can write a little program
> > > to lock a file on the fs to trigger the drop.)
> > >
> > > Dave
> >
> >
> > Many thanks for your guidance :-)
> > So I wrote a patch to try to fix this. Review and comments are welcome.
>
> Thanks, that looks like it would do the job, but the code is a little
> complicated.  Here's a simpler patch, I've not tried it so it may not work :)
>

The code is so elegant and it works fine :)

Thanks,
Jiaju

