cluster-devel.redhat.com archive mirror
* [Cluster-devel] [DLM PATCH] DLM: Don't wait for resource library lookups if NOLOOKUP is specified
       [not found] <38741039.1176852.1412183985958.JavaMail.zimbra@redhat.com>
@ 2014-10-01 17:21 ` Bob Peterson
  2014-10-01 18:04   ` Steven Whitehouse
  2014-10-01 18:42   ` David Teigland
  0 siblings, 2 replies; 5+ messages in thread
From: Bob Peterson @ 2014-10-01 17:21 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

This patch adds a new lock flag, DLM_LKF_NOLOOKUP, which instructs DLM
to refrain from sending a resource directory lookup request when the
directory node for the lock is not the current node. It is similar to
the DLM_LKF_NOQUEUE flag, except that it fails, with -EAGAIN, lock
requests that would require a lookup.

This is not just about saving a network operation. It allows callers
like GFS2 to master locks for which they are the directory node. Each
node can then "prefer" local locks, especially in the case of GFS2
selecting resource groups for block allocations (implemented with a
separate patch). This mastering of local locks distributes the locks
between the nodes (at least until nodes enter or leave the cluster),
which tends to make each node "keep to itself" when doing allocations.
Thus, dlm communications are kept to a minimum, which results in
significantly faster block allocations.
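
For illustration only (this is not part of the patch): a caller such as
GFS2's lock_dlm layer might use the new flag together with NOQUEUE. As
with NOQUEUE, a request refused because of NOLOOKUP completes through the
AST with sb_status set to -EAGAIN. The synchronous wrapper below is a
made-up sketch:

#include <linux/completion.h>
#include <linux/dlm.h>

static void sync_ast(void *arg)
{
	complete(arg);		/* wake the waiter below */
}

/* Take an EX lock only if no remote directory lookup would be needed. */
static int lock_if_local(dlm_lockspace_t *ls, struct dlm_lksb *lksb,
			 void *name, unsigned int namelen)
{
	DECLARE_COMPLETION_ONSTACK(done);
	int error;

	error = dlm_lock(ls, DLM_LOCK_EX, lksb,
			 DLM_LKF_NOQUEUE | DLM_LKF_NOLOOKUP,
			 name, namelen, 0, sync_ast, &done, NULL);
	if (error)
		return error;

	wait_for_completion(&done);

	/* 0: granted (lksb->sb_lkid names the lock); -EAGAIN: not local */
	return lksb->sb_status;
}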

Regards,

Bob Peterson
Red Hat File Systems

Signed-off-by: Bob Peterson <rpeterso@redhat.com> 
---
 fs/dlm/lock.c                     | 16 ++++++++++++++--
 include/uapi/linux/dlmconstants.h |  7 +++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
index 83f3d55..f1e5b04 100644
--- a/fs/dlm/lock.c
+++ b/fs/dlm/lock.c
@@ -222,6 +222,11 @@ static inline int can_be_queued(struct dlm_lkb *lkb)
 	return !(lkb->lkb_exflags & DLM_LKF_NOQUEUE);
 }
 
+static inline int can_be_looked_up(struct dlm_lkb *lkb)
+{
+	return !(lkb->lkb_exflags & DLM_LKF_NOLOOKUP);
+}
+
 static inline int force_blocking_asts(struct dlm_lkb *lkb)
 {
 	return (lkb->lkb_exflags & DLM_LKF_NOQUEUEBAST);
@@ -2745,6 +2750,11 @@ static int set_master(struct dlm_rsb *r, struct dlm_lkb *lkb)
 		return 0;
 	}
 
+	if (!can_be_looked_up(lkb)) {
+		queue_cast(r, lkb, -EAGAIN);
+		return -EAGAIN;
+	}
+
 	wait_pending_remove(r);
 
 	r->res_first_lkid = lkb->lkb_id;
@@ -2828,7 +2838,8 @@ static int set_lock_args(int mode, struct dlm_lksb *lksb, uint32_t flags,
 	if (flags & DLM_LKF_CONVDEADLK && !(flags & DLM_LKF_CONVERT))
 		goto out;
 
-	if (flags & DLM_LKF_CONVDEADLK && flags & DLM_LKF_NOQUEUE)
+	if (flags & DLM_LKF_CONVDEADLK && (flags & (DLM_LKF_NOQUEUE |
+						    DLM_LKF_NOLOOKUP)))
 		goto out;
 
 	if (flags & DLM_LKF_EXPEDITE && flags & DLM_LKF_CONVERT)
@@ -2837,7 +2848,8 @@ static int set_lock_args(int mode, struct dlm_lksb *lksb, uint32_t flags,
 	if (flags & DLM_LKF_EXPEDITE && flags & DLM_LKF_QUECVT)
 		goto out;
 
-	if (flags & DLM_LKF_EXPEDITE && flags & DLM_LKF_NOQUEUE)
+	if (flags & DLM_LKF_EXPEDITE && (flags & (DLM_LKF_NOQUEUE |
+						  DLM_LKF_NOLOOKUP)))
 		goto out;
 
 	if (flags & DLM_LKF_EXPEDITE && mode != DLM_LOCK_NL)
diff --git a/include/uapi/linux/dlmconstants.h b/include/uapi/linux/dlmconstants.h
index 47bf08d..4b9ba15 100644
--- a/include/uapi/linux/dlmconstants.h
+++ b/include/uapi/linux/dlmconstants.h
@@ -131,6 +131,12 @@
  * Unlock the lock even if it is converting or waiting or has sublocks.
  * Only really for use by the userland device.c code.
  *
+ * DLM_LKF_NOLOOKUP
+ *
+ * Don't take any network time/bandwidth to do directory owner lookups.
+ * This is a lock for which we only care whether it's completely under
+ * local jurisdiction.
+ *
  */
 
 #define DLM_LKF_NOQUEUE		0x00000001
@@ -152,6 +158,7 @@
 #define DLM_LKF_ALTCW		0x00010000
 #define DLM_LKF_FORCEUNLOCK	0x00020000
 #define DLM_LKF_TIMEOUT		0x00040000
+#define DLM_LKF_NOLOOKUP	0x00080000
 
 /*
  * Some return codes that are not in errno.h




* [Cluster-devel] [DLM PATCH] DLM: Don't wait for resource library lookups if NOLOOKUP is specified
  2014-10-01 17:21 ` [Cluster-devel] [DLM PATCH] DLM: Don't wait for resource library lookups if NOLOOKUP is specified Bob Peterson
@ 2014-10-01 18:04   ` Steven Whitehouse
  2014-10-03 17:24     ` Bob Peterson
  2014-10-01 18:42   ` David Teigland
  1 sibling, 1 reply; 5+ messages in thread
From: Steven Whitehouse @ 2014-10-01 18:04 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 01/10/14 18:21, Bob Peterson wrote:
> Hi,
>
> This patch adds a new lock flag, DLM_LKF_NOLOOKUP, which instructs DLM
> to refrain from sending a resource directory lookup request when the
> directory node for the lock is not the current node. It is similar to
> the DLM_LKF_NOQUEUE flag, except that it fails, with -EAGAIN, lock
> requests that would require a lookup.
>
> This is not just about saving a network operation. It allows callers
> like GFS2 to master locks for which they are the directory node. Each
> node can then "prefer" local locks, especially in the case of GFS2
> selecting resource groups for block allocations (implemented with a
> separate patch). This mastering of local locks distributes the locks
> between the nodes (at least until nodes enter or leave the cluster),
> which tends to make each node "keep to itself" when doing allocations.
> Thus, dlm communications are kept to a minimum, which results in
> significantly faster block allocations.
I think we need to do some more investigation here... how long do the
lookups take? If the issue is just to create a list of preferred rgrps
for each node, then there are various ways in which we might do that.
That is not to say that this isn't a good way to do it, but I think we
should try to understand the timings here first and make sure that we
are solving the right problem.

Steve.

> Regards,
>
> Bob Peterson
> Red Hat File Systems
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/dlm/lock.c                     | 16 ++++++++++++++--
>   include/uapi/linux/dlmconstants.h |  7 +++++++
>   2 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/fs/dlm/lock.c b/fs/dlm/lock.c
> index 83f3d55..f1e5b04 100644
> --- a/fs/dlm/lock.c
> +++ b/fs/dlm/lock.c
> @@ -222,6 +222,11 @@ static inline int can_be_queued(struct dlm_lkb *lkb)
>   	return !(lkb->lkb_exflags & DLM_LKF_NOQUEUE);
>   }
>   
> +static inline int can_be_looked_up(struct dlm_lkb *lkb)
> +{
> +	return !(lkb->lkb_exflags & DLM_LKF_NOLOOKUP);
> +}
> +
>   static inline int force_blocking_asts(struct dlm_lkb *lkb)
>   {
>   	return (lkb->lkb_exflags & DLM_LKF_NOQUEUEBAST);
> @@ -2745,6 +2750,11 @@ static int set_master(struct dlm_rsb *r, struct dlm_lkb *lkb)
>   		return 0;
>   	}
>   
> +	if (!can_be_looked_up(lkb)) {
> +		queue_cast(r, lkb, -EAGAIN);
> +		return -EAGAIN;
> +	}
> +
>   	wait_pending_remove(r);
>   
>   	r->res_first_lkid = lkb->lkb_id;
> @@ -2828,7 +2838,8 @@ static int set_lock_args(int mode, struct dlm_lksb *lksb, uint32_t flags,
>   	if (flags & DLM_LKF_CONVDEADLK && !(flags & DLM_LKF_CONVERT))
>   		goto out;
>   
> -	if (flags & DLM_LKF_CONVDEADLK && flags & DLM_LKF_NOQUEUE)
> +	if (flags & DLM_LKF_CONVDEADLK && (flags & (DLM_LKF_NOQUEUE |
> +						    DLM_LKF_NOLOOKUP)))
>   		goto out;
>   
>   	if (flags & DLM_LKF_EXPEDITE && flags & DLM_LKF_CONVERT)
> @@ -2837,7 +2848,8 @@ static int set_lock_args(int mode, struct dlm_lksb *lksb, uint32_t flags,
>   	if (flags & DLM_LKF_EXPEDITE && flags & DLM_LKF_QUECVT)
>   		goto out;
>   
> -	if (flags & DLM_LKF_EXPEDITE && flags & DLM_LKF_NOQUEUE)
> +	if (flags & DLM_LKF_EXPEDITE && (flags & (DLM_LKF_NOQUEUE |
> +						  DLM_LKF_NOLOOKUP)))
>   		goto out;
>   
>   	if (flags & DLM_LKF_EXPEDITE && mode != DLM_LOCK_NL)
> diff --git a/include/uapi/linux/dlmconstants.h b/include/uapi/linux/dlmconstants.h
> index 47bf08d..4b9ba15 100644
> --- a/include/uapi/linux/dlmconstants.h
> +++ b/include/uapi/linux/dlmconstants.h
> @@ -131,6 +131,12 @@
>    * Unlock the lock even if it is converting or waiting or has sublocks.
>    * Only really for use by the userland device.c code.
>    *
> + * DLM_LKF_NOLOOKUP
> + *
> + * Don't take any network time/bandwidth to do directory owner lookups.
> + * This is a lock for which we only care whether it's completely under
> + * local jurisdiction.
> + *
>    */
>   
>   #define DLM_LKF_NOQUEUE		0x00000001
> @@ -152,6 +158,7 @@
>   #define DLM_LKF_ALTCW		0x00010000
>   #define DLM_LKF_FORCEUNLOCK	0x00020000
>   #define DLM_LKF_TIMEOUT		0x00040000
> +#define DLM_LKF_NOLOOKUP	0x00080000
>   
>   /*
>    * Some return codes that are not in errno.h
>




* [Cluster-devel] [DLM PATCH] DLM: Don't wait for resource library lookups if NOLOOKUP is specified
  2014-10-01 17:21 ` [Cluster-devel] [DLM PATCH] DLM: Don't wait for resource library lookups if NOLOOKUP is specified Bob Peterson
  2014-10-01 18:04   ` Steven Whitehouse
@ 2014-10-01 18:42   ` David Teigland
  2014-10-03 17:28     ` Bob Peterson
  1 sibling, 1 reply; 5+ messages in thread
From: David Teigland @ 2014-10-01 18:42 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Wed, Oct 01, 2014 at 01:21:41PM -0400, Bob Peterson wrote:
> Hi,
> 
> This patch adds a new lock flag, DLM_LKF_NOLOOKUP, which instructs DLM
> to refrain from sending a resource directory lookup request when the
> directory node for the lock is not the current node. It is similar to
> the DLM_LKF_NOQUEUE flag, except that it fails, with -EAGAIN, lock
> requests that would require a lookup.

Can we just use NOQUEUE?  It tells you that there's a lock conflict, which
tells you to move along and try another if you don't want to contend.  If
you cache acquired locks and reuse them, then it doesn't matter if the
master node is remote or local.
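
A sketch of what I mean (illustrative only, not code from any posted
patch): rgrp_trylock() below is a made-up helper that issues a
DLM_LKF_NOQUEUE request for resource group number rg and returns the
completion status (0 if granted, -EAGAIN if another node holds a
conflicting lock).

#include <linux/errno.h>

static int rgrp_trylock(unsigned int rg);	/* hypothetical NOQUEUE request */

/* Walk the resource groups; skip any that another node already holds. */
static int pick_uncontended_rgrp(unsigned int num_rgrps)
{
	unsigned int rg;
	int error;

	for (rg = 0; rg < num_rgrps; rg++) {
		error = rgrp_trylock(rg);
		if (!error)
			return rg;	/* granted: allocate from this rgrp */
		if (error != -EAGAIN)
			return error;	/* real failure */
		/* -EAGAIN: contended, so move along and try the next one */
	}
	return -ENOSPC;		/* every resource group is contended */
}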

If lookups are a problem in general, there is the "nodir" lockspace mode,
which replaces the resource directory lookup system with a static mapping
of resources to master nodes.

> This is not just about saving a network operation. It allows callers
> like GFS2 to master locks for which they are the directory node. Each
> node can then "prefer" local locks, especially in the case of GFS2
> selecting resource groups for block allocations (implemented with a
> separate patch). This mastering of local locks distributes the locks
> between the nodes (at least until nodes enter or leave the cluster),
> which tends to make each node "keep to itself" when doing allocations.
> Thus, dlm communications are kept to a minimum, which results in
> significantly faster block allocations.

Back in 2002 I solved what sounds like the same problem in gfs(1).  It
allowed all nodes to allocate blocks independent of each other, without
constant locking.  You can see the solution here:

https://git.fedorahosted.org/cgit/cluster.git/tree/gfs-kernel/src/gfs/rgrp.c?h=RHEL4





* [Cluster-devel] [DLM PATCH] DLM: Don't wait for resource library lookups if NOLOOKUP is specified
  2014-10-01 18:04   ` Steven Whitehouse
@ 2014-10-03 17:24     ` Bob Peterson
  0 siblings, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2014-10-03 17:24 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> Hi,
> 
> I think we need to do some more investigation here... how long do the
> lookups take? If the issue is just to create a list of preferred rgrps
> for each node, then there are various ways in which we might do that.
> That is not to say that this isn't a good way to do it, but I think we
> should try to understand the timings here first and make sure that we
> are solving the right problem.
> 
> Steve.

Hi,

Contrary to my previous findings (which I think I may have screwed up),
I've done further investigation and found that the DLM lookups aren't the
issue at all. Also, mastering locks on the directory (lookup) node doesn't
seem to make much difference either. So I'm scrapping this DLM patch.

My latest set of patches now includes one that evenly distributes a
preferred set of rgrps to each node. Unlike Dave's original algorithm in
GFS1, this is a round-robin scheme that makes every node "prefer" every
Nth rgrp, where N is the number of nodes. This seems to be at least as
fast as, if not faster than, messing around with DLM.
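
To illustrate the idea (a sketch only, not the actual patch; the 0-based
node index and the names are assumptions), the preference test amounts to
a simple modulus:

#include <linux/types.h>

/*
 * Node k (0-based, out of num_nodes mounters) prefers resource groups
 * k, k + N, k + 2N, ... where N == num_nodes.
 */
static bool rgrp_preferred_by_this_node(unsigned int rgrp_index,
					unsigned int our_node_index,
					unsigned int num_nodes)
{
	return (rgrp_index % num_nodes) == our_node_index;
}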

Hopefully I'll be posting more patches in the near future. I'm currently
running some more tests regarding minimum reservations, and I'll post
patches depending on those results.

Regards,

Bob Peterson
Red Hat File Systems




* [Cluster-devel] [DLM PATCH] DLM: Don't wait for resource library lookups if NOLOOKUP is specified
  2014-10-01 18:42   ` David Teigland
@ 2014-10-03 17:28     ` Bob Peterson
  0 siblings, 0 replies; 5+ messages in thread
From: Bob Peterson @ 2014-10-03 17:28 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
> Can we just use NOQUEUE?  It tells you that there's a lock conflict, which
> tells you to move along and try another if you don't want to contend.  If
> you cache acquired locks and reuse them, then it doesn't matter if the
> master node is remote or local.

The problem with NOQUEUE is that it seems to depend on circumstances.
If each node mounts the file system at a separate time and, as part of that
process, takes a NOQUEUE lock on every rgrp, the locks are all granted.
With this scheme, the rgrps end up evenly divided between the nodes.

It doesn't matter anyway, because I'm scrapping the DLM patch in favor of
a scheme like the one you pointed out in GFS1 below. See my other email for
more details.

> If lookups are a problem in general, there is the "nodir" lockspace mode,
> which replaces the resource directory lookup system with a static mapping
> of resources to master nodes.
> 
(snip)
> Back in 2002 I solved what sounds like the same problem in gfs(1).  It
> allowed all nodes to allocate blocks independent of each other, without
> constant locking.  You can see the solution here:
> 
> https://git.fedorahosted.org/cgit/cluster.git/tree/gfs-kernel/src/gfs/rgrp.c?h=RHEL4

Regards,

Bob Peterson
Red Hat File Systems



