[Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns

cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed

* [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns
       [not found] <ec4f8e73-e24b-4594-991c-90245824cda7@zmail12.collab.prod.int.phx2.redhat.com>
@ 2012-04-05 16:11 ` Bob Peterson
  2012-04-10  9:12   ` Steven Whitehouse
  0 siblings, 1 reply; 6+ messages in thread
From: Bob Peterson @ 2012-04-05 16:11 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

Here's another patch (explanation below). This patch replies upon
a DLM patch that hasn't fully gone upstream yet, so perhaps it
shouldn't be added to the nmw tree until it is. This greatly
improves the performance of gfs2_grow in a clustered gfs2 file system.

Regards,

Bob Peterson
Red Hat File Systems

Signed-off-by: Bob Peterson <rpeterso@redhat.com> 
---
Author: Bob Peterson <rpeterso@redhat.com>
Date:   Wed Apr 4 15:14:51 2012 -0500

    GFS2: Instruct DLM to avoid queue convert slowdowns
    
    This patch instructs DLM to prevent an "in place" conversion, where the
    lock just stays on the granted queue, and instead forces the conversion to
    the back of the convert queue. This is done on upward conversions only.
    
    This is useful in cases where, for example, a lock is frequently needed in
    PR on one node, but another node needs it temporarily in EX to update it.
    This may happen, for example, when the rindex is being updated by gfs2_grow.
    The gfs2_grow needs to have the lock in EX, but the other nodes need to
    re-read it to retrieve the updates. The glock is already granted in PR on
    the non-growing nodes, so this prevents them from continually re-granting
    the lock in PR, and forces the EX from gfs2_grow to go through.

diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
index bd5af03..43466d6 100644
--- a/fs/gfs2/lock_dlm.c
+++ b/fs/gfs2/lock_dlm.c
@@ -16,6 +16,25 @@
 #include "glock.h"
 #include "util.h"
 
+/*
+ * Borrowed from dlm's lock.c: We need to only pass the QUECVT bit with
+ * requests that are compatible, so we use dlm's check:
+ *
+ * Compatibility matrix for conversions with QUECVT set.
+ * Granted mode is the row; requested mode is the column.
+ * Usage: matrix[grmode+1][rqmode+1]
+ */
+static const int quecvt_compat_matrix[8][8] = {
+      /* UN NL CR CW PR PW EX PD */
+        {0, 0, 0, 0, 0, 0, 0, 0},       /* UN */
+        {0, 0, 1, 1, 1, 1, 1, 0},       /* NL */
+        {0, 0, 0, 1, 1, 1, 1, 0},       /* CR */
+        {0, 0, 0, 0, 1, 1, 1, 0},       /* CW */
+        {0, 0, 0, 1, 0, 1, 1, 0},       /* PR */
+        {0, 0, 0, 0, 0, 0, 1, 0},       /* PW */
+        {0, 0, 0, 0, 0, 0, 0, 0},       /* EX */
+        {0, 0, 0, 0, 0, 0, 0, 0}        /* PD */
+};
 
 static void gdlm_ast(void *arg)
 {
@@ -104,10 +123,13 @@ static int make_mode(const unsigned int lmstate)
 	return -1;
 }
 
-static u32 make_flags(const u32 lkid, const unsigned int gfs_flags,
+static u32 make_flags(struct gfs2_glock *gl, const unsigned int gfs_flags,
 		      const int req)
 {
 	u32 lkf = 0;
+	u32 lkid = gl->gl_lksb.sb_lkid;
+	int is_upconvert =
+		quecvt_compat_matrix[make_mode(gl->gl_state) + 1][req + 1];
 
 	if (gfs_flags & LM_FLAG_TRY)
 		lkf |= DLM_LKF_NOQUEUE;
@@ -131,8 +153,11 @@ static u32 make_flags(const u32 lkid, const unsigned int gfs_flags,
 			BUG();
 	}
 
-	if (lkid != 0) 
+	if (lkid != 0) {
 		lkf |= DLM_LKF_CONVERT;
+		if (is_upconvert)
+			lkf |= DLM_LKF_QUECVT;
+	}
 
 	lkf |= DLM_LKF_VALBLK;
 
@@ -149,7 +174,7 @@ static unsigned int gdlm_lock(struct gfs2_glock *gl,
 
 	gl->gl_req = req_state;
 	req = make_mode(req_state);
-	lkf = make_flags(gl->gl_lksb.sb_lkid, flags, req);
+	lkf = make_flags(gl, flags, req);
 
 	/*
 	 * Submit the actual lock request.



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns
  2012-04-05 16:11 ` [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns Bob Peterson
@ 2012-04-10  9:12   ` Steven Whitehouse
  2012-04-10 14:01     ` Bob Peterson
  2012-04-10 14:08     ` David Teigland
  0 siblings, 2 replies; 6+ messages in thread
From: Steven Whitehouse @ 2012-04-10  9:12 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On Thu, 2012-04-05 at 12:11 -0400, Bob Peterson wrote:
> Hi,
> 
> Here's another patch (explanation below). This patch replies upon
> a DLM patch that hasn't fully gone upstream yet, so perhaps it
> shouldn't be added to the nmw tree until it is. This greatly
> improves the performance of gfs2_grow in a clustered gfs2 file system.
> 
> Regards,
> 
I'm still not very keen on dragging in this bit of dlm. If it is really
needed, then we should use the copy in the dlm itself and not add our
own copy of it. Also, why can we not use the GLF_BLOCKING flag for this
purpose? Is there some issue with try locks in that case?

When you say that this relies upon this dlm patch, what does that mean?
What are the consequences of having this patch but not the dlm one? I'm
wondering whether I should hold off on this at least until the dlm one
has been finalised and applied.

Aside from those points though, this is a really big step forward and
should make a measurable difference to performance across the board when
contention between multiple nodes occurs. So I'm very happy to see such
significant progress in this area,

Steve.


> Bob Peterson
> Red Hat File Systems
> 
> Signed-off-by: Bob Peterson <rpeterso@redhat.com> 
> ---
> Author: Bob Peterson <rpeterso@redhat.com>
> Date:   Wed Apr 4 15:14:51 2012 -0500
> 
>     GFS2: Instruct DLM to avoid queue convert slowdowns
>     
>     This patch instructs DLM to prevent an "in place" conversion, where the
>     lock just stays on the granted queue, and instead forces the conversion to
>     the back of the convert queue. This is done on upward conversions only.
>     
>     This is useful in cases where, for example, a lock is frequently needed in
>     PR on one node, but another node needs it temporarily in EX to update it.
>     This may happen, for example, when the rindex is being updated by gfs2_grow.
>     The gfs2_grow needs to have the lock in EX, but the other nodes need to
>     re-read it to retrieve the updates. The glock is already granted in PR on
>     the non-growing nodes, so this prevents them from continually re-granting
>     the lock in PR, and forces the EX from gfs2_grow to go through.
> 
> diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
> index bd5af03..43466d6 100644
> --- a/fs/gfs2/lock_dlm.c
> +++ b/fs/gfs2/lock_dlm.c
> @@ -16,6 +16,25 @@
>  #include "glock.h"
>  #include "util.h"
>  
> +/*
> + * Borrowed from dlm's lock.c: We need to only pass the QUECVT bit with
> + * requests that are compatible, so we use dlm's check:
> + *
> + * Compatibility matrix for conversions with QUECVT set.
> + * Granted mode is the row; requested mode is the column.
> + * Usage: matrix[grmode+1][rqmode+1]
> + */
> +static const int quecvt_compat_matrix[8][8] = {
> +      /* UN NL CR CW PR PW EX PD */
> +        {0, 0, 0, 0, 0, 0, 0, 0},       /* UN */
> +        {0, 0, 1, 1, 1, 1, 1, 0},       /* NL */
> +        {0, 0, 0, 1, 1, 1, 1, 0},       /* CR */
> +        {0, 0, 0, 0, 1, 1, 1, 0},       /* CW */
> +        {0, 0, 0, 1, 0, 1, 1, 0},       /* PR */
> +        {0, 0, 0, 0, 0, 0, 1, 0},       /* PW */
> +        {0, 0, 0, 0, 0, 0, 0, 0},       /* EX */
> +        {0, 0, 0, 0, 0, 0, 0, 0}        /* PD */
> +};
>  
>  static void gdlm_ast(void *arg)
>  {
> @@ -104,10 +123,13 @@ static int make_mode(const unsigned int lmstate)
>  	return -1;
>  }
>  
> -static u32 make_flags(const u32 lkid, const unsigned int gfs_flags,
> +static u32 make_flags(struct gfs2_glock *gl, const unsigned int gfs_flags,
>  		      const int req)
>  {
>  	u32 lkf = 0;
> +	u32 lkid = gl->gl_lksb.sb_lkid;
> +	int is_upconvert =
> +		quecvt_compat_matrix[make_mode(gl->gl_state) + 1][req + 1];
>  
>  	if (gfs_flags & LM_FLAG_TRY)
>  		lkf |= DLM_LKF_NOQUEUE;
> @@ -131,8 +153,11 @@ static u32 make_flags(const u32 lkid, const unsigned int gfs_flags,
>  			BUG();
>  	}
>  
> -	if (lkid != 0) 
> +	if (lkid != 0) {
>  		lkf |= DLM_LKF_CONVERT;
> +		if (is_upconvert)
> +			lkf |= DLM_LKF_QUECVT;
> +	}
>  
>  	lkf |= DLM_LKF_VALBLK;
>  
> @@ -149,7 +174,7 @@ static unsigned int gdlm_lock(struct gfs2_glock *gl,
>  
>  	gl->gl_req = req_state;
>  	req = make_mode(req_state);
> -	lkf = make_flags(gl->gl_lksb.sb_lkid, flags, req);
> +	lkf = make_flags(gl, flags, req);
>  
>  	/*
>  	 * Submit the actual lock request.
> 




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns
  2012-04-10  9:12   ` Steven Whitehouse
@ 2012-04-10 14:01     ` Bob Peterson
  2012-04-10 14:49       ` Steven Whitehouse
  2012-04-10 14:08     ` David Teigland
  1 sibling, 1 reply; 6+ messages in thread
From: Bob Peterson @ 2012-04-10 14:01 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| Hi,
| 
| On Thu, 2012-04-05 at 12:11 -0400, Bob Peterson wrote:
| > Hi,
| > 
| > Here's another patch (explanation below). This patch replies upon
| > a DLM patch that hasn't fully gone upstream yet, so perhaps it
| > shouldn't be added to the nmw tree until it is. This greatly
| > improves the performance of gfs2_grow in a clustered gfs2 file
| > system.
| > 
| > Regards,
| > 
| I'm still not very keen on dragging in this bit of dlm. If it is
| really
| needed, then we should use the copy in the dlm itself and not add our
| own copy of it. Also, why can we not use the GLF_BLOCKING flag for
| this
| purpose? Is there some issue with try locks in that case?

To answer your questions:
(1) Yes, this bit is really needed.
(2) We cannot use the actual dlm flags because they're part of
    dlm/lock.c and not in a header. I can request they be moved
    to a header file, but that's at the whim of upstream dlm.
(3) We can't use GLF_BLOCKING because it's a different issue.
    The issue is this: Unless instructed otherwise, dlm will
    grant GFS2 (or any caller) locks as it sees fit, and when there
    is a hotly contested file (such as rindex, once it's been
    updated by gfs2_grow and needed immediately by all nodes in
    a cluster) dlm often finds it more convenient to grant the lock
    to a process on the current owner node rather than bouncing the
    lock to a different node.

    That can result in one node "hogging" the lock for long periods of
    time. That means a gfs2_grow can take 30 minutes to complete
    rather than 0.3 seconds. To use a farming analogy: your animals
    are starving because the hogs at the trough are eagerly fighting
    over the morsel at the bottom, and not giving the farmer enough
    of a break to put their food in. To avoid this fighting behavior
    dlm has the DLM_LKF_QUECVT bit which basically says lock upgrades
    (the farmer) over the convenience of keeping the lock on the same
    node (the hogs). Employing this bit does wonders for gfs2_grow.
    However, we can only use the bit on dlm lock upgrades. If we
    use it in other cases, dlm goes through an error path that
    results in a hang.

| When you say that this relies upon this dlm patch, what does that
| mean?

(4) It means that there's a DLM bug whereby using the DLM_LKF_QUECVT
    bit causes DLM (and therefore GFS2) to hang in certain situations.
    There is a patch to fix this bad DLM behavior but it hasn't gone
    upstream yet.

| What are the consequences of having this patch but not the dlm one?

(5) The consequences of having this patch but not the DLM one are
    disastrous: gfs2 is likely to hang.

| I'm
| wondering whether I should hold off on this at least until the dlm
| one
| has been finalised and applied.

(6) Yes, that's what we need to do.

| Aside from those points though, this is a really big step forward and
| should make a measurable difference to performance across the board
| when
| contention between multiple nodes occurs. So I'm very happy to see
| such
| significant progress in this area,
| 
| Steve.
| 
| 
| > Bob Peterson
| > Red Hat File Systems
| > 
| > Signed-off-by: Bob Peterson <rpeterso@redhat.com>
| > ---
| > Author: Bob Peterson <rpeterso@redhat.com>
| > Date:   Wed Apr 4 15:14:51 2012 -0500
| > 
| >     GFS2: Instruct DLM to avoid queue convert slowdowns
| >     
| >     This patch instructs DLM to prevent an "in place" conversion,
| >     where the
| >     lock just stays on the granted queue, and instead forces the
| >     conversion to
| >     the back of the convert queue. This is done on upward
| >     conversions only.
| >     
| >     This is useful in cases where, for example, a lock is
| >     frequently needed in
| >     PR on one node, but another node needs it temporarily in EX to
| >     update it.
| >     This may happen, for example, when the rindex is being updated
| >     by gfs2_grow.
| >     The gfs2_grow needs to have the lock in EX, but the other nodes
| >     need to
| >     re-read it to retrieve the updates. The glock is already
| >     granted in PR on
| >     the non-growing nodes, so this prevents them from continually
| >     re-granting
| >     the lock in PR, and forces the EX from gfs2_grow to go through.
| > 
| > diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
| > index bd5af03..43466d6 100644
| > --- a/fs/gfs2/lock_dlm.c
| > +++ b/fs/gfs2/lock_dlm.c
| > @@ -16,6 +16,25 @@
| >  #include "glock.h"
| >  #include "util.h"
| >  
| > +/*
| > + * Borrowed from dlm's lock.c: We need to only pass the QUECVT bit
| > with
| > + * requests that are compatible, so we use dlm's check:
| > + *
| > + * Compatibility matrix for conversions with QUECVT set.
| > + * Granted mode is the row; requested mode is the column.
| > + * Usage: matrix[grmode+1][rqmode+1]
| > + */
| > +static const int quecvt_compat_matrix[8][8] = {
| > +      /* UN NL CR CW PR PW EX PD */
| > +        {0, 0, 0, 0, 0, 0, 0, 0},       /* UN */
| > +        {0, 0, 1, 1, 1, 1, 1, 0},       /* NL */
| > +        {0, 0, 0, 1, 1, 1, 1, 0},       /* CR */
| > +        {0, 0, 0, 0, 1, 1, 1, 0},       /* CW */
| > +        {0, 0, 0, 1, 0, 1, 1, 0},       /* PR */
| > +        {0, 0, 0, 0, 0, 0, 1, 0},       /* PW */
| > +        {0, 0, 0, 0, 0, 0, 0, 0},       /* EX */
| > +        {0, 0, 0, 0, 0, 0, 0, 0}        /* PD */
| > +};
| >  
| >  static void gdlm_ast(void *arg)
| >  {
| > @@ -104,10 +123,13 @@ static int make_mode(const unsigned int
| > lmstate)
| >  	return -1;
| >  }
| >  
| > -static u32 make_flags(const u32 lkid, const unsigned int
| > gfs_flags,
| > +static u32 make_flags(struct gfs2_glock *gl, const unsigned int
| > gfs_flags,
| >  		      const int req)
| >  {
| >  	u32 lkf = 0;
| > +	u32 lkid = gl->gl_lksb.sb_lkid;
| > +	int is_upconvert =
| > +		quecvt_compat_matrix[make_mode(gl->gl_state) + 1][req + 1];
| >  
| >  	if (gfs_flags & LM_FLAG_TRY)
| >  		lkf |= DLM_LKF_NOQUEUE;
| > @@ -131,8 +153,11 @@ static u32 make_flags(const u32 lkid, const
| > unsigned int gfs_flags,
| >  			BUG();
| >  	}
| >  
| > -	if (lkid != 0)
| > +	if (lkid != 0) {
| >  		lkf |= DLM_LKF_CONVERT;
| > +		if (is_upconvert)
| > +			lkf |= DLM_LKF_QUECVT;
| > +	}
| >  
| >  	lkf |= DLM_LKF_VALBLK;
| >  
| > @@ -149,7 +174,7 @@ static unsigned int gdlm_lock(struct gfs2_glock
| > *gl,
| >  
| >  	gl->gl_req = req_state;
| >  	req = make_mode(req_state);
| > -	lkf = make_flags(gl->gl_lksb.sb_lkid, flags, req);
| > +	lkf = make_flags(gl, flags, req);
| >  
| >  	/*
| >  	 * Submit the actual lock request.
| > 
| 
| 
| 



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns
  2012-04-10  9:12   ` Steven Whitehouse
  2012-04-10 14:01     ` Bob Peterson
@ 2012-04-10 14:08     ` David Teigland
  2012-04-10 14:46       ` Steven Whitehouse
  1 sibling, 1 reply; 6+ messages in thread
From: David Teigland @ 2012-04-10 14:08 UTC (permalink / raw)
  To: cluster-devel.redhat.com

On Tue, Apr 10, 2012 at 10:12:28AM +0100, Steven Whitehouse wrote:
> Hi,
> 
> On Thu, 2012-04-05 at 12:11 -0400, Bob Peterson wrote:
> > Hi,
> > 
> > Here's another patch (explanation below). This patch replies upon
> > a DLM patch that hasn't fully gone upstream yet, so perhaps it
> > shouldn't be added to the nmw tree until it is. This greatly
> > improves the performance of gfs2_grow in a clustered gfs2 file system.
> > 
> > Regards,
> > 
> I'm still not very keen on dragging in this bit of dlm. If it is really
> needed, then we should use the copy in the dlm itself and not add our
> own copy of it.

The table is equivalent to:
(rq_mode > gr_mode) || (gr_mode == PR && rq_mode == CW)

> When you say that this relies upon this dlm patch, what does that mean?
> What are the consequences of having this patch but not the dlm one? I'm
> wondering whether I should hold off on this at least until the dlm one
> has been finalised and applied.

Yeah, using QUECVT everywhere will make things worse until it's fixed.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns
  2012-04-10 14:08     ` David Teigland
@ 2012-04-10 14:46       ` Steven Whitehouse
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Whitehouse @ 2012-04-10 14:46 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On Tue, 2012-04-10 at 10:08 -0400, David Teigland wrote:
> On Tue, Apr 10, 2012 at 10:12:28AM +0100, Steven Whitehouse wrote:
> > Hi,
> > 
> > On Thu, 2012-04-05 at 12:11 -0400, Bob Peterson wrote:
> > > Hi,
> > > 
> > > Here's another patch (explanation below). This patch replies upon
> > > a DLM patch that hasn't fully gone upstream yet, so perhaps it
> > > shouldn't be added to the nmw tree until it is. This greatly
> > > improves the performance of gfs2_grow in a clustered gfs2 file system.
> > > 
> > > Regards,
> > > 
> > I'm still not very keen on dragging in this bit of dlm. If it is really
> > needed, then we should use the copy in the dlm itself and not add our
> > own copy of it.
> 
> The table is equivalent to:
> (rq_mode > gr_mode) || (gr_mode == PR && rq_mode == CW)
> 
The GLF_BLOCKING flags is almost identical to that, except that it is
also cleared on try locks. I don't think that makes any difference in
this case. Since we already distinguish between getting a new DLM lock
and conversions, it should just be a case of setting the QUECVT flag in
the conversion branch if the GLF_BLOCKING flag is set if I've understood
whats going on correctly here.

> > When you say that this relies upon this dlm patch, what does that mean?
> > What are the consequences of having this patch but not the dlm one? I'm
> > wondering whether I should hold off on this at least until the dlm one
> > has been finalised and applied.
> 
> Yeah, using QUECVT everywhere will make things worse until it's fixed.
> 

Ok, I'll hold off merging this (or whatever we land up with, finally)
until the DLM patch has reached a similar stage in that case,

Steve.




^ permalink raw reply	[flat|nested] 6+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns
  2012-04-10 14:01     ` Bob Peterson
@ 2012-04-10 14:49       ` Steven Whitehouse
  0 siblings, 0 replies; 6+ messages in thread
From: Steven Whitehouse @ 2012-04-10 14:49 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On Tue, 2012-04-10 at 10:01 -0400, Bob Peterson wrote:
> ----- Original Message -----
> | Hi,
> | 
> | On Thu, 2012-04-05 at 12:11 -0400, Bob Peterson wrote:
> | > Hi,
> | > 
> | > Here's another patch (explanation below). This patch replies upon
> | > a DLM patch that hasn't fully gone upstream yet, so perhaps it
> | > shouldn't be added to the nmw tree until it is. This greatly
> | > improves the performance of gfs2_grow in a clustered gfs2 file
> | > system.
> | > 
> | > Regards,
> | > 
> | I'm still not very keen on dragging in this bit of dlm. If it is
> | really
> | needed, then we should use the copy in the dlm itself and not add our
> | own copy of it. Also, why can we not use the GLF_BLOCKING flag for
> | this
> | purpose? Is there some issue with try locks in that case?
> 
> To answer your questions:
> (1) Yes, this bit is really needed.
> (2) We cannot use the actual dlm flags because they're part of
>     dlm/lock.c and not in a header. I can request they be moved
>     to a header file, but that's at the whim of upstream dlm.
Indeed, but we don't normally duplicate code like this, and in this case
I'm suggesting that it is not required anyway.

> (3) We can't use GLF_BLOCKING because it's a different issue.
>     The issue is this: Unless instructed otherwise, dlm will
>     grant GFS2 (or any caller) locks as it sees fit, and when there
>     is a hotly contested file (such as rindex, once it's been
>     updated by gfs2_grow and needed immediately by all nodes in
>     a cluster) dlm often finds it more convenient to grant the lock
>     to a process on the current owner node rather than bouncing the
>     lock to a different node.
> 
>     That can result in one node "hogging" the lock for long periods of
>     time. That means a gfs2_grow can take 30 minutes to complete
>     rather than 0.3 seconds. To use a farming analogy: your animals
>     are starving because the hogs at the trough are eagerly fighting
>     over the morsel at the bottom, and not giving the farmer enough
>     of a break to put their food in. To avoid this fighting behavior
>     dlm has the DLM_LKF_QUECVT bit which basically says lock upgrades
>     (the farmer) over the convenience of keeping the lock on the same
>     node (the hogs). Employing this bit does wonders for gfs2_grow.
>     However, we can only use the bit on dlm lock upgrades. If we
>     use it in other cases, dlm goes through an error path that
>     results in a hang.
> 
I still don't understand why the GLF_BLOCKING flag is not the correct
condition here - see my reply to Dave.

> | When you say that this relies upon this dlm patch, what does that
> | mean?
> 
> (4) It means that there's a DLM bug whereby using the DLM_LKF_QUECVT
>     bit causes DLM (and therefore GFS2) to hang in certain situations.
>     There is a patch to fix this bad DLM behavior but it hasn't gone
>     upstream yet.
> 
> | What are the consequences of having this patch but not the dlm one?
> 
> (5) The consequences of having this patch but not the DLM one are
>     disastrous: gfs2 is likely to hang.
> 
> | I'm
> | wondering whether I should hold off on this at least until the dlm
> | one
> | has been finalised and applied.
> 
> (6) Yes, that's what we need to do.
> 
Ok, sounds good to me,

Steve.

> | Aside from those points though, this is a really big step forward and
> | should make a measurable difference to performance across the board
> | when
> | contention between multiple nodes occurs. So I'm very happy to see
> | such
> | significant progress in this area,
> | 
> | Steve.
> | 
> | 
> | > Bob Peterson
> | > Red Hat File Systems
> | > 
> | > Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> | > ---
> | > Author: Bob Peterson <rpeterso@redhat.com>
> | > Date:   Wed Apr 4 15:14:51 2012 -0500
> | > 
> | >     GFS2: Instruct DLM to avoid queue convert slowdowns
> | >     
> | >     This patch instructs DLM to prevent an "in place" conversion,
> | >     where the
> | >     lock just stays on the granted queue, and instead forces the
> | >     conversion to
> | >     the back of the convert queue. This is done on upward
> | >     conversions only.
> | >     
> | >     This is useful in cases where, for example, a lock is
> | >     frequently needed in
> | >     PR on one node, but another node needs it temporarily in EX to
> | >     update it.
> | >     This may happen, for example, when the rindex is being updated
> | >     by gfs2_grow.
> | >     The gfs2_grow needs to have the lock in EX, but the other nodes
> | >     need to
> | >     re-read it to retrieve the updates. The glock is already
> | >     granted in PR on
> | >     the non-growing nodes, so this prevents them from continually
> | >     re-granting
> | >     the lock in PR, and forces the EX from gfs2_grow to go through.
> | > 
> | > diff --git a/fs/gfs2/lock_dlm.c b/fs/gfs2/lock_dlm.c
> | > index bd5af03..43466d6 100644
> | > --- a/fs/gfs2/lock_dlm.c
> | > +++ b/fs/gfs2/lock_dlm.c
> | > @@ -16,6 +16,25 @@
> | >  #include "glock.h"
> | >  #include "util.h"
> | >  
> | > +/*
> | > + * Borrowed from dlm's lock.c: We need to only pass the QUECVT bit
> | > with
> | > + * requests that are compatible, so we use dlm's check:
> | > + *
> | > + * Compatibility matrix for conversions with QUECVT set.
> | > + * Granted mode is the row; requested mode is the column.
> | > + * Usage: matrix[grmode+1][rqmode+1]
> | > + */
> | > +static const int quecvt_compat_matrix[8][8] = {
> | > +      /* UN NL CR CW PR PW EX PD */
> | > +        {0, 0, 0, 0, 0, 0, 0, 0},       /* UN */
> | > +        {0, 0, 1, 1, 1, 1, 1, 0},       /* NL */
> | > +        {0, 0, 0, 1, 1, 1, 1, 0},       /* CR */
> | > +        {0, 0, 0, 0, 1, 1, 1, 0},       /* CW */
> | > +        {0, 0, 0, 1, 0, 1, 1, 0},       /* PR */
> | > +        {0, 0, 0, 0, 0, 0, 1, 0},       /* PW */
> | > +        {0, 0, 0, 0, 0, 0, 0, 0},       /* EX */
> | > +        {0, 0, 0, 0, 0, 0, 0, 0}        /* PD */
> | > +};
> | >  
> | >  static void gdlm_ast(void *arg)
> | >  {
> | > @@ -104,10 +123,13 @@ static int make_mode(const unsigned int
> | > lmstate)
> | >  	return -1;
> | >  }
> | >  
> | > -static u32 make_flags(const u32 lkid, const unsigned int
> | > gfs_flags,
> | > +static u32 make_flags(struct gfs2_glock *gl, const unsigned int
> | > gfs_flags,
> | >  		      const int req)
> | >  {
> | >  	u32 lkf = 0;
> | > +	u32 lkid = gl->gl_lksb.sb_lkid;
> | > +	int is_upconvert =
> | > +		quecvt_compat_matrix[make_mode(gl->gl_state) + 1][req + 1];
> | >  
> | >  	if (gfs_flags & LM_FLAG_TRY)
> | >  		lkf |= DLM_LKF_NOQUEUE;
> | > @@ -131,8 +153,11 @@ static u32 make_flags(const u32 lkid, const
> | > unsigned int gfs_flags,
> | >  			BUG();
> | >  	}
> | >  
> | > -	if (lkid != 0)
> | > +	if (lkid != 0) {
> | >  		lkf |= DLM_LKF_CONVERT;
> | > +		if (is_upconvert)
> | > +			lkf |= DLM_LKF_QUECVT;
> | > +	}
> | >  
> | >  	lkf |= DLM_LKF_VALBLK;
> | >  
> | > @@ -149,7 +174,7 @@ static unsigned int gdlm_lock(struct gfs2_glock
> | > *gl,
> | >  
> | >  	gl->gl_req = req_state;
> | >  	req = make_mode(req_state);
> | > -	lkf = make_flags(gl->gl_lksb.sb_lkid, flags, req);
> | > +	lkf = make_flags(gl, flags, req);
> | >  
> | >  	/*
> | >  	 * Submit the actual lock request.
> | > 
> | 
> | 
> | 




^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-04-10 14:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <ec4f8e73-e24b-4594-991c-90245824cda7@zmail12.collab.prod.int.phx2.redhat.com>
2012-04-05 16:11 ` [Cluster-devel] [GFS2 PATCH] GFS2: Instruct DLM to avoid queue convert slowdowns Bob Peterson
2012-04-10  9:12   ` Steven Whitehouse
2012-04-10 14:01     ` Bob Peterson
2012-04-10 14:49       ` Steven Whitehouse
2012-04-10 14:08     ` David Teigland
2012-04-10 14:46       ` Steven Whitehouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).