cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
* [Cluster-devel] [GFS2 PATCH] GFS2: Limit number of transaction blocks requested for truncates
       [not found] <83265106.13073562.1481897633945.JavaMail.zimbra@redhat.com>
@ 2016-12-16 14:16 ` Bob Peterson
  2016-12-16 14:40   ` Steven Whitehouse
  0 siblings, 1 reply; 3+ messages in thread
From: Bob Peterson @ 2016-12-16 14:16 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

This patch limits the number of transaction blocks requested during
file truncates. If we have very large multi-terabyte files, and want
to delete or truncate them, they might span so many resource groups
that we overflow the journal blocks, and cause an assert failure.
By limiting the number of blocks in the transaction, we prevent this
overflow and give other running processes time to do transactions.

Note that this is a temporary solution. A long-term solution would
be to implement the non-recursive truncate that processes one
resource group at a time, and therefore, the transactions need
much fewer blocks.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 645721f..77f2ab1 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -720,6 +720,7 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
 {
 	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
 	struct gfs2_rgrp_list rlist;
+	struct gfs2_trans *tr;
 	u64 bn, bstart;
 	u32 blen, btotal;
 	__be64 *p;
@@ -728,6 +729,7 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
 	unsigned int revokes = 0;
 	int x;
 	int error;
+	int jblocks_rqsted;
 
 	error = gfs2_rindex_update(sdp);
 	if (error)
@@ -791,12 +793,17 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
 	if (gfs2_rs_active(&ip->i_res)) /* needs to be done with the rgrp glock held */
 		gfs2_rs_deltree(&ip->i_res);
 
-	error = gfs2_trans_begin(sdp, rg_blocks + RES_DINODE +
-				 RES_INDIRECT + RES_STATFS + RES_QUOTA,
-				 revokes);
+restart:
+	jblocks_rqsted = rg_blocks + RES_DINODE +
+		RES_INDIRECT + RES_STATFS + RES_QUOTA +
+		gfs2_struct2blk(sdp, revokes, sizeof(u64));
+	if (jblocks_rqsted > RES_MAX)
+		jblocks_rqsted = RES_MAX;
+	error = gfs2_trans_begin(sdp, jblocks_rqsted, revokes);
 	if (error)
 		goto out_rg_gunlock;
 
+	tr = current->journal_info;
 	down_write(&ip->i_rw_mutex);
 
 	gfs2_trans_add_meta(ip->i_gl, dibh);
@@ -810,6 +817,15 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
 		if (!*p)
 			continue;
 
+		/* check for max reasonable journal transaction blocks */
+		if (tr->tr_num_buf_new + RES_STATFS + RES_QUOTA >= RES_MAX) {
+			if (rg_blocks >= tr->tr_num_buf_new)
+				rg_blocks -= tr->tr_num_buf_new;
+			else
+				rg_blocks = 0;
+			break;
+		}
+
 		bn = be64_to_cpu(*p);
 
 		if (bstart + blen == bn)
@@ -827,6 +843,9 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
 		*p = 0;
 		gfs2_add_inode_blocks(&ip->i_inode, -1);
 	}
+	if (p == bottom)
+		rg_blocks = 0;
+
 	if (bstart) {
 		__gfs2_free_blocks(ip, bstart, blen, metadata);
 		btotal += blen;
@@ -844,6 +863,9 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
 
 	gfs2_trans_end(sdp);
 
+	if (rg_blocks)
+		goto restart;
+
 out_rg_gunlock:
 	gfs2_glock_dq_m(rlist.rl_rgrps, rlist.rl_ghs);
 out_rlist:
diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
index 1e6e7da..c212d16 100644
--- a/fs/gfs2/trans.h
+++ b/fs/gfs2/trans.h
@@ -25,6 +25,7 @@ struct gfs2_glock;
 #define RES_EATTR	1
 #define RES_STATFS	1
 #define RES_QUOTA	2
+#define RES_MAX		256 /* Must not exceed 8192 (min journal is 32MB) */
 
 /* reserve either the number of blocks to be allocated plus the rg header
  * block, or all of the blocks in the rg, whichever is smaller */



^ permalink raw reply related	[flat|nested] 3+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Limit number of transaction blocks requested for truncates
  2016-12-16 14:16 ` [Cluster-devel] [GFS2 PATCH] GFS2: Limit number of transaction blocks requested for truncates Bob Peterson
@ 2016-12-16 14:40   ` Steven Whitehouse
  2016-12-16 16:10     ` Bob Peterson
  0 siblings, 1 reply; 3+ messages in thread
From: Steven Whitehouse @ 2016-12-16 14:40 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

On 16/12/16 14:16, Bob Peterson wrote:
> Hi,
>
> This patch limits the number of transaction blocks requested during
> file truncates. If we have very large multi-terabyte files, and want
> to delete or truncate them, they might span so many resource groups
> that we overflow the journal blocks, and cause an assert failure.
> By limiting the number of blocks in the transaction, we prevent this
> overflow and give other running processes time to do transactions.
>
> Note that this is a temporary solution. A long-term solution would
> be to implement the non-recursive truncate that processes one
> resource group at a time, and therefore, the transactions need
> much fewer blocks.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index 645721f..77f2ab1 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -720,6 +720,7 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   {
>   	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
>   	struct gfs2_rgrp_list rlist;
> +	struct gfs2_trans *tr;
>   	u64 bn, bstart;
>   	u32 blen, btotal;
>   	__be64 *p;
> @@ -728,6 +729,7 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   	unsigned int revokes = 0;
>   	int x;
>   	int error;
> +	int jblocks_rqsted;
>   
>   	error = gfs2_rindex_update(sdp);
>   	if (error)
> @@ -791,12 +793,17 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   	if (gfs2_rs_active(&ip->i_res)) /* needs to be done with the rgrp glock held */
>   		gfs2_rs_deltree(&ip->i_res);
>   
> -	error = gfs2_trans_begin(sdp, rg_blocks + RES_DINODE +
> -				 RES_INDIRECT + RES_STATFS + RES_QUOTA,
> -				 revokes);
> +restart:
> +	jblocks_rqsted = rg_blocks + RES_DINODE +
> +		RES_INDIRECT + RES_STATFS + RES_QUOTA +
> +		gfs2_struct2blk(sdp, revokes, sizeof(u64));
> +	if (jblocks_rqsted > RES_MAX)
> +		jblocks_rqsted = RES_MAX;
> +	error = gfs2_trans_begin(sdp, jblocks_rqsted, revokes);
>   	if (error)
>   		goto out_rg_gunlock;
>   
> +	tr = current->journal_info;
>   	down_write(&ip->i_rw_mutex);
>   
>   	gfs2_trans_add_meta(ip->i_gl, dibh);
> @@ -810,6 +817,15 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   		if (!*p)
>   			continue;
>   
> +		/* check for max reasonable journal transaction blocks */
> +		if (tr->tr_num_buf_new + RES_STATFS + RES_QUOTA >= RES_MAX) {
> +			if (rg_blocks >= tr->tr_num_buf_new)
> +				rg_blocks -= tr->tr_num_buf_new;
> +			else
> +				rg_blocks = 0;
> +			break;
> +		}
> +
>   		bn = be64_to_cpu(*p);
>   
>   		if (bstart + blen == bn)
> @@ -827,6 +843,9 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   		*p = 0;
>   		gfs2_add_inode_blocks(&ip->i_inode, -1);
>   	}
> +	if (p == bottom)
> +		rg_blocks = 0;
> +
>   	if (bstart) {
>   		__gfs2_free_blocks(ip, bstart, blen, metadata);
>   		btotal += blen;
> @@ -844,6 +863,9 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   
>   	gfs2_trans_end(sdp);
>   
> +	if (rg_blocks)
> +		goto restart;
> +
>   out_rg_gunlock:
>   	gfs2_glock_dq_m(rlist.rl_rgrps, rlist.rl_ghs);
>   out_rlist:
> diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
> index 1e6e7da..c212d16 100644
> --- a/fs/gfs2/trans.h
> +++ b/fs/gfs2/trans.h
> @@ -25,6 +25,7 @@ struct gfs2_glock;
>   #define RES_EATTR	1
>   #define RES_STATFS	1
>   #define RES_QUOTA	2
> +#define RES_MAX		256 /* Must not exceed 8192 (min journal is 32MB) */
>   
>   /* reserve either the number of blocks to be allocated plus the rg header
>    * block, or all of the blocks in the rg, whichever is smaller */
>
If it can't exceed 8192, then why is it only 256, could there not be a 
larger number? It should probably scale with journal size to avoid 
causing issues for those with larger journals. The approach looks good 
though for a temporary fix,

Steve.



^ permalink raw reply	[flat|nested] 3+ messages in thread

* [Cluster-devel] [GFS2 PATCH] GFS2: Limit number of transaction blocks requested for truncates
  2016-12-16 14:40   ` Steven Whitehouse
@ 2016-12-16 16:10     ` Bob Peterson
  0 siblings, 0 replies; 3+ messages in thread
From: Bob Peterson @ 2016-12-16 16:10 UTC (permalink / raw)
  To: cluster-devel.redhat.com

----- Original Message -----
| Hi,
| 
| If it can't exceed 8192, then why is it only 256, could there not be a
| larger number? It should probably scale with journal size to avoid
| causing issues for those with larger journals. The approach looks good
| though for a temporary fix,
| 
| Steve.

Hi,

Yeah, I toyed with the idea of making it scale with the journal, but
anything I chose just seemed so...arbitrary. I originally coded it as the
number of blocks in the journal - some slop for log headers and such:

+       if (jblocks_needed + RES_SLOP > sdp->sd_jdesc->jd_blocks) {
+               jblocks_needed = sdp->sd_jdesc->jd_blocks - RES_SLOP;

At that time I started defined it as:

+#define RES_SLOP       16

I also toyed with the idea of using half the journal blocks:

jblocks_needed = sdp->sd_jdesc->jd_blocks > 1;

Or maybe 3/4 of the journal:

jblocks_needed = sdp->sd_jdesc->jd_blocks > 1 +
                 sdp->sd_jdesc->jd_blocks > 2;

...but I didn't want to waste a bunch of CPU time doing calculations
either: deletes are already slow. Not that that would slow us down much.

In the end, I decided an arbitrary 256 was simple and enough for the
vast majority of deletes.

I'm perfectly happy to use whatever value makes the most sense.
Suggestions?

Regards,

Bob Peterson
Red Hat File Systems



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-12-16 16:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <83265106.13073562.1481897633945.JavaMail.zimbra@redhat.com>
2016-12-16 14:16 ` [Cluster-devel] [GFS2 PATCH] GFS2: Limit number of transaction blocks requested for truncates Bob Peterson
2016-12-16 14:40   ` Steven Whitehouse
2016-12-16 16:10     ` Bob Peterson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).