cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Steven Whitehouse <swhiteho@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [GFS2 PATCH] GFS2: Limit number of transaction blocks requested for truncates
Date: Fri, 16 Dec 2016 14:40:10 +0000	[thread overview]
Message-ID: <5853FCCA.8000809@redhat.com> (raw)
In-Reply-To: <1207853613.13078711.1481897777538.JavaMail.zimbra@redhat.com>

Hi,

On 16/12/16 14:16, Bob Peterson wrote:
> Hi,
>
> This patch limits the number of transaction blocks requested during
> file truncates. If we have very large multi-terabyte files, and want
> to delete or truncate them, they might span so many resource groups
> that we overflow the journal blocks, and cause an assert failure.
> By limiting the number of blocks in the transaction, we prevent this
> overflow and give other running processes time to do transactions.
>
> Note that this is a temporary solution. A long-term solution would
> be to implement the non-recursive truncate that processes one
> resource group at a time, and therefore, the transactions need
> much fewer blocks.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index 645721f..77f2ab1 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -720,6 +720,7 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   {
>   	struct gfs2_sbd *sdp = GFS2_SB(&ip->i_inode);
>   	struct gfs2_rgrp_list rlist;
> +	struct gfs2_trans *tr;
>   	u64 bn, bstart;
>   	u32 blen, btotal;
>   	__be64 *p;
> @@ -728,6 +729,7 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   	unsigned int revokes = 0;
>   	int x;
>   	int error;
> +	int jblocks_rqsted;
>   
>   	error = gfs2_rindex_update(sdp);
>   	if (error)
> @@ -791,12 +793,17 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   	if (gfs2_rs_active(&ip->i_res)) /* needs to be done with the rgrp glock held */
>   		gfs2_rs_deltree(&ip->i_res);
>   
> -	error = gfs2_trans_begin(sdp, rg_blocks + RES_DINODE +
> -				 RES_INDIRECT + RES_STATFS + RES_QUOTA,
> -				 revokes);
> +restart:
> +	jblocks_rqsted = rg_blocks + RES_DINODE +
> +		RES_INDIRECT + RES_STATFS + RES_QUOTA +
> +		gfs2_struct2blk(sdp, revokes, sizeof(u64));
> +	if (jblocks_rqsted > RES_MAX)
> +		jblocks_rqsted = RES_MAX;
> +	error = gfs2_trans_begin(sdp, jblocks_rqsted, revokes);
>   	if (error)
>   		goto out_rg_gunlock;
>   
> +	tr = current->journal_info;
>   	down_write(&ip->i_rw_mutex);
>   
>   	gfs2_trans_add_meta(ip->i_gl, dibh);
> @@ -810,6 +817,15 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   		if (!*p)
>   			continue;
>   
> +		/* check for max reasonable journal transaction blocks */
> +		if (tr->tr_num_buf_new + RES_STATFS + RES_QUOTA >= RES_MAX) {
> +			if (rg_blocks >= tr->tr_num_buf_new)
> +				rg_blocks -= tr->tr_num_buf_new;
> +			else
> +				rg_blocks = 0;
> +			break;
> +		}
> +
>   		bn = be64_to_cpu(*p);
>   
>   		if (bstart + blen == bn)
> @@ -827,6 +843,9 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   		*p = 0;
>   		gfs2_add_inode_blocks(&ip->i_inode, -1);
>   	}
> +	if (p == bottom)
> +		rg_blocks = 0;
> +
>   	if (bstart) {
>   		__gfs2_free_blocks(ip, bstart, blen, metadata);
>   		btotal += blen;
> @@ -844,6 +863,9 @@ static int do_strip(struct gfs2_inode *ip, struct buffer_head *dibh,
>   
>   	gfs2_trans_end(sdp);
>   
> +	if (rg_blocks)
> +		goto restart;
> +
>   out_rg_gunlock:
>   	gfs2_glock_dq_m(rlist.rl_rgrps, rlist.rl_ghs);
>   out_rlist:
> diff --git a/fs/gfs2/trans.h b/fs/gfs2/trans.h
> index 1e6e7da..c212d16 100644
> --- a/fs/gfs2/trans.h
> +++ b/fs/gfs2/trans.h
> @@ -25,6 +25,7 @@ struct gfs2_glock;
>   #define RES_EATTR	1
>   #define RES_STATFS	1
>   #define RES_QUOTA	2
> +#define RES_MAX		256 /* Must not exceed 8192 (min journal is 32MB) */
>   
>   /* reserve either the number of blocks to be allocated plus the rg header
>    * block, or all of the blocks in the rg, whichever is smaller */
>
If it can't exceed 8192, then why is it only 256, could there not be a 
larger number? It should probably scale with journal size to avoid 
causing issues for those with larger journals. The approach looks good 
though for a temporary fix,

Steve.



  reply	other threads:[~2016-12-16 14:40 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <83265106.13073562.1481897633945.JavaMail.zimbra@redhat.com>
2016-12-16 14:16 ` [Cluster-devel] [GFS2 PATCH] GFS2: Limit number of transaction blocks requested for truncates Bob Peterson
2016-12-16 14:40   ` Steven Whitehouse [this message]
2016-12-16 16:10     ` Bob Peterson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5853FCCA.8000809@redhat.com \
    --to=swhiteho@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).