From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 29 Jul 2013 11:49:10 +0100
Subject: [Cluster-devel] [PATCH] GFS2: don't overrun reserved revokes
In-Reply-To: <20130726220933.GJ2113@dhcp80-209.msp.redhat.com>
References: <20130726220933.GJ2113@dhcp80-209.msp.redhat.com>
Message-ID: <1375094950.2804.22.camel@menhir>
List-Id: <cluster-devel.redhat.com>
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

On Fri, 2013-07-26 at 17:09 -0500, Benjamin Marzinski wrote:
> When run during fsync, a gfs2_log_flush could happen between the
> time when gfs2_ail_flush checked the number of blocks to revoke,
> and when it actually started the transaction to do those revokes.
> This occassionally caused it to need more revokes than it reserved,
> causing gfs2 to crash.
> 
> Instead of just reserving enough revokes to handle the blocks that
> currently need them, this patch makes gfs2_ail_flush reserve the
> maximum number of revokes it can, without increasing the total number
> of reserved log blocks. This patch also passes the number of reserved
> revokes to __gfs2_ail_flush() so that it doesn't go over its limit
> and cause a crash like we're seeing. Non-fsync calls to __gfs2_ail_flush
> will still cause a BUG() necessary revokes are skipped.
> 
> Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
> ---
>  fs/gfs2/glops.c |   18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
> 
> Index: gfs2-3.0-nmw-130722/fs/gfs2/glops.c
> ===================================================================
> --- gfs2-3.0-nmw-130722.orig/fs/gfs2/glops.c
> +++ gfs2-3.0-nmw-130722/fs/gfs2/glops.c
> @@ -47,7 +47,8 @@ static void gfs2_ail_error(struct gfs2_g
>   * None of the buffers should be dirty, locked, or pinned.
>   */
>  
> -static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync)
> +static void __gfs2_ail_flush(struct gfs2_glock *gl, bool fsync,
> +			     unsigned int nr_revokes)
>  {
>  	struct gfs2_sbd *sdp = gl->gl_sbd;
>  	struct list_head *head = &gl->gl_ail_list;
> @@ -57,7 +58,9 @@ static void __gfs2_ail_flush(struct gfs2
>  
>  	gfs2_log_lock(sdp);
>  	spin_lock(&sdp->sd_ail_lock);
> -	list_for_each_entry_safe(bd, tmp, head, bd_ail_gl_list) {
> +	list_for_each_entry_safe_reverse(bd, tmp, head, bd_ail_gl_list) {
> +		if (nr_revokes == 0)
> +			break;
>  		bh = bd->bd_bh;
>  		if (bh->b_state & b_state) {
>  			if (fsync)
> @@ -65,6 +68,7 @@ static void __gfs2_ail_flush(struct gfs2
>  			gfs2_ail_error(gl, bh);
>  		}
>  		gfs2_trans_add_revoke(sdp, bd);
> +		nr_revokes--;
>  	}
>  	GLOCK_BUG_ON(gl, !fsync && atomic_read(&gl->gl_ail_count));
>  	spin_unlock(&sdp->sd_ail_lock);
> @@ -91,7 +95,7 @@ static void gfs2_ail_empty_gl(struct gfs
>  	WARN_ON_ONCE(current->journal_info);
>  	current->journal_info = &tr;
>  
> -	__gfs2_ail_flush(gl, 0);
> +	__gfs2_ail_flush(gl, 0, tr.tr_revokes);
>  
>  	gfs2_trans_end(sdp);
>  	gfs2_log_flush(sdp, NULL);
> @@ -101,15 +105,19 @@ void gfs2_ail_flush(struct gfs2_glock *g
>  {
>  	struct gfs2_sbd *sdp = gl->gl_sbd;
>  	unsigned int revokes = atomic_read(&gl->gl_ail_count);
> +	unsigned int max_revokes = (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_log_descriptor)) / sizeof(u64);
>  	int ret;
>  
>  	if (!revokes)
>  		return;
>  
> -	ret = gfs2_trans_begin(sdp, 0, revokes);
> +	while (revokes > max_revokes)
> +		max_revokes += (sdp->sd_sb.sb_bsize - sizeof(struct gfs2_meta_header)) / sizeof(u64);
> +
> +	ret = gfs2_trans_begin(sdp, 0, max_revokes);
>  	if (ret)
>  		return;
> -	__gfs2_ail_flush(gl, fsync);
> +	__gfs2_ail_flush(gl, fsync, max_revokes);
>  	gfs2_trans_end(sdp);
>  	gfs2_log_flush(sdp, NULL);
>  }
> 

Will this always write back all the revokes that are queued up? We
really must do that here as fsync requires them to be on-disk. I wonder
whether one solution is just to add a loop to use multiple transactions
in the case that we cannot fit them all into a single transaction?

Steve.