cluster-devel.redhat.com archive mirror
From: Andreas Gruenbacher <agruenba@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] [PATCH 20/25] GFS2: rgrp free blocks used incorrectly
Date: Mon, 13 Aug 2018 19:45:50 +0200
Message-ID: <20180813174555.10402-21-agruenba@redhat.com>
In-Reply-To: <20180813174555.10402-1-agruenba@redhat.com>

From: Bob Peterson <rpeterso@redhat.com>

Before this patch, several functions in rgrp.c checked the value of
rgd->rd_free_clone. That does not take into account blocks that were
reserved by a multi-block reservation. This causes a problem when
space gets tight in the file system. For example, when function
gfs2_inplace_reserve checks to see if an rgrp has enough blocks to
satisfy the request, it can accept an rgrp that it should reject
because, although there are enough blocks to satisfy the request
_now_, those blocks may be reserved for another running process.

A second problem with this occurs when we've reserved the remaining
blocks in an rgrp: function rg_mblk_search() can reject an rgrp
improperly because it calculates:

   u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;

But rd_reserved includes blocks that the current process just
reserved in its own call to gfs2_inplace_reserve. For example, it can
reserve the last 128 blocks of an rgrp, then reject that same rgrp
because the above evaluates to free_blocks = 0.

Consequences include, but are not limited to, (1) leaving holes,
and thus increasing file system fragmentation, and (2) reporting
that the file system is full long before it actually is.

This patch introduces a new function, rgd_free, which returns the
number of clone-free blocks (blocks that are truly free as opposed
to blocks that are still being used because an unlinked file is
still open) minus the number of blocks reserved by processes, but
not counting the blocks we ourselves reserved (because obviously
we need to allocate them).

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/rgrp.c | 39 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 0a484a009ba2..68a81afd3b4a 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1489,6 +1489,34 @@ static void rs_insert(struct gfs2_inode *ip)
 	trace_gfs2_rs(rs, TRACE_RS_INSERT);
 }
 
+/**
+ * rgd_free - return the number of free blocks we can allocate.
+ * @rgd: the resource group
+ *
+ * This function returns the number of free blocks for an rgrp.
+ * That's the clone-free blocks (blocks that are free, not including those
+ * still being used for unlinked files that haven't been deleted.)
+ *
+ * It also subtracts any blocks reserved by someone else, but does not
+ * include free blocks that are still part of our current reservation,
+ * because obviously we can (and will) allocate them.
+ */
+static inline u32 rgd_free(struct gfs2_rgrpd *rgd, struct gfs2_blkreserv *rs)
+{
+	u32 tot_reserved, tot_free;
+
+	if (WARN_ON_ONCE(rgd->rd_reserved < rs->rs_free))
+		return 0;
+	tot_reserved = rgd->rd_reserved - rs->rs_free;
+
+	if (rgd->rd_free_clone < tot_reserved)
+		tot_reserved = 0;
+
+	tot_free = rgd->rd_free_clone - tot_reserved;
+
+	return tot_free;
+}
+
 /**
  * rg_mblk_search - find a group of multiple free blocks to form a reservation
  * @rgd: the resource group descriptor
@@ -1504,7 +1532,7 @@ static void rg_mblk_search(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip,
 	u64 goal;
 	struct gfs2_blkreserv *rs = &ip->i_res;
 	u32 extlen;
-	u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;
+	u32 free_blocks = rgd_free(rgd, rs);
 	int ret;
 	struct inode *inode = &ip->i_inode;
 
@@ -1985,7 +2013,7 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
 	int error = 0, rg_locked, flags = 0;
 	u64 last_unlinked = NO_BLOCK;
 	int loops = 0;
-	u32 skip = 0;
+	u32 free_blocks, skip = 0;
 
 	if (sdp->sd_args.ar_rgrplvb)
 		flags |= GL_SKIP;
@@ -2056,10 +2084,11 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
 			goto check_rgrp;
 
 		/* If rgrp has enough free space, use it */
-		if (rs->rs_rbm.rgd->rd_free_clone >= ap->target ||
+		free_blocks = rgd_free(rs->rs_rbm.rgd, rs);
+		if (free_blocks >= ap->target ||
 		    (loops == 2 && ap->min_target &&
-		     rs->rs_rbm.rgd->rd_free_clone >= ap->min_target)) {
-			ap->allowed = rs->rs_rbm.rgd->rd_free_clone;
+		     free_blocks >= ap->min_target)) {
+			ap->allowed = free_blocks;
 			return 0;
 		}
 check_rgrp:
-- 
2.17.1


