* [Cluster-devel] [GFS2 v2 PATCH v2] GFS2: rgrp free blocks used incorrectly
From: Bob Peterson @ 2018-06-21 13:09 UTC (permalink / raw)
To: cluster-devel.redhat.com
Hi,
I posted a nearly identical patch on 30 May, which was ignored.
Since then, we've had patches that required me to revise it.
This is the revised (v2) version.
---
Before this patch, several functions in rgrp.c checked the value of
rgd->rd_free_clone. That does not take into account blocks that were
reserved by a multi-block reservation. This causes a problem when
space gets tight in the file system. For example, when function
gfs2_inplace_reserve checks to see if an rgrp has enough blocks to
satisfy the request, it can accept an rgrp that it should reject
because, although there are enough blocks to satisfy the request
_now_, those blocks may be reserved for another running process.
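To illustrate the first problem, here is a minimal standalone sketch, not
code from rgrp.c. The struct and both helper names are hypothetical; only
the idea of comparing a free count against the requested target comes from
the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two rgrp counters involved. */
struct rgrp_counts {
	uint32_t free_clone;         /* plays the role of rd_free_clone */
	uint32_t reserved_by_others; /* blocks other processes have reserved */
};

/* Check that looks only at the clone-free count. */
static bool accepts_ignoring_reservations(const struct rgrp_counts *rg,
					  uint32_t target)
{
	return rg->free_clone >= target;
}

/* Check that first discounts blocks reserved by other processes.
 * Assumes reserved_by_others <= free_clone. */
static bool accepts_after_reservations(const struct rgrp_counts *rg,
				       uint32_t target)
{
	return rg->free_clone - rg->reserved_by_others >= target;
}
```

With 100 clone-free blocks, 80 of them reserved by other processes, and a
request for 50, the first check accepts the rgrp while the second rejects
it, since only 20 blocks are really available.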
A second problem with this occurs when we've reserved the remaining
blocks in an rgrp: function rg_mblk_search() can reject an rgrp
improperly because it calculates:
u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;
But rd_reserved includes blocks that the current process just
reserved in its own call to gfs2_inplace_reserve. For example, it can
reserve the last 128 blocks of an rgrp, then reject that same rgrp
because the above evaluates to free_blocks = 0.
Consequences include, but are not limited to, (1) leaving holes,
and thus increasing file system fragmentation, and (2) reporting
that the file system is full long before it actually is.
This patch introduces a new function, rgd_free, which returns the
number of clone-free blocks (blocks that are truly free as opposed
to blocks that are still being used because an unlinked file is
still open) minus the number of blocks reserved by processes, but
not counting the blocks we ourselves reserved (because obviously
we need to allocate them).
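The arithmetic for the last-128-blocks example can be sketched outside the
kernel as follows. This is an illustration under assumptions, not the patch
itself: the function names are hypothetical, plain integers stand in for
rd_free_clone, rd_reserved and the caller's own reserved-but-unallocated
count, and the callers are assumed to respect the same ordering the patch
enforces with BUG_ON.

```c
#include <stdint.h>

/* Old calculation: every reservation counts against the caller,
 * including the caller's own. */
static uint32_t free_blocks_old(uint32_t free_clone, uint32_t reserved)
{
	return free_clone - reserved;
}

/* New calculation: subtract only the blocks reserved by others.
 * Assumes own_reserved <= reserved and
 * (reserved - own_reserved) <= free_clone. */
static uint32_t free_blocks_new(uint32_t free_clone, uint32_t reserved,
				uint32_t own_reserved)
{
	uint32_t reserved_by_others = reserved - own_reserved;

	return free_clone - reserved_by_others;
}
```

When a process has reserved the last 128 clone-free blocks itself, the old
form gives 128 - 128 = 0 and the rgrp is rejected, while the new form gives
128 - (128 - 128) = 128 and the rgrp remains usable.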
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
fs/gfs2/rgrp.c | 33 +++++++++++++++++++++++++++++----
1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 02e100a44456..9611b45b41a1 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1489,6 +1489,31 @@ static void rs_insert(struct gfs2_inode *ip)
trace_gfs2_rs(rs, TRACE_RS_INSERT);
}
+/**
+ * rgd_free - return the number of free blocks we can allocate.
+ * @rgd: the resource group
+ *
+ * This function returns the number of free blocks for an rgrp.
+ * That's the clone-free blocks (blocks that are free, not including those
+ * still being used for unlinked files that haven't been deleted.)
+ *
+ * It also subtracts any blocks reserved by someone else, but does not
+ * include free blocks that are still part of our current reservation,
+ * because obviously we can (and will) allocate them.
+ */
+static inline u32 rgd_free(struct gfs2_rgrpd *rgd, struct gfs2_blkreserv *rs)
+{
+ u32 tot_reserved, tot_free;
+
+ BUG_ON(rgd->rd_reserved < rs->rs_free);
+ tot_reserved = rgd->rd_reserved - rs->rs_free;
+
+ BUG_ON(rgd->rd_free_clone < tot_reserved);
+ tot_free = rgd->rd_free_clone - tot_reserved;
+
+ return tot_free;
+}
+
/**
* rg_mblk_search - find a group of multiple free blocks to form a reservation
* @rgd: the resource group descriptor
@@ -1504,7 +1529,7 @@ static void rg_mblk_search(struct gfs2_rgrpd *rgd, struct gfs2_inode *ip,
u64 goal;
struct gfs2_blkreserv *rs = &ip->i_res;
u32 extlen;
- u32 free_blocks = rgd->rd_free_clone - rgd->rd_reserved;
+ u32 free_blocks = rgd_free(rgd, rs);
int ret;
struct inode *inode = &ip->i_inode;
@@ -2054,10 +2079,10 @@ int gfs2_inplace_reserve(struct gfs2_inode *ip, struct gfs2_alloc_parms *ap)
goto check_rgrp;
/* If rgrp has enough free space, use it */
- if (rs->rs_rbm.rgd->rd_free_clone >= ap->target ||
+ if (rgd_free(rs->rs_rbm.rgd, rs) >= ap->target ||
(loops == 2 && ap->min_target &&
- rs->rs_rbm.rgd->rd_free_clone >= ap->min_target)) {
- ap->allowed = rs->rs_rbm.rgd->rd_free_clone;
+ rgd_free(rs->rs_rbm.rgd, rs) >= ap->min_target)) {
+ ap->allowed = rgd_free(rs->rs_rbm.rgd, rs);
return 0;
}
check_rgrp: