cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Bob Peterson <rpeterso@redhat.com>
To: cluster-devel.redhat.com
Subject: [Cluster-devel] Corrections to patch d1475c07f7 "GFS2: rgrp free blocks used incorrectly"
Date: Tue, 24 Jul 2018 10:34:35 -0400 (EDT)	[thread overview]
Message-ID: <627950576.53776198.1532442875057.JavaMail.zimbra@redhat.com> (raw)
In-Reply-To: <1577025468.53770297.1532441451778.JavaMail.zimbra@redhat.com>

Hi,

Just a heads up:

There's a problem with patch d1475c07f7ce737d06560c24bf025533b7969e3b,
"GFS2: rgrp free blocks used incorrectly".

In today's code, block reservations are currently a "hint" about
where to reserve blocks. As long as the resource group (rgrp) glock
is held in EX on the local node, they are more than just hints:
they're skipped by other allocations. But if the rgrp is moved from
one cluster node to another, the reservations are never returned
(at least not until the patch I posted on 13 July is pushed: see
https://www.redhat.com/archives/cluster-devel/2018-July/msg00059.html)

Until that gets pushed, we can have lots of reservations accumulate
on an rgrp, but they won't necessarily reflect the accurate bitmaps
of the rgrp, which means rd_free (and therefore rd_free_clone)
can have fewer free blocks available than the total reservations.

For that reason, the following line from patch d1475c07f7 is incorrect:

+	BUG_ON(rgd->rd_free_clone < tot_reserved);

That's because tot_reserved can exceed rd_free_clone in today's gfs2.

I've had some discussions with Andreas Gruenbacher and Steve
Whitehouse about trying to enforce gfs2 block reservations as more
than just a hint, but until we resolve that, the patch should be
revised. I'm therefore planning to revise that patch like this:

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 3821b3ab04fa..3632a9a78e3c 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1508,7 +1508,9 @@ static inline u32 rgd_free(struct gfs2_rgrpd *rgd, struct gfs2_blkreserv *rs)
 	BUG_ON(rgd->rd_reserved < rs->rs_free);
 	tot_reserved = rgd->rd_reserved - rs->rs_free;
 
-	BUG_ON(rgd->rd_free_clone < tot_reserved);
+	if (rgd->rd_free_clone < tot_reserved)
+		tot_reserved = 0;
+
 	tot_free = rgd->rd_free_clone - tot_reserved;
 
 	return tot_free;

In other words, if the number of blocks reserved exceeds the number
of free blocks, just return rd_free_clone like we did before.
When those blocks are allocated, function gfs2_adjust_reservation
should then remove the reservation from the rgrp tree and adjust its
count of reserved blocks accordingly.

Later, if we tighten the enforcement of block reservations,
we should change it back into a BUG_ON.

Regards,

Bob Peterson
Red Hat File Systems



           reply	other threads:[~2018-07-24 14:34 UTC|newest]

Thread overview: expand[flat|nested]  mbox.gz  Atom feed
 [parent not found: <1577025468.53770297.1532441451778.JavaMail.zimbra@redhat.com>]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=627950576.53776198.1532442875057.JavaMail.zimbra@redhat.com \
    --to=rpeterso@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).