cluster-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
* [Cluster-devel] [GFS2 PATCH] gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
       [not found] <311347480.3490063.1534521010908.JavaMail.zimbra@redhat.com>
@ 2018-08-17 15:51 ` Bob Peterson
  2018-08-20  8:47   ` Steven Whitehouse
  0 siblings, 1 reply; 2+ messages in thread
From: Bob Peterson @ 2018-08-17 15:51 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Hi,

The GFS2_RDF_UPTODATE flag in the rgrp is used to determine when
a rgrp buffer is valid. It's cleared when the glock is invalidated,
signifying that the buffer data is now invalid. But before this
patch, function update_rgrp_lvb was setting the flag when it
determined it had a valid lvb. But that's an invalid assumption:
just because you have a valid lvb doesn't mean you have valid
buffers. After all, another node may have made the lvb valid,
and this node just fetched it from the glock via dlm.

Consider this scenario:
1. The file system is mounted with RGRPLVB option.
2. In gfs2_inplace_reserve it locks the rgrp glock EX, but thanks
   to GL_SKIP, it skips the gfs2_rgrp_bh_get.
3. Since loops == 0 and the allocation target (ap->target) is
   bigger than the largest known chunk of blocks in the rgrp
   (rs->rs_rbm.rgd->rd_extfail_pt) it skips that rgrp and bypasses
   the call to gfs2_rgrp_bh_get there as well.
4. update_rgrp_lvb sees the lvb MAGIC number is valid, so bypasses
   gfs2_rgrp_bh_get, but it still sets sets GFS2_RDF_UPTODATE due
   to this invalid assumption.
5. The next time update_rgrp_lvb is called, it sees the bit is set
   and just returns 0, assuming both the lvb and rgrp are both
   uptodate. But since this is a smaller allocation, or space has
   been freed by another node, thus adjusting the lvb values,
   it decides to use the rgrp for allocations, with invalid rd_free
   due to the fact it was never updated.

This patch changes update_rgrp_lvb so it doesn't set the UPTODATE
flag anymore. That way, it has no choice but to fetch the latest
values.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
---
 fs/gfs2/rgrp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
index 44a4cbc043dd..fc181c81cca2 100644
--- a/fs/gfs2/rgrp.c
+++ b/fs/gfs2/rgrp.c
@@ -1249,7 +1249,7 @@ static int update_rgrp_lvb(struct gfs2_rgrpd *rgd)
 	rl_flags = be32_to_cpu(rgd->rd_rgl->rl_flags);
 	rl_flags &= ~GFS2_RDF_MASK;
 	rgd->rd_flags &= GFS2_RDF_MASK;
-	rgd->rd_flags |= (rl_flags | GFS2_RDF_UPTODATE | GFS2_RDF_CHECK);
+	rgd->rd_flags |= (rl_flags | GFS2_RDF_CHECK);
 	if (rgd->rd_rgl->rl_unlinked == 0)
 		rgd->rd_flags &= ~GFS2_RDF_CHECK;
 	rgd->rd_free = be32_to_cpu(rgd->rd_rgl->rl_free);



^ permalink raw reply related	[flat|nested] 2+ messages in thread

* [Cluster-devel] [GFS2 PATCH] gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated
  2018-08-17 15:51 ` [Cluster-devel] [GFS2 PATCH] gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated Bob Peterson
@ 2018-08-20  8:47   ` Steven Whitehouse
  0 siblings, 0 replies; 2+ messages in thread
From: Steven Whitehouse @ 2018-08-20  8:47 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Acked-by: Steven Whitehouse <swhiteho@redhat.com>


On 17/08/18 16:51, Bob Peterson wrote:
> Hi,
>
> The GFS2_RDF_UPTODATE flag in the rgrp is used to determine when
> a rgrp buffer is valid. It's cleared when the glock is invalidated,
> signifying that the buffer data is now invalid. But before this
> patch, function update_rgrp_lvb was setting the flag when it
> determined it had a valid lvb. But that's an invalid assumption:
> just because you have a valid lvb doesn't mean you have valid
> buffers. After all, another node may have made the lvb valid,
> and this node just fetched it from the glock via dlm.
>
> Consider this scenario:
> 1. The file system is mounted with RGRPLVB option.
> 2. In gfs2_inplace_reserve it locks the rgrp glock EX, but thanks
>     to GL_SKIP, it skips the gfs2_rgrp_bh_get.
> 3. Since loops == 0 and the allocation target (ap->target) is
>     bigger than the largest known chunk of blocks in the rgrp
>     (rs->rs_rbm.rgd->rd_extfail_pt) it skips that rgrp and bypasses
>     the call to gfs2_rgrp_bh_get there as well.
> 4. update_rgrp_lvb sees the lvb MAGIC number is valid, so bypasses
>     gfs2_rgrp_bh_get, but it still sets sets GFS2_RDF_UPTODATE due
>     to this invalid assumption.
> 5. The next time update_rgrp_lvb is called, it sees the bit is set
>     and just returns 0, assuming both the lvb and rgrp are both
>     uptodate. But since this is a smaller allocation, or space has
>     been freed by another node, thus adjusting the lvb values,
>     it decides to use the rgrp for allocations, with invalid rd_free
>     due to the fact it was never updated.
>
> This patch changes update_rgrp_lvb so it doesn't set the UPTODATE
> flag anymore. That way, it has no choice but to fetch the latest
> values.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
>   fs/gfs2/rgrp.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/gfs2/rgrp.c b/fs/gfs2/rgrp.c
> index 44a4cbc043dd..fc181c81cca2 100644
> --- a/fs/gfs2/rgrp.c
> +++ b/fs/gfs2/rgrp.c
> @@ -1249,7 +1249,7 @@ static int update_rgrp_lvb(struct gfs2_rgrpd *rgd)
>   	rl_flags = be32_to_cpu(rgd->rd_rgl->rl_flags);
>   	rl_flags &= ~GFS2_RDF_MASK;
>   	rgd->rd_flags &= GFS2_RDF_MASK;
> -	rgd->rd_flags |= (rl_flags | GFS2_RDF_UPTODATE | GFS2_RDF_CHECK);
> +	rgd->rd_flags |= (rl_flags | GFS2_RDF_CHECK);
>   	if (rgd->rd_rgl->rl_unlinked == 0)
>   		rgd->rd_flags &= ~GFS2_RDF_CHECK;
>   	rgd->rd_free = be32_to_cpu(rgd->rd_rgl->rl_free);
>



^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2018-08-20  8:47 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <311347480.3490063.1534521010908.JavaMail.zimbra@redhat.com>
2018-08-17 15:51 ` [Cluster-devel] [GFS2 PATCH] gfs2: Don't set GFS2_RDF_UPTODATE when the lvb is updated Bob Peterson
2018-08-20  8:47   ` Steven Whitehouse

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).