From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Wed, 9 Jan 2019 17:18:50 +0000
Subject: [Cluster-devel] kernel BUG at fs/gfs2/inode.h:64
In-Reply-To: <2022831.JNBRD0B6hm@dhcp-3-135.uk.xensource.com>
References: <ac9147e8a8584927ae4f6ff6bbfecbe3@AMSPEX02CL02.citrite.net>
	<ece7069401bd4646878e830285e5b9c5@AMSPEX02CL02.citrite.net>
	<CAHc6FU6at5sjoc5Kyma1OkBC5YE8tDUQadxUg_6xT44HGaVYBg@mail.gmail.com>
	<2022831.JNBRD0B6hm@dhcp-3-135.uk.xensource.com>
Message-ID: <c5e61f17-eb05-9be7-72fd-c47ce182f84d@redhat.com>
List-Id: <cluster-devel.redhat.com>
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,

On 09/01/2019 17:14, Tim Smith wrote:
> On Wednesday, 9 January 2019 15:35:05 GMT Andreas Gruenbacher wrote:
>> On Wed, 9 Jan 2019 at 14:43, Mark Syms <Mark.Syms@citrix.com> wrote:
>>> We don't yet know how the assert got triggered as we've only seen it once
>>> and in the original form it looks like it would be very hard to trigger in
>>> any normal case (given that in default usage i_blocks should be at least 8
>>> times what any putative value for change could be). So, for the assert to
>>> have triggered we've been asked to remove at least 8 times the number of
>>> blocks currently allocated to the inode. Possible causes could be a double
>>> release or some other higher level bug that will require further
>>> investigation to uncover.
>> The following change has at least survived xfstests:
>>
>> --- a/fs/gfs2/inode.h
>> +++ b/fs/gfs2/inode.h
>> @@ -61,8 +61,8 @@ static inline u64 gfs2_get_inode_blocks(const struct
>> inode *inode)
>>
>>   static inline void gfs2_add_inode_blocks(struct inode *inode, s64 change)
>>   {
>> -    gfs2_assert(GFS2_SB(inode), (change >= 0 || inode->i_blocks >
>> -change));
>> -    change *= (GFS2_SB(inode)->sd_sb.sb_bsize/GFS2_BASIC_BLOCK);
>> +    change <<= inode->i_blkbits - 9;
>> +    gfs2_assert(GFS2_SB(inode), change >= 0 || inode->i_blocks >= -change);
>> inode->i_blocks += change;
>>   }
>>
>> Andreas
> I'll use
>
> change <<= (GFS2_SB(inode)->sd_sb.sb_bsize_shift - GFS2_BASIC_BLOCK_SHIFT);
>
> for consistency with the gfs2_get/set_inode_blocks(). I'll send the patch in a
> bit.
>
> Given what it was like before, either i_blocks was already 0 or -change
> somehow became stupidly huge. Anything else looks like it would be hard to
> reproduce without mkfs.gfs2 -b 512 (so that GFS2_SB(inode)->sd_sb.sb_bsize ==
> GFS2_BASIC_BLOCK) which we don't do.
>
> I'll try to work out what could have caused it and see if we can provoke it
> again.
>
> Out of curiosity I did a few tests where I created a file on GFS2, copied /
> dev/null on top of it, and then ran stat on the file. It seems like GFS2 never
> frees the last allocation on truncate; stat always reports 1, 2, 4 or 8 blocks
> in use for a zero-length file depending on the underlying filesystem block
> size, unlike (say) ext3/4 where it reports 0. I presume this is intentional so
> maybe some corner case where it *is* trying to do that is the root of the
> problem.
>
That is because gfs2 uses a block for each inode, so it makes sense to 
account for it in that way. For ext* the inodes are in separate areas of 
the disk, and they only use up a partial block, and they are also 
counted separately too. So it is a historical design difference I think,

Steve.