public inbox for linux-ext4@vger.kernel.org
From: Baokun Li <libaokun1@huawei.com>
To: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>, <tytso@mit.edu>,
	<adilger.kernel@dilger.ca>, <ritesh.list@gmail.com>,
	<linux-kernel@vger.kernel.org>, <yi.zhang@huawei.com>,
	<yukuai3@huawei.com>
Subject: Re: [PATCH v2 2/3] ext4: fix corrupt backup group descriptors after online resize
Date: Thu, 17 Nov 2022 09:37:14 +0800
Message-ID: <205254f9-ee77-4e5f-d5b3-f315c377f41e@huawei.com>
In-Reply-To: <20221116152618.hznqamogp2gwpqtp@quack3>

On 2022/11/16 23:26, Jan Kara wrote:
> On Wed 16-11-22 21:14:16, Baokun Li wrote:
>> On 2022/11/16 19:49, Jan Kara wrote:
>>> On Wed 16-11-22 15:28:01, Baokun Li wrote:
>>>> In commit 9a8c5b0d0615 ("ext4: update the backup superblock's at the end
>>>> of the online resize"), it is assumed that update_backups() only updates
>>>> backup superblocks, so each b_data is treated as a backup superblock to
>>>> update its s_block_group_nr and s_checksum. However, update_backups()
>>>> also updates the backup group descriptors, which causes the backup group
>>>> descriptors to be corrupted.
>>>>
>>>> The above commit fixes the problem of invalid checksum of the backup
>>>> superblock. The root cause of this problem is that the checksum of
>>>> ext4_update_super() is not set correctly. This problem has been fixed
>>>> in the previous patch ("ext4: fix bad checksum after online resize").
>>>> Therefore, roll back some modifications in the above commit.
>>>>
>>>> Fixes: 9a8c5b0d0615 ("ext4: update the backup superblock's at the end of the online resize")
>>>> Signed-off-by: Baokun Li <libaokun1@huawei.com>
>>> So I agree commit 9a8c5b0d0615 is broken and does corrupt group
>>> descriptors. However I don't see how PATCH 1/3 in this series would fix all
>>> the problems commit 9a8c5b0d0615 is trying to fix. In particular checksums
>>> on backup superblocks will not be properly set by the resize code AFAICT.
>>>
>>> 								Honza
>> I didn't realize these two issues were the same until I researched the
>> problem in PATCH 3/3 and found that commit 9a8c5b0d0615 introduced a
>> similar problem. I then found that the backup superblock is copied directly
>> from the primary superblock, so if the backup superblock is faulty, the
>> primary superblock must be faulty too. That made me think of patch 1, which
>> fixes the primary superblock problem. By rolling back commit 9a8c5b0d0615
>> to verify, I found that patch 1 did fix the problem.
>>
>> Only ext4_flex_group_add() and ext4_group_extend_no_check() call
>> update_backups() to update the backup superblock. Both of these functions
>> correctly set the checksum of the primary superblock. The backup superblocks
>> that are copied from them are also correct.
>>
>> In ext4_flex_group_add(), we only update the backup superblock if there are
>> no previous errors, indicating that we must have updated the checksum in
>> ext4_update_super() before executing update_backups(). The previous problem
>> was that after we updated the checksum in ext4_update_super(), we modified
>> s_overhead_clusters, so the checksums for both the primary and backup
>> superblocks were incorrect. This problem has been fixed in PATCH 1/3, so
>> checksum is set correctly in ext4_flex_group_add().
>>
>> The same is true in ext4_group_extend_no_check(), we only update the backup
>> superblock if there are no errors, and we execute ext4_superblock_csum_set()
>> to update the checksum before updating the backup superblock. Therefore,
>> checksum is correctly set in ext4_group_extend_no_check().
>>
>> I think we only need to ensure that the checksum is set correctly when the
>> buffer lock of sbi->s_sbh is unlocked. Therefore, the checksum should be
>> correct before update_backups() holds the buffer lock. Also, in
>> update_backups() we copy the entire superblock completely, and the checksum
>> is unchanged, so we don't need to reset it.
> So I agree the checksum should be matching but the backup superblock should
> have also s_block_group_nr set properly and after updating that we need to
> recalculate the checksum as well.
>
> 								Honza

Totally agree!

I will try to fix this in a better way in V3.
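
To make the point above concrete, here is a rough userspace sketch of the
idea. It is not the V3 patch and not kernel code: struct toy_sb,
toy_csum(), toy_update_backup() and the holds_super flag are all
hypothetical names, and the checksum is a simple FNV-1a placeholder
rather than the crc32c that ext4 uses. It only shows that a backup
superblock needs its own group number and a recomputed checksum, while
a backup group descriptor buffer must be left exactly as copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for a superblock; NOT the real struct ext4_super_block. */
struct toy_sb {
	uint32_t blocks_count;
	uint32_t block_group_nr;	/* group holding this copy */
	uint32_t checksum;		/* covers every field before it */
};

/* Placeholder checksum (FNV-1a); ext4 itself uses crc32c. */
static uint32_t toy_csum(const struct toy_sb *sb)
{
	const unsigned char *p = (const unsigned char *)sb;
	uint32_t h = 2166136261u;
	size_t i;

	for (i = 0; i < offsetof(struct toy_sb, checksum); i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

static bool toy_sb_csum_ok(const struct toy_sb *sb)
{
	return sb->checksum == toy_csum(sb);
}

/*
 * Copy primary metadata into a backup buffer. Only when the buffer
 * really holds a superblock (hypothetical holds_super flag) do we stamp
 * the group number and recompute the checksum; group descriptor bytes
 * are left exactly as copied.
 */
static void toy_update_backup(void *backup, const void *primary, size_t len,
			      bool holds_super, uint32_t group)
{
	struct toy_sb *sb;

	assert(backup != NULL && primary != NULL);
	memcpy(backup, primary, len);
	if (!holds_super)
		return;

	assert(len >= sizeof(*sb));
	sb = backup;
	sb->block_group_nr = group;
	sb->checksum = toy_csum(sb);
}
```

With this shape the copied checksum is only reused for buffers that are
not superblocks; a superblock copy always gets a checksum that matches
its own s_block_group_nr equivalent.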

>>>> ---
>>>>    fs/ext4/resize.c | 5 -----
>>>>    1 file changed, 5 deletions(-)
>>>>
>>>>
Thank you for your review!
-- 
With Best Regards,
Baokun Li

Thread overview: 9+ messages
2022-11-16  7:27 [PATCH v2 0/3] ext4: fix some bugs in online resize Baokun Li
2022-11-16  7:28 ` [PATCH v2 1/3] ext4: fix bad checksum after " Baokun Li
2022-11-16  7:28 ` [PATCH v2 2/3] ext4: fix corrupt backup group descriptors " Baokun Li
2022-11-16 11:49   ` Jan Kara
2022-11-16 13:14     ` Baokun Li
2022-11-16 15:26       ` Jan Kara
2022-11-17  1:37         ` Baokun Li [this message]
2022-11-16  7:28 ` [PATCH v2 3/3] ext4: fix corruption when online resizing a 1K bigalloc fs Baokun Li
2022-11-16 11:56   ` Jan Kara
