public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
From: Baokun Li <libaokun1@huawei.com>
To: Jan Kara <jack@suse.cz>
Cc: <linux-ext4@vger.kernel.org>, <tytso@mit.edu>,
	<adilger.kernel@dilger.ca>, <ritesh.list@gmail.com>,
	<lczerner@redhat.com>, <enwlinux@gmail.com>,
	<linux-kernel@vger.kernel.org>, <yi.zhang@huawei.com>,
	<yebin10@huawei.com>, <yukuai3@huawei.com>,
	Baokun Li <libaokun1@huawei.com>
Subject: Re: [PATCH 2/2] ext4: add inode table check in __ext4_get_inode_loc to aovid possible infinite loop
Date: Thu, 18 Aug 2022 09:54:45 +0800	[thread overview]
Message-ID: <803e4ab0-e336-5434-2827-a5091eefad59@huawei.com> (raw)
In-Reply-To: <20220817143138.7krkxzoa3skruiyx@quack3>

On 8/17/2022 10:31 PM, Jan Kara wrote:
> On Wed 17-08-22 21:27:01, Baokun Li wrote:
>> In do_writepages, if the value returned by ext4_writepages is "-ENOMEM"
>> and "wbc->sync_mode == WB_SYNC_ALL", retry until the condition is not met.
>>
>> In __ext4_get_inode_loc, if the bh returned by sb_getblk is NULL,
>> the function returns -ENOMEM.
>>
>> In __getblk_slow, if the return value of grow_buffers is less than 0,
>> the function returns NULL.
>>
>> When the three processes are connected in series like the following stack,
>> an infinite loop may occur:
>>
>> do_writepages					<--- keep retrying
>>   ext4_writepages
>>    mpage_map_and_submit_extent
>>     mpage_map_one_extent
>>      ext4_map_blocks
>>       ext4_ext_map_blocks
>>        ext4_ext_handle_unwritten_extents
>>         ext4_ext_convert_to_initialized
>>          ext4_split_extent
>>           ext4_split_extent_at
>>            __ext4_ext_dirty
>>             __ext4_mark_inode_dirty
>>              ext4_reserve_inode_write
>>               ext4_get_inode_loc
>>                __ext4_get_inode_loc		<--- return -ENOMEM
>>                 sb_getblk
>>                  __getblk_gfp
>>                   __getblk_slow			<--- return NULL
>>                    grow_buffers
>>                     grow_dev_page		<--- return -ENXIO
>>                      ret = (block < end_block) ? 1 : -ENXIO;
>>
>> In this issue, bg_inode_table_hi is overwritten as an incorrect value.
>> As a result, `block < end_block` cannot be met in grow_dev_page.
>> Therefore, __ext4_get_inode_loc always returns '-ENOMEM' and do_writepages
>> keeps retrying. As a result, the writeback process is in the D state due
>> to an infinite loop.
>>
>> Add a check on inode table block in the __ext4_get_inode_loc function by
>> referring to ext4_read_inode_bitmap to avoid this infinite loop.
>>
>> Signed-off-by: Baokun Li <libaokun1@huawei.com>
> Thanks for the fixes. Normally, we check that inode table is fine in
> ext4_check_descriptors() (and those checks are much stricter) so it seems
Yes, when I first found this problem, I thought that the inode table was 
an abnormal value
when the disk was mounted. However, in ext4_ check_ Descriptors see the 
inspection of inode table.
ext4_ check_ Descriptors is more strict, and it also checks blocks_ 
Bitmap and inode_ bitmap。

However, in addition to mounting, I can't find any changes to 
bg_inode_table_hi  in other parts of
the kernel, which makes me very confused. It wasn't until I tracked the 
address of the variable that
I found it was modified by memcpy.

This check is added here to facilitate us to find problems and avoid 
dead loops. Otherwise,
we may not find problems until the write back process is stuck in the D 
state for several hours.
  At this time, the file system may be in a mess.
> unnecessary to check it again here. I understand that in your case it was
> resize that corrupted the group descriptor after the filesystem was mounted
> which is nasty but there's much more metadata that can be corrupted like
> this and it's infeasible to check each metadata block before we use it.
Indeed, But is it true that checking for inode_bitmap in 
ext4_read_inode_bitmap and
checking for block_bitmap in ext4_read_block_bitmap_nowait also 
unnecessary?
After all, inode_bitmap and block_bitmap may be faulty only when the 
metadata  is corrupted.
I think it seems unreasonable that both block_bitmap and inode_bitmap 
have checks and inode_table does not.
>
> IMHO a proper fix to this class of issues would be for sb_getblk() to
> return proper error so that we can distinguish ENOMEM from other errors.

Totally agree! There is a FIXME in the comments of __getblk_gfp:

` ` `
__getblk_gfp() will lock up the machine if grow_dev_page's
try_to_free_buffers() attempt is failing. FIXME, perhaps?
` ` `

This is the same as this issue because the return value of the 
grow_buffers function is not propagated correctly.

> But that will be a larger undertaking...
>
> 								Honza
Yes, this function can be called in many places, and external interface 
changes are involved.
>> ---
>>   fs/ext4/inode.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
Thanks!
-- 
With Best Regards,
Baokun Li

  reply	other threads:[~2022-08-18  1:54 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-17 13:26 [PATCH 0/2] ext4: fix a infinite loop in do_writepages after online resizing Baokun Li
2022-08-17 13:27 ` [PATCH 1/2] ext4: fix GDT corruption after online resizing with bigalloc enable and blocksize is 1024 Baokun Li
2022-08-17 13:27 ` [PATCH 2/2] ext4: add inode table check in __ext4_get_inode_loc to aovid possible infinite loop Baokun Li
2022-08-17 14:31   ` Jan Kara
2022-08-18  1:54     ` Baokun Li [this message]
2022-08-18 14:43     ` Ritesh Harjani (IBM)
2022-08-18 17:23       ` Jan Kara
2022-08-18 23:15         ` Ritesh Harjani (IBM)
2022-08-19  8:44           ` Jan Kara
2022-11-28 20:44       ` Theodore Ts'o
2022-11-29  8:54         ` Ritesh Harjani (IBM)
2022-11-29 21:12 ` [PATCH 0/2] ext4: fix a infinite loop in do_writepages after online resizing Theodore Ts'o
2022-11-30  2:08   ` Baokun Li
2022-12-01  3:42     ` Theodore Ts'o
2022-12-01  6:26       ` Baokun Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=803e4ab0-e336-5434-2827-a5091eefad59@huawei.com \
    --to=libaokun1@huawei.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=enwlinux@gmail.com \
    --cc=jack@suse.cz \
    --cc=lczerner@redhat.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    --cc=yebin10@huawei.com \
    --cc=yi.zhang@huawei.com \
    --cc=yukuai3@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox