From: "Theodore Ts'o" <tytso@mit.edu>
To: zhanchengbin <zhanchengbin1@huawei.com>
Cc: Ye Bin <yebin@huaweicloud.com>,
adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
linux-kernel@vger.kernel.org, jack@suse.cz,
Ye Bin <yebin10@huawei.com>
Subject: Re: [PATCH RFC] ext4:record error information when insert extent failed in 'ext4_split_extent_at'
Date: Mon, 5 Dec 2022 12:48:50 -0500 [thread overview]
Message-ID: <Y44vAjbPtntOHy+M@mit.edu> (raw)
In-Reply-To: <2ab8e268-4e1a-8e97-4798-48fcdb651cdf@huawei.com>
On Fri, Oct 28, 2022 at 09:41:55AM +0800, zhanchengbin wrote:
> There have been a lot of problems here before, but the problem has not been
> fundamentally solved.
Indeed, this only papers only the problem. The commit description is
pretty good, in that it describes the the root cause of the problem:
> > If 'ext4_ext_insert_extent' return '-ENOMEM' which will not fix 'ex->ee_len' by
> > old length. 'ext4_ext_insert_extent' will trigger extent tree merge, fix like
> > 'ex->ee_len = orig_ex.ee_len' may lead to new issues.
> > To solve above issue, record error messages when 'ext4_ext_insert_extent' return
> > 'err' not equal '(-ENOSPC && -EDQUOT)'. If filesysten is mounted with 'errors=continue'
> > as filesystem is not clean 'fsck' will repair issue. If filesystem is mounted with
> > 'errors=remount-ro' filesystem will be remounted by read-only.
What should actually happen here is that we undo the change in orig_ex
if ext4_ext_insert_extent fails. As you point out, in an earlier part
of the code path, this gets handled by a "goto fix_extent_len".
The problem is that it's possible that the shape of the extent tree
may have changed before ext4_insert_extent() fails with the ENOMEM.
So the simplistic fix of just jumping to fix_extent_len isn't going to
work. But fixing it by much marking the file system is corrupt is a
bit of a cop-out. Consider that if you were running ext4 in a Huawei
Cloud, and you run into the memory allocation failure, what would you
prefer? For the individual system call to fail, and propagating the
failure to userspace? Or to leave the file system corrupted? (And
then either forcing a reboot so the file system to be fixed, or allow
the system to stumble along, with further unknown unexpected behaviour
from userspace?)
The real correct fix is that ext4_ext_insert_extent() needs to
rollback any changes to the extent tree that was made, *or* it needs
to make sure that the operation will succeed before starting to make
any changes, *or* we need to look up the orig_extent via
orig_ex->ee_block, and then undo the change.
It might be that we can't always reliably rollback the change, or we
might think that the operation will succeed, but then it fail due to
an I/O error. If it's due to an I/O error, then it's fine to bail and
mark the file system as corrupted. But if the failure is caused by an
ENOMEM, we should be able to handle this case more gracefully.
Can you look into a better fix?
Thanks,
- Ted
prev parent reply other threads:[~2022-12-05 17:51 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-24 12:27 [PATCH RFC] ext4:record error information when insert extent failed in 'ext4_split_extent_at' Ye Bin
2022-10-28 1:41 ` zhanchengbin
2022-12-05 17:48 ` Theodore Ts'o [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y44vAjbPtntOHy+M@mit.edu \
--to=tytso@mit.edu \
--cc=adilger.kernel@dilger.ca \
--cc=jack@suse.cz \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=yebin10@huawei.com \
--cc=yebin@huaweicloud.com \
--cc=zhanchengbin1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).