linux-ext4.vger.kernel.org archive mirror
From: yebin <yebin10@huawei.com>
To: Jan Kara <jack@suse.cz>
Cc: <tytso@mit.edu>, <adilger.kernel@dilger.ca>,
	<linux-ext4@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<lczerner@redhat.com>
Subject: Re: [PATCH -next] ext4: fix warning in ext4_handle_inode_extension
Date: Wed, 30 Mar 2022 20:08:13 +0800
Message-ID: <6244482D.4090603@huawei.com>
In-Reply-To: <20220329092810.j5ngxckygut6mxo2@quack3.lan>



On 2022/3/29 17:28, Jan Kara wrote:
> On Sat 26-03-22 14:53:51, Ye Bin wrote:
>> We got the following issue:
>> EXT4-fs error (device loop0) in ext4_reserve_inode_write:5741: Out of memory
>> EXT4-fs error (device loop0): ext4_setattr:5462: inode #13: comm syz-executor.0: mark_inode_dirty error
>> EXT4-fs error (device loop0) in ext4_setattr:5519: Out of memory
>> EXT4-fs error (device loop0): ext4_ind_map_blocks:595: inode #13: comm syz-executor.0: Can't allocate blocks for non-extent mapped inodes with bigalloc
>> ------------[ cut here ]------------
>> WARNING: CPU: 1 PID: 4361 at fs/ext4/file.c:301 ext4_file_write_iter+0x11c9/0x1220
>> Modules linked in:
>> CPU: 1 PID: 4361 Comm: syz-executor.0 Not tainted 5.10.0+ #1
>> RIP: 0010:ext4_file_write_iter+0x11c9/0x1220
>> RSP: 0018:ffff924d80b27c00 EFLAGS: 00010282
>> RAX: ffffffff815a3379 RBX: 0000000000000000 RCX: 000000003b000000
>> RDX: ffff924d81601000 RSI: 00000000000009cc RDI: 00000000000009cd
>> RBP: 000000000000000d R08: ffffffffbc5a2c6b R09: 0000902e0e52a96f
>> R10: ffff902e2b7c1b40 R11: ffff902e2b7c1b40 R12: 000000000000000a
>> R13: 0000000000000001 R14: ffff902e0e52aa10 R15: ffffffffffffff8b
>> FS:  00007f81a7f65700(0000) GS:ffff902e3bc80000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: ffffffffff600400 CR3: 000000012db88001 CR4: 00000000003706e0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>   do_iter_readv_writev+0x2e5/0x360
>>   do_iter_write+0x112/0x4c0
>>   do_pwritev+0x1e5/0x390
>>   __x64_sys_pwritev2+0x7e/0xa0
>>   do_syscall_64+0x37/0x50
>>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> The above issue may happen as follows:
>> Assume
>> inode.i_size=4096
>> EXT4_I(inode)->i_disksize=4096
>>
>> step 1: set inode->i_size = 8192
>> ext4_setattr
>>    if (attr->ia_size != inode->i_size)
>>      EXT4_I(inode)->i_disksize = attr->ia_size;
>>      rc = ext4_mark_inode_dirty
>>         ext4_reserve_inode_write
>>            ext4_get_inode_loc
>>              __ext4_get_inode_loc
>>                sb_getblk --> return -ENOMEM
>>     ...
>>     if (!error)  ->will not update i_size
>>       i_size_write(inode, attr->ia_size);
>> Now:
>> inode.i_size=4096
>> EXT4_I(inode)->i_disksize=8192
>>
>> step 2: Direct write 4096 bytes
>> ext4_file_write_iter
>>   ext4_dio_write_iter
>>     iomap_dio_rw ->return error
>>   if (extend)
>>     ext4_handle_inode_extension
>>       WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize);
>> -> This triggers the warning.
>>
>> To solve the above issue, if marking the inode dirty fails in ext4_setattr,
>> just restore 'EXT4_I(inode)->i_disksize' to its old value.
>>
>> Signed-off-by: Ye Bin <yebin10@huawei.com>
> Thanks for the fix! So I think this deserves a further debate. I have two
> points here:
>
> 1) If ext4_mark_inode_dirty() fails (or basically any metadata writeback)
> we must abort the journal because metadata is not guaranteed to be
> consistent anymore. In this particular callsite of ext4_mark_inode_dirty()
> you were able to undo the changes but there are many more where it is not
> sanely possible AFAICT. Hence I think that ext4_reserve_inode_write() needs
> to call ext4_journal_abort_handle() (as already happens inside
> __ext4_journal_get_write_access()) and not just ext4_std_error().
>
> 2) The assertion in ext4_handle_inode_extension() should be conditioned on
> !is_journal_aborted() to avoid useless warnings for filesystems we know are
> inconsistent anyway.
>
> Thoughts?
>
> 								Honza
Do you mean calling jbd2_abort in ext4_reserve_inode_write()?
If we abort the journal whenever metadata is not guaranteed to be
consistent, then the 'errors=continue' mode becomes unnecessary.
>> ---
>>   fs/ext4/inode.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index 90fd6f7b6209..8adf1f802f6c 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -5384,6 +5384,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
>>   	if (attr->ia_valid & ATTR_SIZE) {
>>   		handle_t *handle;
>>   		loff_t oldsize = inode->i_size;
>> +		loff_t old_disksize;
>>   		int shrink = (attr->ia_size < inode->i_size);
>>   
>>   		if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) {
>> @@ -5455,6 +5456,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
>>   					inode->i_sb->s_blocksize_bits);
>>   
>>   			down_write(&EXT4_I(inode)->i_data_sem);
>> +			old_disksize = EXT4_I(inode)->i_disksize;
>>   			EXT4_I(inode)->i_disksize = attr->ia_size;
>>   			rc = ext4_mark_inode_dirty(handle, inode);
>>   			if (!error)
>> @@ -5466,6 +5468,8 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
>>   			 */
>>   			if (!error)
>>   				i_size_write(inode, attr->ia_size);
>> +			else
>> +				EXT4_I(inode)->i_disksize = old_disksize;
>>   			up_write(&EXT4_I(inode)->i_data_sem);
>>   			ext4_journal_stop(handle);
>>   			if (error)
>> -- 
>> 2.31.1
>>

