linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: yebin <yebin10@huawei.com>
To: Jan Kara <jack@suse.cz>, Ritesh Harjani <ritesh.list@gmail.com>
Cc: <tytso@mit.edu>, <adilger.kernel@dilger.ca>,
	<linux-ext4@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -next] ext4: fix bug_on in ext4_iomap_begin as race between bmap and write
Date: Thu, 16 Jun 2022 14:21:57 +0800	[thread overview]
Message-ID: <62AACC05.3030207@huawei.com> (raw)
In-Reply-To: <20220615172609.oydxhjw7exas23hd@quack3.lan>



On 2022/6/16 1:26, Jan Kara wrote:
> On Wed 15-06-22 21:01:23, Ritesh Harjani wrote:
>> On 22/06/15 08:51PM, Ritesh Harjani wrote:
>>> On 22/06/15 09:58PM, Ye Bin wrote:
>>>> We got issue as follows:
>>>> ------------[ cut here ]------------
>>>> WARNING: CPU: 3 PID: 9310 at fs/ext4/inode.c:3441 ext4_iomap_begin+0x182/0x5d0
>>>> RIP: 0010:ext4_iomap_begin+0x182/0x5d0
>>>> RSP: 0018:ffff88812460fa08 EFLAGS: 00010293
>>>> RAX: ffff88811f168000 RBX: 0000000000000000 RCX: ffffffff97793c12
>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
>>>> RBP: ffff88812c669160 R08: ffff88811f168000 R09: ffffed10258cd20f
>>>> R10: ffff88812c669077 R11: ffffed10258cd20e R12: 0000000000000001
>>>> R13: 00000000000000a4 R14: 000000000000000c R15: ffff88812c6691ee
>>>> FS:  00007fd0d6ff3740(0000) GS:ffff8883af180000(0000) knlGS:0000000000000000
>>>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 00007fd0d6dda290 CR3: 0000000104a62000 CR4: 00000000000006e0
>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>> Call Trace:
>>>>   iomap_apply+0x119/0x570
>>>>   iomap_bmap+0x124/0x150
>>>>   ext4_bmap+0x14f/0x250
>>>>   bmap+0x55/0x80
>>>>   do_vfs_ioctl+0x952/0xbd0
>>>>   __x64_sys_ioctl+0xc6/0x170
>>>>   do_syscall_64+0x33/0x40
>>>>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>>
>>>> Above issue may happen as follows:
>>>>            bmap                    write
>>>> bmap
>>>>    ext4_bmap
>>>>      iomap_bmap
>>>>        ext4_iomap_begin
>>>>                              ext4_file_write_iter
>>>> 			      ext4_buffered_write_iter
>>>> 			        generic_perform_write
>>>> 				  ext4_da_write_begin
>>>> 				    ext4_da_write_inline_data_begin
>>>> 				      ext4_prepare_inline_data
>>>> 				        ext4_create_inline_data
>>>> 					  ext4_set_inode_flag(inode,
>>>> 						EXT4_INODE_INLINE_DATA);
>>>>        if (WARN_ON_ONCE(ext4_has_inline_data(inode))) ->trigger bug_on
>>>>
>>>> To solved above issue hold inode lock in ext4_bamp.
>>> 											^^^ ext4_bmap()
>>>
>>> I checked the paths where bmap() kernel api can be called i.e. from jbd2/fc and
>>> generic_swapfile_activate() (apart from ioctl())
>>> For jbd2, it will be called with j_inode within bmap(), hence taking a inode lock
>>> of the inode passed within ext4_bmap() (j_inode in this case) should be safe here.
>>> Same goes with swapfile path as well.
>>>
>>> However I feel maybe we should hold inode_lock_shared() since there is no
>>> block/extent map layout changes that can happen via ext4_bmap().
>>> Hence read lock is what IMO should be used here.
>> On second thoughts, shoudn't we use ext4_iomap_report_ops here?  Can't
>> recollect why we didn't use ext4_iomap_report_ops for iomap_bmap() in the
>> first place. Should be good to verify it once.
> Hum, but I guess there's a deeper problem than ext4_bmap(). Generally we
> have places doing block mapping (such as ext4_writepages(), readahead, or
> page fault) where we don't hold i_rwsem and racing
> ext4_create_inline_data() could confuse them? I guess we need to come up
> with a sound scheme how inline data creation is serialized with these
> operations (or just decide to remove the inline data feature altogether as
> we already discussed once because the complexity likely is not worth the
> gain).
>
> 								Honza
Indeed, this feature has various concurrency problems. At present, there 
is no scenario
using this feature in the actual production environment. However, 
various problems in
the code are easy to be exploited by attackers if they are not solved.
Do I fix this single point problem or remove inline data feature?


  reply	other threads:[~2022-06-16  6:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-15 13:58 [PATCH -next] ext4: fix bug_on in ext4_iomap_begin as race between bmap and write Ye Bin
2022-06-15 15:21 ` Ritesh Harjani
2022-06-15 15:31   ` Ritesh Harjani
2022-06-15 17:26     ` Jan Kara
2022-06-16  6:21       ` yebin [this message]
2022-06-16  6:31       ` Ritesh Harjani
2022-06-16 10:54         ` Jan Kara
2022-06-16  1:44   ` yebin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=62AACC05.3030207@huawei.com \
    --to=yebin10@huawei.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).