From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Baokun Li <libaokun1@huawei.com>, linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
ojaswin@linux.ibm.com, linux-kernel@vger.kernel.org,
yi.zhang@huawei.com, yangerkun@huawei.com, yukuai3@huawei.com,
libaokun1@huawei.com
Subject: Re: [PATCH 2/4] ext4: fix BUG in ext4_mb_new_inode_pa() due to overflow
Date: Thu, 20 Jul 2023 18:14:52 +0530 [thread overview]
Message-ID: <87edl2oipn.fsf@doe.com> (raw)
In-Reply-To: <20230718131052.283350-3-libaokun1@huawei.com>
Baokun Li <libaokun1@huawei.com> writes:
> When we calculate the end position of ext4_free_extent, this position may
> be exactly where ext4_lblk_t (i.e. uint) overflows. For example, if
> ac_g_ex.fe_logical is 4294965248 and ac_orig_goal_len is 2048, then the
> computed end is 0x100000000, which is 0. If ac->ac_o_ex.fe_logical is not
> the first case of adjusting the best extent, that is, new_bex_end > 0, the
> following BUG_ON will be triggered:
Nice spotting.
>
> =========================================================
> kernel BUG at fs/ext4/mballoc.c:5116!
> invalid opcode: 0000 [#1] PREEMPT SMP PTI
> CPU: 3 PID: 673 Comm: xfs_io Tainted: G E 6.5.0-rc1+ #279
> RIP: 0010:ext4_mb_new_inode_pa+0xc5/0x430
> Call Trace:
> <TASK>
> ext4_mb_use_best_found+0x203/0x2f0
> ext4_mb_try_best_found+0x163/0x240
> ext4_mb_regular_allocator+0x158/0x1550
> ext4_mb_new_blocks+0x86a/0xe10
> ext4_ext_map_blocks+0xb0c/0x13a0
> ext4_map_blocks+0x2cd/0x8f0
> ext4_iomap_begin+0x27b/0x400
> iomap_iter+0x222/0x3d0
> __iomap_dio_rw+0x243/0xcb0
> iomap_dio_rw+0x16/0x80
> =========================================================
>
> A simple reproducer demonstrating the problem:
>
> mkfs.ext4 -F /dev/sda -b 4096 100M
> mount /dev/sda /tmp/test
> fallocate -l1M /tmp/test/tmp
> fallocate -l10M /tmp/test/file
> fallocate -i -o 1M -l16777203M /tmp/test/file
> fsstress -d /tmp/test -l 0 -n 100000 -p 8 &
> sleep 10 && killall -9 fsstress
> rm -f /tmp/test/tmp
> xfs_io -c "open -ad /tmp/test/file" -c "pwrite -S 0xff 0 8192"
Could you please also add it into xfstests?
I think it's a nice test which can check the boundary conditions for
start and end of data types used in mballoc. I think it should even work
if you don't do fsstress but instead just fallocate some remaining space
in filesystem, so that when you open and try to write it to 0th offset,
if automatically hits this error in ext4_mb_new_inode_pa().
>
> We declare new_bex_start and new_bex_end as correct types and use fex_end()
> to avoid the problems caused by the ext4_lblk_t overflow above.
>
> Fixes: 93cdf49f6eca ("ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()")
> Signed-off-by: Baokun Li <libaokun1@huawei.com>
> ---
> fs/ext4/mballoc.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index eb7f5d35ef96..2090e5e7ba58 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -5076,8 +5076,8 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
> pa = ac->ac_pa;
>
> if (ac->ac_b_ex.fe_len < ac->ac_orig_goal_len) {
> - int new_bex_start;
> - int new_bex_end;
> + ext4_lblk_t new_bex_start;
> + loff_t new_bex_end;
>
> /* we can't allocate as much as normalizer wants.
> * so, found space must get proper lstart
> @@ -5096,8 +5096,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
> * still cover original start
> * 3. Else, keep the best ex at start of original request.
> */
> - new_bex_end = ac->ac_g_ex.fe_logical +
> - EXT4_C2B(sbi, ac->ac_orig_goal_len);
> + new_bex_end = fex_end(sbi, &ac->ac_g_ex, &ac->ac_orig_goal_len);
> new_bex_start = new_bex_end - EXT4_C2B(sbi, ac->ac_b_ex.fe_len);
> if (ac->ac_o_ex.fe_logical >= new_bex_start)
> goto adjust_bex;
> @@ -5117,8 +5116,8 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
>
> BUG_ON(ac->ac_o_ex.fe_logical < ac->ac_b_ex.fe_logical);
> BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len);
> - BUG_ON(new_bex_end > (ac->ac_g_ex.fe_logical +
> - EXT4_C2B(sbi, ac->ac_orig_goal_len)));
Ok so the right hand becomes 0 (because then end can go upto 1<<32 which
will be 0 for unsigned int). And the left (new_bex_end) might be
negative due to some operations above as I see it.
And comparing an int with unsigned int, it will promote new_bex_end to
unsigned int which will make it's value too large and will hit the
bug_on.
I would like to carefully review all such paths. I will soon review and
get back.
> + BUG_ON(new_bex_end >
> + fex_end(sbi, &ac->ac_g_ex, &ac->ac_orig_goal_len));
I am not sure whether using fex_end or pa_end is any helpful.
I think we can just typecast if needed and keep it simple rather
than adding helpers functions for addition operation.
(because of the fact that fex_end() can take a third parameter which
sometimes you pass as NULL. Hence it doesn't look clean, IMO)
> }
>
> pa->pa_lstart = ac->ac_b_ex.fe_logical;
> --
> 2.31.1
-ritesh
next prev parent reply other threads:[~2023-07-20 12:45 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-18 13:10 [PATCH 0/4] ext4: fix some ext4_lblk_t overflow issues Baokun Li
2023-07-18 13:10 ` [PATCH 1/4] ext4: add two helper functions fex_end() and pa_end() Baokun Li
2023-07-20 17:48 ` Ritesh Harjani
2023-07-18 13:10 ` [PATCH 2/4] ext4: fix BUG in ext4_mb_new_inode_pa() due to overflow Baokun Li
2023-07-20 12:44 ` Ritesh Harjani [this message]
2023-07-20 13:42 ` Baokun Li
2023-07-20 19:30 ` Ritesh Harjani
2023-07-21 7:29 ` Baokun Li
2023-07-21 8:24 ` Ritesh Harjani
2023-07-21 9:13 ` Baokun Li
2023-07-21 17:15 ` Theodore Ts'o
2023-07-22 3:16 ` Baokun Li
2023-07-20 19:07 ` Ritesh Harjani
2023-07-21 8:01 ` Baokun Li
2023-07-18 13:10 ` [PATCH 3/4] ext4: avoid overlapping preallocations " Baokun Li
2023-07-20 19:24 ` Ritesh Harjani
2023-07-18 13:10 ` [PATCH 4/4] ext4: avoid prealloc space being skipped " Baokun Li
2023-07-20 19:22 ` Ritesh Harjani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87edl2oipn.fsf@doe.com \
--to=ritesh.list@gmail.com \
--cc=adilger.kernel@dilger.ca \
--cc=jack@suse.cz \
--cc=libaokun1@huawei.com \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ojaswin@linux.ibm.com \
--cc=tytso@mit.edu \
--cc=yangerkun@huawei.com \
--cc=yi.zhang@huawei.com \
--cc=yukuai3@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox