Re: [PATCH] ext4: reject 1k block fs on the first block of disk

From: Tudor Ambarus <tudor.ambarus@linaro.org>
To: Theodore Ts'o <tytso@mit.edu>, Jun Nie <jun.nie@linaro.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org, Lee Jones <joneslee@google.com>
Subject: Re: [PATCH] ext4: reject 1k block fs on the first block of disk
Date: Wed, 15 Feb 2023 16:26:39 +0000
Message-ID: <dbd7cd6f-5d1d-7fcf-bd19-d22fef4db746@linaro.org>
In-Reply-To: <4e5fb36f-d234-1f94-5e6c-746aef612bb6@linaro.org>



On 2/15/23 11:53, Tudor Ambarus wrote:
> 
> 
> On 2/15/23 11:46, Tudor Ambarus wrote:
>> Hi, Ted!
>>
>> On 2/15/23 04:32, Theodore Ts'o wrote:
>>> On Wed, Jan 04, 2023 at 09:58:03AM +0800, Jun Nie wrote:
>>>> On Wed, Jan 4, 2023 at 03:17, Darrick J. Wong <djwong@kernel.org> wrote:
>>>>>
>>>>> On Thu, Dec 29, 2022 at 09:45:02AM +0800, Jun Nie wrote:
>>>>>> For 1k-block filesystems, the filesystem starts at block 1, not block 0.
>>>>>> If start_fsb is 0, it will be bumped up to s_first_data_block. Then
>>>>>> ext4_get_group_no_and_offset() doesn't know what to do and returns garbage
>>>>>> results (blockgroup 2^32-1). The underflow makes the index
>>>>>> exceed es->s_groups_count in ext4_get_group_info() and triggers the BUG_ON.
>>>>>>
>>>>>> Fixes: 4a4956249dac0 ("ext4: fix off-by-one fsmap error on 1k block filesystems")
>>>>>> Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
>>>>>> Reported-by: syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com
>>>>>> Signed-off-by: Jun Nie <jun.nie@linaro.org>
>>>>>> ---
>>>>>>   fs/ext4/fsmap.c | 6 ++++++
>>>>>>   1 file changed, 6 insertions(+)
>>>>>>
>>>>>> diff --git a/fs/ext4/fsmap.c b/fs/ext4/fsmap.c
>>>>>> index 4493ef0c715e..1aef127b0634 100644
>>>>>> --- a/fs/ext4/fsmap.c
>>>>>> +++ b/fs/ext4/fsmap.c
>>>>>> @@ -702,6 +702,12 @@ int ext4_getfsmap(struct super_block *sb, struct ext4_fsmap_head *head,
>>>>>>                if (handlers[i].gfd_dev > head->fmh_keys[0].fmr_device)
>>>>>>                        memset(&dkeys[0], 0, sizeof(struct ext4_fsmap));
>>>>>>
>>>>>> +             /*
>>>>>> +              * Re-check the range after above limit operation and reject
>>>>>> +              * 1K fs on block 0 as fs should start block 1. */
>>>>>> +             if (dkeys[0].fmr_physical ==0 && dkeys[1].fmr_physical == 0)
>>>>>> +                     continue;
>>>>>
>>>>> ...and if this filesystem has 4k blocks, and therefore *does* define a
>>>>> block 0?
>>>>
>>>> Yes, this is a real corner case test :-)
>>>
>>> So I'm really nervous about this change.  I don't understand the code;
>>> and I don't understand how the reproducer works.  I can certainly
>>> reproduce it using the reproducer found here[1], but it seems to
>>> require running multiple processes all creating loop devices and then
>>> running FS_IOC_GETFSMAP.
>>>
>>> [1] https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
>>>
>>> If I change the reproducer to just run execute_one() once, it
>>> doesn't trigger the bug.  It seems to only trigger when you have
>>> multiple processes all racing to create a loop device, mount the file
>>> system, try running FS_IOC_GETFSMAP --- and then delete the loop device
>>> without actually unmounting the file system.  Which is **weird**.
>>>
>>> I've tried taking the image, and just running "xfs_io -c fsmap /mnt",
>>> and that doesn't trigger it either.
>>>
>>> And I don't understand the reply to Darrick's question about why it's
>>> safe to add the check since for 4k block file systems, block 0 *is*
>>> valid.
>>>
>>> So if someone can explain to me what is going on here with this code
>>> (there are too many abstractions and what's going on with keys is just
>>> making my head hurt), *and* what the change actually does, and how to
>>> reproduce the problem with a ***simple*** reproducer (the syzbot
>>> mess doesn't count), that would be great.  But applying a change that I
>>> don't understand to code I don't understand, to fix a reproducer which
>>> I also don't understand, just doesn't make me feel comfortable.
>>>
>>
>> Let me share what I have understood so far. The low key is zeroed. The
>> high key is defined and has an fmr_physical of zero, which is
>> smaller than the first data block of the 1k-block ext4 fs (which starts
>> at offset 1024).
>>
>> -> ext4_getfsmap_datadev()
>>    keys[0].fmr_physical = 0, keys[1].fmr_physical = 0
>>    bofs = le32_to_cpu(sbi->s_es->s_first_data_block) = 1, eofs = 256
>>    start_fsb = keys[0].fmr_physical = 1, end_fsb = keys[1].fmr_physical = 0
>>    -> ext4_get_group_no_and_offset()
>>      blocknr = 1, le32_to_cpu(es->s_first_data_block) = 1
>>    start_ag = 0, first_cluster = 0
>>    -> ext4_get_group_no_and_offset()
>>      blocknr = 0, le32_to_cpu(es->s_first_data_block) = 1
>>    end_ag = 4294967295, last_cluster = 8191
> 
> because of poor key validation we get a wrong end_ag which eventually
> causes the BUG_ON.
> 
>>
>>    Then there's a loop that runs while info->gfi_agno <= end_ag; that
>> will trigger the BUG_ON in ext4_get_group_info() as the group number
>> exceeds EXT4_SB(sb)->s_groups_count.
>>    -> ext4_mballoc_query_range()
>>      -> ext4_mb_load_buddy()
>>        -> ext4_mb_load_buddy_gfp()
>>          -> ext4_get_group_info()
>>
>> It's an out-of-bounds request, and Darrick suggested not returning any
>> mapping for the byte range 0-1023 on the 1k-block filesystem. The
>> alternative would be to return -EINVAL when the high key has an
>> fmr_physical of zero on the 1k-block fs.
>>
>> To reproduce this, one would have to create a 1k-block ext4 fs
>> and pass a high key with an fmr_physical of zero, thus I would
>> expect to reproduce it with something like this:
>> xfs_io -c 'fsmap -d 0 0' /mnt/scratch
>>
>> However, when doing this I notice that in
>> xfsprogs-dev/io/fsmap.c l->fmr_device and h->fmr_device have the value
>> zero, FS_IOC_GETFSMAP is called, and then we receive no entries
>> (head->fmh_entries = 0). Now I'm trying to see what I'm doing wrong, and
>> how to reproduce the bug.
>>


What I think happens with the reproducer that I proposed is that when
both {l, h}->fmr_device have the value zero, the code exits early, before
getting the fsmap:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/ext4/fsmap.c?h=v6.2-rc8#n691
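
A simplified model of that per-device range check, as far as the linked
loop can be read: a handler is skipped when the low key's device is above
it, and the loop stops when the high key's device is below it. The names
classify_device() and enum dev_verdict are hypothetical, and 0x700 is only
an example device number, not taken from the reproducer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical verdicts for one handler in the device loop. */
enum dev_verdict { DEV_SKIP, DEV_STOP, DEV_QUERY };

/*
 * Sketch of the range filter: the handler's device must lie within
 * [low_dev, high_dev] for its fsmap callback to be invoked.
 */
static enum dev_verdict classify_device(uint32_t low_dev, uint32_t high_dev,
                                        uint32_t handler_dev)
{
    if (low_dev > handler_dev)
        return DEV_SKIP;   /* corresponds to "continue" in the loop */
    if (high_dev < handler_dev)
        return DEV_STOP;   /* corresponds to "break" in the loop */
    return DEV_QUERY;
}

/* Usage example with an assumed, nonzero device number. */
static void classify_device_example(void)
{
    /* Both key devices zero: the loop stops before any query. */
    assert(classify_device(0, 0, 0x700) == DEV_STOP);
    /* High key device at the maximum: the handler is queried. */
    assert(classify_device(0, UINT32_MAX, 0x700) == DEV_QUERY);
}
```

Under this model, a high key with fmr_device = 0 can never reach the data
device handler, which would explain head->fmh_entries = 0.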

Also, to my untrained fs eye it seems that xfs_io's [-d|-l|-r] fsmap
options are intended only for XFS, as the {data, log, realtime} sections
are XFS specific. I wonder why "struct fs_path" from libfrog/paths.h is
not named "struct xfs_path"; that would be less confusing.

It looks like xfs_io has no support for querying a start and end
offset when asking for an fsmap on an ext4 fs. I'm checking how I can
extend the xfs_io fsmap ext4 support to validate my assumptions.
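
Independently of xfs_io, the keys could also be set by calling the ioctl
directly. A rough sketch, assuming <linux/fsmap.h> is available; the helper
names alloc_fsmap_query() and run_fsmap_query() are hypothetical, and the
device number has to be supplied by the caller so that it matches the
mounted device. Whether this reaches the BUG_ON is exactly the assumption
that still needs validating, so it should only be tried on a scratch
machine.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fsmap.h>

/*
 * Hypothetical helper: allocate a zeroed fsmap_head with room for nr_recs
 * records and fill in the low and high keys. fmr_physical is in bytes.
 */
static struct fsmap_head *alloc_fsmap_query(uint32_t dev, uint64_t low_phys,
                                            uint64_t high_phys,
                                            unsigned int nr_recs)
{
    struct fsmap_head *head = calloc(1, fsmap_sizeof(nr_recs));

    if (!head)
        return NULL;
    head->fmh_count = nr_recs;              /* capacity of fmh_recs[] */
    head->fmh_keys[0].fmr_device = dev;     /* low key */
    head->fmh_keys[0].fmr_physical = low_phys;
    head->fmh_keys[1].fmr_device = dev;     /* high key */
    head->fmh_keys[1].fmr_physical = high_phys;
    return head;
}

/*
 * Hypothetical helper: issue the query on an open fd that refers to the
 * mounted filesystem. Returns the ioctl() result; on success the number
 * of returned records is in head->fmh_entries.
 */
static int run_fsmap_query(int fd, struct fsmap_head *head)
{
    assert(head != NULL);
    return ioctl(fd, FS_IOC_GETFSMAP, head);
}
```

With low_phys = 0 and high_phys = 0 on a 1k-block fs, this would express
the "high key with an fmr_physical of zero" case without going through the
xfs_io option parsing.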

Cheers,
ta

Thread overview: 8+ messages
2022-12-29  1:45 [PATCH] ext4: reject 1k block fs on the first block of disk Jun Nie
2023-01-03 19:17 ` Darrick J. Wong
2023-01-04  1:58   ` Jun Nie
2023-02-15  4:32     ` Theodore Ts'o
2023-02-15 11:46       ` Tudor Ambarus
2023-02-15 11:53         ` Tudor Ambarus
2023-02-15 16:26           ` Tudor Ambarus [this message]
2023-02-22 15:27       ` Tudor Ambarus
