Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: Qu Wenruo <wqu@suse.com>, Qu Wenruo <quwenruo.btrfs@gmx.com>,
	linux-btrfs@vger.kernel.org
Cc: dsterba@suse.com, y16267966@gmail.com, linux-ext4@vger.kernel.org
Subject: Re: [PATCH v2 4/4] btrfs-progs: convert: support ext2 unwritten file data extents
Date: Mon, 6 May 2024 17:56:42 +0800	[thread overview]
Message-ID: <a3bd9271-c010-4eb5-8814-0f9247ff4117@oracle.com> (raw)
In-Reply-To: <fa9aa138-476d-413f-ac02-35156baacd66@suse.com>


>>> . Remove RFC
>>> . Identify the block with a merged preallocated extent and call 
>>> fail-safe
>>> . Qu has an idea that it could be marked as a hole, which may be 
>>> based on
>>>    top of this patch.
>>
>> Well, my idea of going holes other than preallocated extents is mostly
>> to avoid the extra @prealloc flag parameter.
>>
>> But that's not a big deal for now, as I found the following way to
>> easily crack your v2 patchset:


This patch and the below test case are working as designed it is not
a bug/crack, with the current limitation that it should fail (safer
than silent corruption) (as shown below) when there is a merged 
unwritten data extent.


   ERROR: inode 13 index 0: identified unsupported merged block length 1 
wanted 4


This is an intermediary stage while the full support is being added.


Given this option, the user will have a choice to work on the identified
inode and make it a non-unwritten extent so that btrfs-convert shall be
successful.


>>
>>   # fallocate -l 1G test.img
>>   # mkfs.ext4 -F test.img
>>   # mount test.img $mnt
>>   # xfs_io -f -c "falloc 0 16K" $mnt/file
>>   # sync
>>   # xfs_io -f -c "pwrite 0 4k" -c "pwrite 12k 4k" $mnt/file
>>   # umount $mnt
>>   # ./btrfs-convert test.img
>> btrfs-convert from btrfs-progs v6.8
>>
>> Source filesystem:
>>    Type:           ext2
>>    Label:
>>    Blocksize:      4096
>>    UUID:           0f98aa2a-b1ee-4e91-8815-9b9a7b4af00a
>> Target filesystem:
>>    Label:
>>    Blocksize:      4096
>>    Nodesize:       16384
>>    UUID:           3b8db399-8e25-495b-a41c-47afcb672020
>>    Checksum:       crc32c
>>    Features:       extref, skinny-metadata, no-holes, free-space-tree
>> (default)
>>      Data csum:    yes
>>      Inline data:  yes
>>      Copy xattr:   yes
>> Reported stats:
>>    Total space:      1073741824
>>    Free space:        872349696 (81.24%)
>>    Inode count:           65536
>>    Free inodes:           65523
>>    Block count:          262144
>> Create initial btrfs filesystem
>> Create ext2 image file
>> Create btrfs metadata
>> ERROR: inode 13 index 0: identified unsupported merged block length 1
>> wanted 4
>> ERROR: failed to copy ext2 inode 13: -22
>> ERROR: error during copy_inodes -22
>> WARNING: error during conversion, the original filesystem is not modified
>>



>> [...]
>>> +
>>> +    memset(&extent, 0, sizeof(struct ext2fs_extent));
>>> +    if (ext2fs_extent_get(handle, EXT2_EXTENT_CURRENT, &extent)) {
>>> +        error("ext2fs_extent_get EXT2_EXTENT_CURRENT failed inode %d",
>>> +               src->ext2_ino);
>>> +        ext2fs_extent_free(handle);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    if (extent.e_pblk != data->disk_block) {
>>> +    error("inode %d index %d found wrong extent e_pblk %llu wanted 
>>> disk_block %llu",
>>> +               src->ext2_ino, index, extent.e_pblk, data->disk_block);
>>> +        ext2fs_extent_free(handle);
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    if (extent.e_len != data->num_blocks) {
>>> +    error("inode %d index %d: identified unsupported merged block 
>>> length %u wanted %llu",
>>> +            src->ext2_ino, index, extent.e_len, data->num_blocks);
>>> +        ext2fs_extent_free(handle);
>>> +        return -EINVAL;
>>> +    }
>>
>> You have to split the extent in this case. As the example I gave, part
>> of the extent can have been written.
>> (And I'm not sure if the e_pblk check is also correct)
>>
>> I believe the example I gave could be a pretty good test case.
>> (Or you can go one step further to interleave every 4K)
> 
> Furthermore, I have to consider what is the best way to iterate all data 
> extents of an ext2 inode.
> 
> Instead of ext2fs_block_iterate2(), I'm wondering if 
> ext2fs_extent_goto() would be a better solution. (As long as if it can 
> handle holes).
> 
> Another thing is, please Cc this series to ext4 mailing list if possible.
> I hope to get some feedback from the ext4 exports as they may have a 
> much better idea than us.
> 

I've tried fixes without success. Empirically, I found
that the main issue is extent optimization and merging,
which ignores the unwritten flag, idk where is this
happening. I think it is during writing the ext4 image
at the inode BTRFS_FIRST_FREE_OBJECTID + 1.

If avoiding this optimization possible, the extent boundary
will align with ext4 and thus its flags.

Thanks, Anand



  reply	other threads:[~2024-05-06  9:56 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-06  3:04 [PATCH v2 0/4] btrfs-progs: add support ext4 unwritten file extent Anand Jain
2024-05-06  3:04 ` [PATCH v2 1/4] btrfs-progs: convert: refactor ext2_create_file_extents add argument ext2_inode Anand Jain
2024-05-06  3:04 ` [PATCH v2 2/4] btrfs-progs: convert: struct blk_iterate_data, add ext2-specific file inode pointers Anand Jain
2024-05-06  3:04 ` [PATCH v2 3/4] btrfs-progs: convert: refactor __btrfs_record_file_extent to add a prealloc flag Anand Jain
2024-05-06  3:04 ` [PATCH v2 4/4] btrfs-progs: convert: support ext2 unwritten file data extents Anand Jain
2024-05-06  5:41   ` Qu Wenruo
2024-05-06  5:59     ` Qu Wenruo
2024-05-06  9:56       ` Anand Jain [this message]
2024-05-06 10:25         ` Qu Wenruo
2024-05-07  1:25   ` Qu Wenruo
2024-05-09  9:17     ` Anand Jain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a3bd9271-c010-4eb5-8814-0f9247ff4117@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=dsterba@suse.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=wqu@suse.com \
    --cc=y16267966@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox