* [PATCH v2 0/6] btrfs-progs: mkfs/rootdir: cleanup and new fiemap based prealloc detection
@ 2026-04-03 4:32 Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 1/6] btrfs-progs: mkfs-tests: also test hole-detection without no-holes Qu Wenruo
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Qu Wenruo @ 2026-04-03 4:32 UTC (permalink / raw)
To: linux-btrfs
[CHANGELOG]
v2:
- Add a new test case to verify hole detection with/without no-holes
There was a bug in the ^no-holes handling where a file extent was
inserted with an uninitialized type.
Exposed and fixed by Mark.
- Add a new test case to verify the file contents of hole detection
There was a bug in the read refactor, which overwrote the buffer
instead of properly advancing the cursor.
Exposed by Mark with Chris Mason's review prompts.
- Keep the existing first block based bad compress ratio detection
The existing code only marks the inode incompressible when the bad
compression ratio is hit at the first block, not at a later one.
Keep that behavior.
- Fix the hole size capping in fiemap mode
Previously the hole size was not capped, which could trigger
UASSERT()s.
- Enhance the new fiemap test case with fssum
To verify that both the holes and the file contents match.
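The read cursor fix mentioned in the changelog can be illustrated with a small standalone sketch. `read_full()` is a hypothetical name, not the helper from this series. It shows a pread() loop that advances both the buffer pointer and the file offset by the number of bytes already read. The EINTR retry and the zero fill on early EOF are assumptions of this sketch only.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to @length bytes at @filepos into @buf.
 * Returns the number of bytes that came from the file, or -errno.
 */
static ssize_t read_full(int fd, char *buf, off_t filepos, size_t length)
{
	size_t cur = 0;

	assert(buf != NULL);
	while (cur < length) {
		/* Both the buffer and the file offset move forward by @cur. */
		ssize_t ret = pread(fd, buf + cur, length - cur, filepos + cur);

		if (ret < 0) {
			int err = errno;

			if (err == EINTR)
				continue;
			fprintf(stderr, "pread failed at offset %lld: %s\n",
				(long long)(filepos + cur), strerror(err));
			return -err;
		}
		if (ret == 0) {
			/* EOF: zero the unread tail (assumption of this sketch). */
			memset(buf + cur, 0, length - cur);
			break;
		}
		cur += ret;
	}
	return (ssize_t)cur;
}
```

Using `buf` instead of `buf + cur` in the pread() call would make every short read overwrite the start of the buffer, which is the kind of bug described above.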
The PR can be found here:
https://github.com/kdave/btrfs-progs/pull/1103
Although previously I added a SEEK_DATA/SEEK_HOLE based hole detection,
it doesn't distinguish holes from preallocated ranges.
Thus if a rootdir contains some preallocated extents and the end user
also expects such preallocated space in the target fs, those extents
will be replaced by holes.
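As a rough illustration of that limitation, here is a minimal sketch of a SEEK_DATA based lookup. `next_data_offset()` is a hypothetical helper, not code from the series. lseek() only reports where the next data starts, so a range skipped this way may be either a real hole or preallocated space, and the caller cannot tell which.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_DATA
#define SEEK_DATA 3	/* Linux value, in case the libc headers hide it */
#endif

/*
 * Hypothetical helper: return the offset of the next data at or after
 * @pos, or @size if there is none. Note that lseek() moves the file
 * offset of @fd as a side effect.
 */
static off_t next_data_offset(int fd, off_t pos, off_t size)
{
	off_t next;

	assert(pos >= 0 && pos <= size);
	next = lseek(fd, pos, SEEK_DATA);
	if (next == (off_t)-1) {
		/* ENXIO: no data until EOF. Otherwise assume all data. */
		return errno == ENXIO ? size : pos;
	}
	return next > size ? size : next;
}
```

Everything in `[pos, next_data_offset(...))` would be emitted as a hole by a SEEK_DATA only scheme, regardless of whether the source fs had reserved space there.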
The first 2 patches enhance the mkfs test cases to be more robust,
covering both no-holes and ^no-holes features, as Mark exposed a bug
affecting only ^no-holes in the previous hole detection.
The 3rd patch extracts btrfs_insert_hole_extent() to make it a little
simpler to use, without the need to populate a local on-stack file
extent item.
The 4th patch makes the compressed write path easier to read, by no
longer mixing the compressed and uncompressed paths.
The 5th patch is the core of the new fiemap based detection, which
uses fiemap to detect preallocated space.
The final one is a functional test for the new fiemap feature.
Qu Wenruo (6):
btrfs-progs: mkfs-tests: also test hole-detection without no-holes
btrfs-progs: mkfs-tests: add a test case to verify the content of
rootdir
btrfs-progs: implement the missing btrfs_insert_hole_extent()
btrfs-progs: mkfs/rootdir: extract compressed write path
btrfs-progs: mkfs/rootdir: use fiemap to do prealloc detection
btrfs-progs: mkfs-tests: add a new test case for fiemap based
detection
Makefile | 2 +-
kernel-shared/file-item.c | 17 +
kernel-shared/file.c | 6 +-
mkfs/rootdir.c | 369 ++++++++++++------
tests/mkfs-tests/041-hole-detection/test.sh | 48 ++-
tests/mkfs-tests/042-rootdir-contents/test.sh | 57 +++
tests/mkfs-tests/043-fiemap-detection/test.sh | 64 +++
7 files changed, 420 insertions(+), 143 deletions(-)
create mode 100755 tests/mkfs-tests/042-rootdir-contents/test.sh
create mode 100755 tests/mkfs-tests/043-fiemap-detection/test.sh
--
2.53.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH v2 1/6] btrfs-progs: mkfs-tests: also test hole-detection without no-holes
2026-04-03 4:32 [PATCH v2 0/6] btrfs-progs: mkfs/rootdir: cleanup and new fiemap based prealloc detection Qu Wenruo
@ 2026-04-03 4:32 ` Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 2/6] btrfs-progs: mkfs-tests: add a test case to verify the content of rootdir Qu Wenruo
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2026-04-03 4:32 UTC (permalink / raw)
To: linux-btrfs
Mark fixed a bug in the hole-detection code where the extent type was
uninitialized. Such a bug can be detected by running the hole-detection
test case without the no-holes feature.
Update the test case to run both with and without the no-holes feature
to prevent such bugs in the future.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
tests/mkfs-tests/041-hole-detection/test.sh | 48 ++++++++++++---------
1 file changed, 27 insertions(+), 21 deletions(-)
diff --git a/tests/mkfs-tests/041-hole-detection/test.sh b/tests/mkfs-tests/041-hole-detection/test.sh
index 2b4e03c7a2e7..bd69447eded7 100755
--- a/tests/mkfs-tests/041-hole-detection/test.sh
+++ b/tests/mkfs-tests/041-hole-detection/test.sh
@@ -40,30 +40,36 @@ run_check dd if=/dev/urandom of="$tmp/middle_hole" bs=$blocksize count=1
run_check dd if=/dev/urandom of="$tmp/middle_hole" bs=$blocksize count=1 seek=16
middle_hole_before=$(run_check_stdout md5sum "$tmp/middle_hole" | awk '{print $1}')
-run_check_mkfs_test_dev -s $blocksize --rootdir "$tmp"
-run_check $SUDO_HELPER "$TOP/btrfs" check "$TEST_DEV"
-run_check_mount_test_dev
+workload()
+{
+ run_check_mkfs_test_dev -s $blocksize --rootdir "$tmp" $@
+ run_check $SUDO_HELPER "$TOP/btrfs" check "$TEST_DEV"
+ run_check_mount_test_dev
-# There are only 3 blocks written, thus 'du' should only report such 3 blocks used.
-blocks=$(run_check_stdout du -B $blocksize "$TEST_MNT" | awk '{print $1}')
+ # There are only 3 blocks written, thus 'du' should only report such 3 blocks used.
+ blocks=$(run_check_stdout du -B $blocksize "$TEST_MNT" | awk '{print $1}')
-if [ "$blocks" != "3" ]; then
- _fail "Unexpected number of blocks written, has $blocks expect 3"
-fi
+ if [ "$blocks" != "3" ]; then
+ _fail "Unexpected number of blocks written, has $blocks expect 3"
+ fi
-full_hole_after=$(run_check_stdout md5sum "$TEST_MNT/full_hole" | awk '{print $1}')
-middle_data_after=$(run_check_stdout md5sum "$TEST_MNT/middle_data" | awk '{print $1}')
-middle_hole_after=$(run_check_stdout md5sum "$TEST_MNT/middle_hole" | awk '{print $1}')
+ full_hole_after=$(run_check_stdout md5sum "$TEST_MNT/full_hole" | awk '{print $1}')
+ middle_data_after=$(run_check_stdout md5sum "$TEST_MNT/middle_data" | awk '{print $1}')
+ middle_hole_after=$(run_check_stdout md5sum "$TEST_MNT/middle_hole" | awk '{print $1}')
-if [ "$full_hole_before" != "$full_hole_after" ]; then
- _fail "full_hole content changed"
-fi
-if [ "$middle_data_before" != "$middle_data_after" ]; then
- _fail "middle_data content changed"
-fi
-if [ "$middle_hole_before" != "$middle_hole_after" ]; then
- _fail "middle_hole content changed"
-fi
+ if [ "$full_hole_before" != "$full_hole_after" ]; then
+ _fail "full_hole content changed"
+ fi
+ if [ "$middle_data_before" != "$middle_data_after" ]; then
+ _fail "middle_data content changed"
+ fi
+ if [ "$middle_hole_before" != "$middle_hole_after" ]; then
+ _fail "middle_hole content changed"
+ fi
+ run_check_umount_test_dev
+}
+
+workload -O no-holes
+workload -O ^no-holes
-run_check_umount_test_dev
run_check rm -rf -- "$tmp"
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 2/6] btrfs-progs: mkfs-tests: add a test case to verify the content of rootdir
2026-04-03 4:32 [PATCH v2 0/6] btrfs-progs: mkfs/rootdir: cleanup and new fiemap based prealloc detection Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 1/6] btrfs-progs: mkfs-tests: also test hole-detection without no-holes Qu Wenruo
@ 2026-04-03 4:32 ` Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 3/6] btrfs-progs: implement the missing btrfs_insert_hole_extent() Qu Wenruo
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2026-04-03 4:32 UTC (permalink / raw)
To: linux-btrfs
When running mkfs.btrfs with the --rootdir option, we should make sure
the resulting fs has the same content as the source directory.
Add a test case to verify that with the following contents as a rootdir:
- A regular file with random data
The file size is 32M.
- A sparse file with random data
The sparse file is 512M, but there is only 1MiB of random data at file
offset 128M.
- A directory generated by fsstress
Then we take fssum of the rootdir, create the btrfs, mount the btrfs and
verify the contents against fssum.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Makefile | 2 +-
tests/mkfs-tests/042-rootdir-contents/test.sh | 57 +++++++++++++++++++
2 files changed, 58 insertions(+), 1 deletion(-)
create mode 100755 tests/mkfs-tests/042-rootdir-contents/test.sh
diff --git a/Makefile b/Makefile
index cc998bf55f4a..f77494e8a71d 100644
--- a/Makefile
+++ b/Makefile
@@ -553,7 +553,7 @@ test-misc: btrfs btrfs-image btrfs-corrupt-block mkfs.btrfs btrfstune fssum fsst
@echo " TEST misc-tests.sh"
$(Q)bash tests/misc-tests.sh
-test-mkfs: btrfs mkfs.btrfs
+test-mkfs: btrfs mkfs.btrfs fssum fsstress
@echo " TEST mkfs-tests.sh"
$(Q)bash tests/mkfs-tests.sh
diff --git a/tests/mkfs-tests/042-rootdir-contents/test.sh b/tests/mkfs-tests/042-rootdir-contents/test.sh
new file mode 100755
index 000000000000..005ea28190b1
--- /dev/null
+++ b/tests/mkfs-tests/042-rootdir-contents/test.sh
@@ -0,0 +1,57 @@
+#!/bin/bash
+# Make sure the content of a btrfs created with --rootdir matches its source
+
+source "$TEST_TOP/common" || exit
+
+check_prereq mkfs.btrfs
+check_prereq btrfs
+check_prereq fssum
+check_prereq fsstress
+
+if ! [ -f "/sys/fs/btrfs/features/supported_sectorsizes" ]; then
+ _not_run "kernel support for different block sizes missing"
+fi
+
+setup_root_helper
+prepare_test_dev
+
+fssum_prog="$INTERNAL_BIN/fssum"
+fsstress_prog="$INTERNAL_BIN/fsstress"
+tmp=$(_mktemp_dir mkfs-rootdir)
+
+mkdir "$tmp/rootdir"
+# Get the fs block size, normally it's page size (using tmpfs for /tmp),
+# but it can still be other values if /tmp is on a regular fs.
+blocksize=$(stat -f -c %S "$tmp")
+
+blocksize_supported=false
+for bs in $(cat /sys/fs/btrfs/features/supported_sectorsizes); do
+ if [ "$blocksize" == "$bs" ]; then
+ blocksize_supported=true
+ fi
+done
+
+if [ "$blocksize_supported" != "true" ]; then
+ _not_run "kernel support for mounting blocksize $blocksize is missing"
+fi
+
+run_check $SUDO_HELPER dd if=/dev/urandom of="$tmp/rootdir/large" bs=1M count=32
+run_check $SUDO_HELPER truncate -s 512M "$tmp/rootdir/sparse"
+run_check $SUDO_HELPER dd if=/dev/urandom of="$tmp/rootdir/sparse" bs=1M count=1 conv=notrunc seek=128
+run_check $SUDO_HELPER $fsstress_prog -w -n 256 -d "$tmp/rootdir/"
+run_check $SUDO_HELPER $fssum_prog -n -d -f -w "$tmp/fssum" "$tmp/rootdir"
+
+workload()
+{
+ run_check_mkfs_test_dev -s $blocksize --rootdir "$tmp/rootdir" $@
+ run_check $SUDO_HELPER "$TOP/btrfs" check "$TEST_DEV"
+ run_check_mount_test_dev
+
+ run_check $SUDO_HELPER $fssum_prog -r "$tmp/fssum" "$TEST_MNT/"
+ run_check_umount_test_dev
+}
+
+workload -O no-holes
+workload -O ^no-holes
+
+run_check rm -rf -- "$tmp"
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 3/6] btrfs-progs: implement the missing btrfs_insert_hole_extent()
2026-04-03 4:32 [PATCH v2 0/6] btrfs-progs: mkfs/rootdir: cleanup and new fiemap based prealloc detection Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 1/6] btrfs-progs: mkfs-tests: also test hole-detection without no-holes Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 2/6] btrfs-progs: mkfs-tests: add a test case to verify the content of rootdir Qu Wenruo
@ 2026-04-03 4:32 ` Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 4/6] btrfs-progs: mkfs/rootdir: extract compressed write path Qu Wenruo
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2026-04-03 4:32 UTC (permalink / raw)
To: linux-btrfs
Commit f8efe9f724c0 ("btrfs-progs: sync file-item.h into progs")
introduced the declaration of btrfs_insert_hole_extent() but without an
implementation.
Now there are several call sites open-coding the same steps to insert
a hole extent. It's time to implement that function, which is pretty
simple: set the file extent type to REG and set
disk_num_bytes/disk_bytenr to 0 with proper ram_bytes/num_bytes.
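The shape of a hole extent described above can be sketched with a simplified stand-in structure. `struct demo_file_extent` and `demo_make_hole()` are hypothetical and only for illustration: the real item is the packed, little-endian struct btrfs_file_extent_item accessed through the btrfs_set_stack_file_extent_*() helpers shown in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as BTRFS_FILE_EXTENT_REG in the btrfs on-disk format. */
#define DEMO_FILE_EXTENT_REG 1

/* Simplified stand-in, NOT the real on-disk layout. */
struct demo_file_extent {
	uint8_t type;
	uint64_t disk_bytenr;
	uint64_t disk_num_bytes;
	uint64_t num_bytes;
	uint64_t ram_bytes;
};

/* Build the in-memory description of a hole covering @num_bytes at @pos. */
static struct demo_file_extent demo_make_hole(uint64_t pos, uint64_t num_bytes,
					      uint32_t blocksize)
{
	struct demo_file_extent fi = { 0 };

	/* Holes must be block aligned, like the UASSERT()s in the patch. */
	assert(blocksize != 0 && pos % blocksize == 0);
	assert(num_bytes % blocksize == 0);

	fi.type = DEMO_FILE_EXTENT_REG;
	/* disk_bytenr == 0 and disk_num_bytes == 0 are what mark a hole. */
	fi.num_bytes = num_bytes;
	fi.ram_bytes = num_bytes;
	return fi;
}
```

The zero-initialized structure is what keeps disk_bytenr and disk_num_bytes at 0, so only the type and the two length fields need to be set explicitly.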
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
kernel-shared/file-item.c | 17 +++++++++++++++++
kernel-shared/file.c | 6 +-----
mkfs/rootdir.c | 8 ++------
3 files changed, 20 insertions(+), 11 deletions(-)
diff --git a/kernel-shared/file-item.c b/kernel-shared/file-item.c
index c6cd3ce51124..76d6bfdb6824 100644
--- a/kernel-shared/file-item.c
+++ b/kernel-shared/file-item.c
@@ -25,6 +25,23 @@
#include "kernel-shared/file-item.h"
#include "kernel-shared/extent_io.h"
#include "common/internal.h"
+#include "common/messages.h"
+
+int btrfs_insert_hole_extent(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root, u64 objectid, u64 pos,
+ u64 num_bytes)
+{
+ struct btrfs_file_extent_item stack_fi = { 0 };
+ const u32 blocksize = root->fs_info->sectorsize;
+
+ UASSERT(IS_ALIGNED(pos, blocksize));
+ UASSERT(IS_ALIGNED(num_bytes, blocksize));
+
+ btrfs_set_stack_file_extent_type(&stack_fi, BTRFS_FILE_EXTENT_REG);
+ btrfs_set_stack_file_extent_num_bytes(&stack_fi, num_bytes);
+ btrfs_set_stack_file_extent_ram_bytes(&stack_fi, num_bytes);
+ return btrfs_insert_file_extent(trans, root, objectid, pos, &stack_fi);
+}
#define MAX_CSUM_ITEMS(r, size) ((((BTRFS_LEAF_DATA_SIZE(r->fs_info) - \
sizeof(struct btrfs_item) * 2) / \
diff --git a/kernel-shared/file.c b/kernel-shared/file.c
index 12cd50e796b9..546e161abfdb 100644
--- a/kernel-shared/file.c
+++ b/kernel-shared/file.c
@@ -150,7 +150,6 @@ int btrfs_punch_hole(struct btrfs_trans_handle *trans,
u64 ino, u64 offset, u64 len)
{
struct btrfs_path *path;
- struct btrfs_file_extent_item stack_fi = { 0 };
int ret = 0;
path = btrfs_alloc_path();
@@ -165,10 +164,7 @@ int btrfs_punch_hole(struct btrfs_trans_handle *trans,
goto out;
}
- btrfs_set_stack_file_extent_type(&stack_fi, BTRFS_FILE_EXTENT_REG);
- btrfs_set_stack_file_extent_num_bytes(&stack_fi, len);
- btrfs_set_stack_file_extent_ram_bytes(&stack_fi, len);
- ret = btrfs_insert_file_extent(trans, root, ino, offset, &stack_fi);
+ ret = btrfs_insert_hole_extent(trans, root, ino, offset, len);
out:
btrfs_free_path(path);
return ret;
diff --git a/mkfs/rootdir.c b/mkfs/rootdir.c
index 37191ee427b7..62f7c57c5e4a 100644
--- a/mkfs/rootdir.c
+++ b/mkfs/rootdir.c
@@ -411,8 +411,7 @@ static int insert_reserved_file_extent(struct btrfs_trans_handle *trans,
* hole. And hole extent has no size limit, no need to loop.
*/
if (disk_bytenr == 0)
- return btrfs_insert_file_extent(trans, root, ino,
- file_pos, stack_fi);
+ return btrfs_insert_hole_extent(trans, root, ino, file_pos, num_bytes);
path = btrfs_alloc_path();
if (!path)
@@ -819,10 +818,7 @@ static int add_file_item_extent(struct btrfs_trans_handle *trans,
*/
const u64 length = min_t(u64, next - file_pos, SZ_1G);
- btrfs_set_stack_file_extent_type(&stack_fi, BTRFS_FILE_EXTENT_REG);
- btrfs_set_stack_file_extent_num_bytes(&stack_fi, length);
- btrfs_set_stack_file_extent_ram_bytes(&stack_fi, length);
- ret = btrfs_insert_file_extent(trans, root, objectid, file_pos, &stack_fi);
+ ret = btrfs_insert_hole_extent(trans, root, objectid, file_pos, length);
if (ret < 0) {
error("cannot insert hole for range [%llu, %llu)",
file_pos, file_pos + length);
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 4/6] btrfs-progs: mkfs/rootdir: extract compressed write path
2026-04-03 4:32 [PATCH v2 0/6] btrfs-progs: mkfs/rootdir: cleanup and new fiemap based prealloc detection Qu Wenruo
` (2 preceding siblings ...)
2026-04-03 4:32 ` [PATCH v2 3/6] btrfs-progs: implement the missing btrfs_insert_hole_extent() Qu Wenruo
@ 2026-04-03 4:32 ` Qu Wenruo
2026-04-03 4:32 ` [PATCH v2 5/6] btrfs-progs: mkfs/rootdir: use fiemap to do prealloc detection Qu Wenruo
2026-04-03 4:33 ` [PATCH v2 6/6] btrfs-progs: mkfs-tests: add a new test case for fiemap based detection Qu Wenruo
5 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2026-04-03 4:32 UTC (permalink / raw)
To: linux-btrfs
Currently add_file_item_extent() has several different
optimizations/handlings:
- Hole detection
Which happens before any writes.
Thus it has minimal impact on the remaining methods.
- Compressed write
- Reflink from source fs
- Regular read/writes
The last 3 share the same extent reservation, but with quite some extra
handling.
E.g. for compressed writes, if the compression fails we need to reset
the buffer size and fall back to regular read/writes.
This makes the code much harder to read, and the shared code is minimal,
only sharing the same btrfs_reserve_extent() and
insert_reserved_file_extent() calls.
Extract compressed write into its dedicated helper so that the fallback
logic is much easier to understand.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
mkfs/rootdir.c | 249 ++++++++++++++++++++++++++++---------------------
1 file changed, 141 insertions(+), 108 deletions(-)
diff --git a/mkfs/rootdir.c b/mkfs/rootdir.c
index 62f7c57c5e4a..5c0bc012a354 100644
--- a/mkfs/rootdir.c
+++ b/mkfs/rootdir.c
@@ -779,6 +779,125 @@ static int do_reflink_write(struct btrfs_fs_info *info,
return 0;
}
+static s32 read_from_source(const struct btrfs_fs_info *fs_info,
+ const char *path, int fd,
+ char *buf, u64 filepos, u32 length)
+{
+ u32 cur = 0;
+
+ UASSERT(IS_ALIGNED(filepos, fs_info->sectorsize));
+ UASSERT(length <= MAX_EXTENT_SIZE);
+
+ while (cur < length) {
+ ssize_t ret;
+
+ ret = pread(fd, buf + cur, length - cur, filepos + cur);
+ if (ret < 0) {
+ error("cannot read %s at offset %llu length %u: %m",
+ path, filepos + cur, length - cur);
+ return -errno;
+ }
+ cur += ret;
+ }
+ return length;
+}
+
+/*
+ * Return >0 for the number of bytes read from @source and submitted as
+ * compressed write.
+ * Return <0 for errors, including non-fatal ones, e.g. -E2BIG if compression
+ * ratio is bad.
+ */
+static int try_compressed_write(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root,
+ struct btrfs_inode_item *btrfs_inode,
+ u64 objectid,
+ const struct source_descriptor *source,
+ u64 filepos, u32 length)
+{
+ struct btrfs_fs_info *fs_info = root->fs_info;
+ struct btrfs_file_extent_item stack_fi = { 0 };
+ struct btrfs_key key;
+ const u32 blocksize = fs_info->sectorsize;
+ const bool first_sector = !(btrfs_stack_inode_flags(btrfs_inode) &
+ BTRFS_INODE_COMPRESS);
+ u64 inode_flags = btrfs_stack_inode_flags(btrfs_inode);
+ u64 sb_flags = btrfs_super_incompat_flags(fs_info->super_copy);
+ u32 to_write;
+ ssize_t comp_ret;
+ int ret;
+
+ UASSERT(length > 0);
+ length = min_t(u32, length, BTRFS_MAX_COMPRESSED);
+ if (length <= root->fs_info->sectorsize)
+ return -E2BIG;
+ ret = read_from_source(root->fs_info, source->path_name, source->fd,
+ source->buf, filepos, length);
+ if (ret < 0)
+ return ret;
+ switch (g_compression) {
+ case BTRFS_COMPRESS_ZLIB:
+ comp_ret = zlib_compress_extent(first_sector, blocksize,
+ source->buf, length,
+ source->comp_buf);
+ break;
+#if COMPRESSION_LZO
+ case BTRFS_COMPRESS_LZO:
+ sb_flags |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;
+ comp_ret = lzo_compress_extent(blocksize, source->buf,
+ length, source->comp_buf,
+ source->wrkmem);
+ break;
+#endif
+#if COMPRESSION_ZSTD
+ case BTRFS_COMPRESS_ZSTD:
+ sb_flags |= BTRFS_FEATURE_INCOMPAT_COMPRESS_ZSTD;
+ comp_ret = zstd_compress_extent(first_sector, blocksize,
+ source->buf, length,
+ source->comp_buf);
+ break;
+#endif
+ default:
+ comp_ret = -EINVAL;
+ break;
+ }
+ if (comp_ret < 0)
+ return comp_ret;
+
+ to_write = round_up(comp_ret, blocksize);
+ memset(source->comp_buf + comp_ret, 0, to_write - comp_ret);
+ inode_flags |= BTRFS_INODE_COMPRESS;
+ btrfs_set_stack_inode_flags(btrfs_inode, inode_flags);
+ btrfs_set_super_incompat_flags(fs_info->super_copy, sb_flags);
+
+ ret = btrfs_reserve_extent(trans, root, to_write, 0, 0, (u64)-1, &key, 1);
+ if (ret < 0)
+ return ret;
+ ret = write_data_to_disk(fs_info, source->comp_buf, key.objectid, to_write);
+ if (ret < 0)
+ return ret;
+ for (unsigned int i = 0; i < to_write / blocksize; i++) {
+ ret = btrfs_csum_file_block(trans, key.objectid + (i * blocksize),
+ BTRFS_EXTENT_CSUM_OBJECTID,
+ root->fs_info->csum_type,
+ source->comp_buf + (i * blocksize));
+ if (ret)
+ return ret;
+ }
+ btrfs_set_stack_file_extent_type(&stack_fi, BTRFS_FILE_EXTENT_REG);
+ btrfs_set_stack_file_extent_disk_bytenr(&stack_fi, key.objectid);
+ btrfs_set_stack_file_extent_disk_num_bytes(&stack_fi, to_write);
+ btrfs_set_stack_file_extent_num_bytes(&stack_fi, round_up(length, blocksize));
+ btrfs_set_stack_file_extent_ram_bytes(&stack_fi, round_up(length, blocksize));
+ btrfs_set_stack_file_extent_compression(&stack_fi, g_compression);
+
+ ret = insert_reserved_file_extent(trans, root, objectid, btrfs_inode,
+ filepos, &stack_fi);
+ if (ret < 0)
+ return ret;
+ return length;
+}
+
static int add_file_item_extent(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_inode_item *btrfs_inode,
@@ -788,14 +907,13 @@ static int add_file_item_extent(struct btrfs_trans_handle *trans,
{
int ret;
u32 sectorsize = root->fs_info->sectorsize;
- u64 bytes_read, first_block, to_read, to_write;
+ u64 first_block, to_read, to_write;
struct btrfs_key key;
struct btrfs_file_extent_item stack_fi = { 0 };
u64 buf_size;
char *write_buf;
bool do_comp = g_compression != BTRFS_COMPRESS_NONE;
bool datasum = true;
- ssize_t comp_ret;
u64 flags = btrfs_stack_inode_flags(btrfs_inode);
off_t next;
@@ -835,120 +953,35 @@ static int add_file_item_extent(struct btrfs_trans_handle *trans,
if (next == (off_t)-1 || !IS_ALIGNED(next, sectorsize) || next > source->size)
next = source->size;
- buf_size = do_comp ? BTRFS_MAX_COMPRESSED : MAX_EXTENT_SIZE;
- to_read = min_t(u64, file_pos + buf_size, next) - file_pos;
- bytes_read = 0;
-
- while (bytes_read < to_read) {
- ssize_t ret_read;
-
- ret_read = pread(source->fd, source->buf + bytes_read,
- to_read - bytes_read, file_pos + bytes_read);
- if (ret_read < 0) {
- error("cannot read %s at offset %llu length %llu: %m",
- source->path_name, file_pos + bytes_read,
- to_read - bytes_read);
- return -errno;
- }
-
- bytes_read += ret_read;
- }
-
- if (bytes_read <= sectorsize)
- do_comp = false;
-
- if (do_comp) {
- bool first_sector = !(flags & BTRFS_INODE_COMPRESS);
-
- switch (g_compression) {
- case BTRFS_COMPRESS_ZLIB:
- comp_ret = zlib_compress_extent(first_sector, sectorsize,
- source->buf, bytes_read,
- source->comp_buf);
- break;
-#if COMPRESSION_LZO
- case BTRFS_COMPRESS_LZO:
- comp_ret = lzo_compress_extent(sectorsize, source->buf,
- bytes_read,
- source->comp_buf,
- source->wrkmem);
- break;
-#endif
-#if COMPRESSION_ZSTD
- case BTRFS_COMPRESS_ZSTD:
- comp_ret = zstd_compress_extent(first_sector, sectorsize,
- source->buf, bytes_read,
- source->comp_buf);
- break;
-#endif
- default:
- comp_ret = -EINVAL;
- break;
- }
-
+ if (do_comp && next - file_pos > sectorsize) {
+ ret = try_compressed_write(trans, root, btrfs_inode, objectid,
+ source, file_pos, next - file_pos);
+ if (ret > 0)
+ return ret;
+ if (ret < 0 && ret != -E2BIG)
+ return ret;
/*
- * If the function returned -E2BIG, the extent is incompressible.
- * If this is the first sector, add the nocompress flag,
- * increase the buffer size, and read the rest of the extent.
+ * If the inode doesn't have INODE_COMPRESS flag set, it means
+ * the compression failed at the first block.
+ * Set the NOCOMPRESS flag indicating bad compression ratio.
*/
- if (comp_ret == -E2BIG)
- do_comp = false;
- else if (comp_ret < 0)
- return comp_ret;
-
- if (comp_ret == -E2BIG && first_sector) {
+ if (!(btrfs_stack_inode_flags(btrfs_inode) & BTRFS_INODE_COMPRESS)) {
flags |= BTRFS_INODE_NOCOMPRESS;
btrfs_set_stack_inode_flags(btrfs_inode, flags);
-
- buf_size = MAX_EXTENT_SIZE;
- to_read = min_t(u64, file_pos + buf_size, next) - file_pos;
-
- while (bytes_read < to_read) {
- ssize_t ret_read;
-
- ret_read = pread(source->fd,
- source->buf + bytes_read,
- to_read - bytes_read,
- file_pos + bytes_read);
- if (ret_read < 0) {
- error("cannot read %s at offset %llu length %llu: %m",
- source->path_name,
- file_pos + bytes_read,
- to_read - bytes_read);
- return -errno;
- }
-
- bytes_read += ret_read;
- }
}
+ /* Fallback to other methods. */
}
- if (do_comp) {
- u64 features;
+ buf_size = MAX_EXTENT_SIZE;
+ to_read = min_t(u64, file_pos + buf_size, next) - file_pos;
+ ret = read_from_source(root->fs_info, source->path_name, source->fd,
+ source->buf, file_pos, to_read);
+ if (ret < 0)
+ return ret;
- to_write = round_up(comp_ret, sectorsize);
- write_buf = source->comp_buf;
- memset(write_buf + comp_ret, 0, to_write - comp_ret);
-
- flags |= BTRFS_INODE_COMPRESS;
- btrfs_set_stack_inode_flags(btrfs_inode, flags);
-
- if (g_compression == BTRFS_COMPRESS_ZSTD) {
- features = btrfs_super_incompat_flags(trans->fs_info->super_copy);
- features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_ZSTD;
- btrfs_set_super_incompat_flags(trans->fs_info->super_copy,
- features);
- } else if (g_compression == BTRFS_COMPRESS_LZO) {
- features = btrfs_super_incompat_flags(trans->fs_info->super_copy);
- features |= BTRFS_FEATURE_INCOMPAT_COMPRESS_LZO;
- btrfs_set_super_incompat_flags(trans->fs_info->super_copy,
- features);
- }
- } else {
- to_write = round_up(to_read, sectorsize);
- write_buf = source->buf;
- memset(write_buf + to_read, 0, to_write - to_read);
- }
+ to_write = round_up(to_read, sectorsize);
+ write_buf = source->buf;
+ memset(write_buf + to_read, 0, to_write - to_read);
ret = btrfs_reserve_extent(trans, root, to_write, 0, 0,
(u64)-1, &key, 1);
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 5/6] btrfs-progs: mkfs/rootdir: use fiemap to do prealloc detection
2026-04-03 4:32 [PATCH v2 0/6] btrfs-progs: mkfs/rootdir: cleanup and new fiemap based prealloc detection Qu Wenruo
` (3 preceding siblings ...)
2026-04-03 4:32 ` [PATCH v2 4/6] btrfs-progs: mkfs/rootdir: extract compressed write path Qu Wenruo
@ 2026-04-03 4:32 ` Qu Wenruo
2026-04-03 4:33 ` [PATCH v2 6/6] btrfs-progs: mkfs-tests: add a new test case for fiemap based detection Qu Wenruo
5 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2026-04-03 4:32 UTC (permalink / raw)
To: linux-btrfs
Introduce a new fiemap ioctl based hole/prealloc detection, which has
the following features:
- Preallocated extent detection
- Hole detection
This allows the resulting image to reflect the source rootdir better,
with proper preallocated extents.
However, not all major filesystems support fiemap, e.g. tmpfs does
not.
So we still need the existing SEEK_DATA/SEEK_HOLE based solution as a
fallback.
Furthermore, SEEK_DATA/SEEK_HOLE will treat preallocated space as a
hole, thus we have to attempt fiemap before SEEK_DATA/SEEK_HOLE.
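The classification of the first fiemap result can be sketched as a pure function. `demo_classify()` and its types are hypothetical names; only struct fiemap_extent and FIEMAP_EXTENT_UNWRITTEN come from <linux/fiemap.h>. The clamp of the preallocated length to `remaining` is an extra safeguard of this sketch and is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/fiemap.h>

#define DEMO_HOLE_CAP (1ULL << 30)	/* mirrors the SZ_1G cap on holes */

enum demo_range_kind { DEMO_HOLE, DEMO_PREALLOC, DEMO_DATA };

struct demo_range {
	enum demo_range_kind kind;
	uint64_t length;	/* 0 for DEMO_DATA: caller falls back */
};

static uint64_t demo_min(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * @ext:       first extent returned by fiemap for [filepos, ...),
 *             or NULL if no extent was mapped
 * @filepos:   current position in the source file
 * @remaining: bytes left from @filepos to the (rounded up) file end
 */
static struct demo_range demo_classify(const struct fiemap_extent *ext,
				       uint64_t filepos, uint64_t remaining)
{
	struct demo_range r = { DEMO_DATA, 0 };

	assert(remaining > 0);
	if (ext == NULL) {
		/* Nothing mapped at or after @filepos: hole to the end. */
		r.kind = DEMO_HOLE;
		r.length = demo_min(remaining, DEMO_HOLE_CAP);
		return r;
	}
	if (ext->fe_logical > filepos) {
		/* Hole between @filepos and the start of the extent. */
		r.kind = DEMO_HOLE;
		r.length = demo_min(ext->fe_logical - filepos, DEMO_HOLE_CAP);
		return r;
	}
	if (!(ext->fe_flags & FIEMAP_EXTENT_UNWRITTEN))
		return r;	/* regular data, handled elsewhere */

	assert(ext->fe_logical + ext->fe_length > filepos);
	r.kind = DEMO_PREALLOC;
	r.length = demo_min(ext->fe_logical + ext->fe_length - filepos,
			    remaining);
	return r;
}
```

Regular data deliberately yields a zero length here, matching the "return 0 if fiemap cannot handle the range" convention, so the caller can fall through to the read/write paths.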
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
mkfs/rootdir.c | 112 ++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 110 insertions(+), 2 deletions(-)
diff --git a/mkfs/rootdir.c b/mkfs/rootdir.c
index 5c0bc012a354..5fbbb2f43504 100644
--- a/mkfs/rootdir.c
+++ b/mkfs/rootdir.c
@@ -36,6 +36,7 @@
#if COMPRESSION_LZO
#include <lzo/lzo1x.h>
#endif
+#include <linux/fiemap.h>
#include "kernel-lib/sizes.h"
#include "kernel-shared/accessors.h"
#include "kernel-shared/uapi/btrfs_tree.h"
@@ -696,11 +697,13 @@ out:
struct source_descriptor {
int fd;
+ u32 blocksize;
char *buf;
u64 size;
const char *path_name;
char *comp_buf;
char *wrkmem;
+ bool no_fiemap;
};
static int do_reflink_write(struct btrfs_fs_info *info,
@@ -898,11 +901,106 @@ static int try_compressed_write(struct btrfs_trans_handle *trans,
return length;
}
+/*
+ * Return 0 if we are unable to use fiemap result to handle any range.
+ * Return >0 for the length of preallocated/hole space we have inserted.
+ * Return <0 for critical errors.
+ */
+static int try_fiemap(struct btrfs_trans_handle *trans,
+ struct btrfs_root *root,
+ struct btrfs_inode_item *btrfs_inode,
+ u64 objectid,
+ struct source_descriptor *source,
+ u64 filepos)
+{
+ struct btrfs_fs_info *fs_info = root->fs_info;
+ struct btrfs_file_extent_item stack_fi = { 0 };
+ struct btrfs_key key;
+ const u32 blocksize = fs_info->sectorsize;
+ struct fiemap *fiemap;
+ struct fiemap_extent *extent;
+ u64 length = round_up(source->size, blocksize) - filepos;
+ int ret;
+
+ UASSERT(IS_ALIGNED(filepos, blocksize));
+ length = min_t(u64, length, MAX_EXTENT_SIZE);
+
+ if (blocksize != source->blocksize || source->no_fiemap)
+ return 0;
+ fiemap = malloc(sizeof(struct fiemap) + sizeof(struct fiemap_extent));
+ if (!fiemap)
+ return -ENOMEM;
+
+ fiemap->fm_flags = FIEMAP_FLAG_SYNC;
+ fiemap->fm_extent_count = 1;
+ fiemap->fm_mapped_extents = 0;
+ fiemap->fm_start = filepos;
+ fiemap->fm_length = length;
+
+ ret = ioctl(source->fd, FS_IOC_FIEMAP, (unsigned long)fiemap);
+ if (ret < 0) {
+ source->no_fiemap = true;
+ free(fiemap);
+ return 0;
+ }
+ /* There is no more non-hole extent at or beyond @filepos. */
+ if (fiemap->fm_mapped_extents < 1) {
+ free(fiemap);
+ length = min_t(u64, length, SZ_1G);
+ ret = btrfs_insert_hole_extent(trans, root, objectid, filepos, length);
+ if (ret < 0)
+ return ret;
+ return length;
+ }
+ /*
+ * We really only care about the first returned extent which covers
+ * our block.
+ */
+ extent = &fiemap->fm_extents[0];
+
+ /*
+ * Returned range is beyond our @filepos. This means a hole
+ * between @filepos and @fe_logical.
+ */
+ if (extent->fe_logical > filepos) {
+ length = extent->fe_logical - filepos;
+ length = min_t(u64, length, SZ_1G);
+ free(fiemap);
+ ret = btrfs_insert_hole_extent(trans, root, objectid, filepos, length);
+ if (ret < 0)
+ return ret;
+ return length;
+ }
+
+ if (!(extent->fe_flags & FIEMAP_EXTENT_UNWRITTEN)) {
+ free(fiemap);
+ return 0;
+ }
+
+ length = extent->fe_logical + extent->fe_length - filepos;
+ free(fiemap);
+ ret = btrfs_reserve_extent(trans, root, length, 0, 0, (u64)-1, &key, 1);
+ if (ret)
+ return ret;
+
+ btrfs_set_stack_inode_flags(btrfs_inode,
+ btrfs_stack_inode_flags(btrfs_inode) | BTRFS_INODE_PREALLOC);
+ btrfs_set_stack_file_extent_type(&stack_fi, BTRFS_FILE_EXTENT_PREALLOC);
+ btrfs_set_stack_file_extent_disk_bytenr(&stack_fi, key.objectid);
+ btrfs_set_stack_file_extent_disk_num_bytes(&stack_fi, length);
+ btrfs_set_stack_file_extent_ram_bytes(&stack_fi, length);
+ btrfs_set_stack_file_extent_num_bytes(&stack_fi, length);
+ ret = insert_reserved_file_extent(trans, root, objectid, btrfs_inode, filepos, &stack_fi);
+ if (ret < 0)
+ return ret;
+ return length;
+}
+
static int add_file_item_extent(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_inode_item *btrfs_inode,
u64 objectid,
- const struct source_descriptor *source,
+ struct source_descriptor *source,
u64 file_pos)
{
int ret;
@@ -915,7 +1013,7 @@ static int add_file_item_extent(struct btrfs_trans_handle *trans,
bool do_comp = g_compression != BTRFS_COMPRESS_NONE;
bool datasum = true;
u64 flags = btrfs_stack_inode_flags(btrfs_inode);
- off_t next;
+ off_t next = file_pos;
if (g_do_reflink || flags & BTRFS_INODE_NOCOMPRESS)
do_comp = false;
@@ -925,6 +1023,14 @@ static int add_file_item_extent(struct btrfs_trans_handle *trans,
do_comp = false;
}
+ /*
+ * Try prealloc before hole detection, as preallocated space is also treated
+ * as hole by SEEK_DATA.
+ */
+ ret = try_fiemap(trans, root, btrfs_inode, objectid, source, file_pos);
+ if (ret)
+ return ret;
+
next = lseek(source->fd, file_pos, SEEK_DATA);
/* The current offset is inside a hole to the next of the file. */
if (next == (off_t)-1 && errno == ENXIO)
@@ -1475,9 +1581,11 @@ static int add_file_items(struct btrfs_trans_handle *trans,
source.fd = fd;
source.buf = buf;
source.size = st->st_size;
+ source.blocksize = st->st_blksize;
source.path_name = path_name;
source.comp_buf = comp_buf;
source.wrkmem = wrkmem;
+ source.no_fiemap = false;
while (file_pos < st->st_size) {
ret = add_file_item_extent(trans, root, btrfs_inode, objectid,
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH v2 6/6] btrfs-progs: mkfs-tests: add a new test case for fiemap based detection
2026-04-03 4:32 [PATCH v2 0/6] btrfs-progs: mkfs/rootdir: cleanup and new fiemap based prealloc detection Qu Wenruo
` (4 preceding siblings ...)
2026-04-03 4:32 ` [PATCH v2 5/6] btrfs-progs: mkfs/rootdir: use fiemap to do prealloc detection Qu Wenruo
@ 2026-04-03 4:33 ` Qu Wenruo
5 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2026-04-03 4:33 UTC (permalink / raw)
To: linux-btrfs
The new test case will create an inode on btrfs, with the following
layout:
- [0, 16K): hole
- [16K, 32K): regular
- [32K, 64K): hole
- [64K, 80K): prealloc
- [80K, 84K): regular
Using fiemap based detection, we should be able to create a btrfs with
exactly the same layout, verified by fssum with "-s" option.
And test both no-holes and ^no-holes features.
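For readers without xfs_io at hand, the layout above can be approximated with a few syscalls. This is only a sketch: `demo_create_layout()` and `demo_file_size()` are hypothetical helpers, the fill byte is arbitrary, and posix_fallocate() is used for portability. On filesystems without native fallocate support glibc may emulate it by writing zeros, in which case [64K, 80K) would end up as regular data rather than a preallocated extent.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_K 1024

/* Create @path with: hole, 16K data, hole, 16K prealloc, 4K data. */
static int demo_create_layout(const char *path)
{
	static char buf[16 * DEMO_K];
	int ret = 0;
	int fd;

	assert(path != NULL);
	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);
	if (fd < 0)
		return -errno;
	memset(buf, 0xcd, sizeof(buf));

	/* [16K, 32K): regular data, leaving [0, 16K) as a hole. */
	if (pwrite(fd, buf, 16 * DEMO_K, 16 * DEMO_K) != 16 * DEMO_K) {
		ret = -EIO;
		goto out;
	}
	/* [64K, 80K): preallocated, leaving [32K, 64K) as a hole. */
	ret = -posix_fallocate(fd, 64 * DEMO_K, 16 * DEMO_K);
	if (ret < 0)
		goto out;
	/* [80K, 84K): regular data. */
	if (pwrite(fd, buf, 4 * DEMO_K, 80 * DEMO_K) != 4 * DEMO_K) {
		ret = -EIO;
		goto out;
	}
	if (fsync(fd) < 0)
		ret = -errno;
out:
	close(fd);
	return ret;
}

/* Return the apparent size of @path, or -errno. */
static long long demo_file_size(const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0)
		return -errno;
	return (long long)st.st_size;
}
```

The apparent file size ends up at 84K, while only the two written ranges and the preallocated range consume space on a filesystem that supports both holes and fallocate.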
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
tests/mkfs-tests/043-fiemap-detection/test.sh | 64 +++++++++++++++++++
1 file changed, 64 insertions(+)
create mode 100755 tests/mkfs-tests/043-fiemap-detection/test.sh
diff --git a/tests/mkfs-tests/043-fiemap-detection/test.sh b/tests/mkfs-tests/043-fiemap-detection/test.sh
new file mode 100755
index 000000000000..bf6b6f79cd9a
--- /dev/null
+++ b/tests/mkfs-tests/043-fiemap-detection/test.sh
@@ -0,0 +1,64 @@
+#!/bin/bash
+# Test fiemap based hole and prealloc detection
+
+source "$TEST_TOP/common" || exit
+
+check_prereq mkfs.btrfs
+check_prereq btrfs
+check_prereq fssum
+check_global_prereq xfs_io
+
+if ! [ -f "/sys/fs/btrfs/features/supported_sectorsizes" ]; then
+ _not_run "kernel support for different block sizes missing"
+fi
+
+setup_root_helper
+
+# tmpdir is normally inside /tmp, which can be tmpfs on a lot of distros.
+# Unfortunately tmpfs doesn't support fiemap ioctl.
+# So we need another btrfs as rootdir.
+setup_loopdevs 2
+prepare_loopdevs
+
+real_dev=${loopdevs[1]}
+rootdir_dev=${loopdevs[2]}
+fssum_prog="$INTERNAL_BIN/fssum"
+tmp=$(_mktemp_dir mkfs-fiemap)
+
+blocksize_supported=false
+for bs in $(cat /sys/fs/btrfs/features/supported_sectorsizes); do
+ if [ "$bs" == "4096" ]; then
+ blocksize_supported=true
+ fi
+done
+
+if [ "$blocksize_supported" != "true" ]; then
+ _not_run "kernel support for mounting blocksize 4096 is missing"
+fi
+
+run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f -s 4k -O no-holes "$rootdir_dev"
+run_check $SUDO_HELPER mount "$rootdir_dev" "$TEST_MNT"
+
+# Create an inode with holes, preallocated and regular file extents.
+run_check $SUDO_HELPER xfs_io -f \
+ -c "pwrite 16k 16k" -c "falloc 64k 16k" -c "pwrite 80k 4k" -c sync \
+ "$TEST_MNT/foobar"
+run_check $SUDO_HELPER $fssum_prog -n -d -s -f -w "$tmp/fssum" "$TEST_MNT"
+run_check $SUDO_HELPER umount "$TEST_MNT"
+
+workload()
+{
+ run_check $SUDO_HELPER mount "$rootdir_dev" "$TEST_MNT"
+ run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f --rootdir "$TEST_MNT" "$real_dev" $@
+ run_check $SUDO_HELPER umount "$TEST_MNT"
+
+ run_check $SUDO_HELPER "$TOP/btrfs" check "$real_dev"
+ run_check $SUDO_HELPER mount "$real_dev" "$TEST_MNT"
+ run_check $SUDO_HELPER $fssum_prog -r "$tmp/fssum" "$TEST_MNT"
+ run_check $SUDO_HELPER umount "$TEST_MNT"
+}
+
+workload -O no-holes
+workload -O ^no-holes
+run_check rm -rf -- "$tmp"
+cleanup_loopdevs
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread