* [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support
@ 2024-05-29 7:13 Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 01/10] btrfs-progs: rename block_count to byte_count Naohiro Aota
` (10 more replies)
0 siblings, 11 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota
mkfs.btrfs -b <byte_count> on a zoned device has several issues, listed
below.
- The FS size needs to be larger than the minimum size that can host a btrfs,
but the calculation does not consider non-SINGLE profiles
- The calculation also does not ensure room for the tree-log BG and the data
relocation BG
- It allows creating a FS not aligned to the zone boundary
- It resets all device zones, including those beyond the specified length
This series fixes these issues, along with some cleanups.
This one passed CI workflow here:
Patches 1 to 3 are cleanup patches, so they should not change the behavior.
Patches 4 to 6 address the issues.
Patches 7 to 10 add/modify the test cases. First, patch 7 adds nullb
functions to use in later patches. Patch 8 adds a new test for
zone resetting. Patches 9 and 10 rewrite existing tests with the
nullb helpers.
Changes:
- v4:
- Fix source directory size alignment.
- v3: https://lore.kernel.org/linux-btrfs/dfd8887b-a2cb-425f-8705-0d6a94cefb9c@gmx.com/
- Tweak minimum FS size calculation style.
- Round down the specified byte_count towards sectorsize and zone
size, instead of banning unaligned byte_count.
- Add active zone description in the commit log of patch 6.
- Add nullb test functions and use them in tests.
- v2: https://lore.kernel.org/linux-btrfs/20240514182227.1197664-1-naohiro.aota@wdc.com/
- fix function declaration on older distro (non-ZONED setup)
- fix mkfs test failure
- v1: https://lore.kernel.org/linux-btrfs/20240514005133.44786-1-naohiro.aota@wdc.com/
Naohiro Aota (10):
btrfs-progs: rename block_count to byte_count
btrfs-progs: mkfs: remove duplicated device size check
btrfs-progs: mkfs: unify zoned mode minimum size calc into
btrfs_min_dev_size()
btrfs-progs: mkfs: fix minimum size calculation for zoned mode
btrfs-progs: mkfs: align byte_count with sectorsize and zone size
btrfs-progs: support byte length for zone resetting
btrfs-progs: test: add nullb setup functions
btrfs-progs: test: add test for zone resetting
btrfs-progs: test: use nullb helper and smaller zone size
btrfs-progs: test: use nullb helpers in 031-zoned-bgt
common/device-utils.c | 45 +++++++-----
kernel-shared/zoned.c | 23 +++++-
kernel-shared/zoned.h | 7 +-
mkfs/common.c | 62 ++++++++++++++---
mkfs/common.h | 2 +-
mkfs/main.c | 89 ++++++++++--------------
tests/common | 63 +++++++++++++++++
tests/mkfs-tests/030-zoned-rst/test.sh | 14 ++--
tests/mkfs-tests/031-zoned-bgt/test.sh | 30 ++------
tests/mkfs-tests/032-zoned-reset/test.sh | 43 ++++++++++++
10 files changed, 260 insertions(+), 118 deletions(-)
create mode 100755 tests/mkfs-tests/032-zoned-reset/test.sh
--
2.45.1
* [PATCH v4 01/10] btrfs-progs: rename block_count to byte_count
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 02/10] btrfs-progs: mkfs: remove duplicated device size check Naohiro Aota
` (9 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
block_count and dev_block_count count the size in bytes, so comparing
them with, e.g., "min_dev_size" is confusing. Rename them to represent
the unit better.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
common/device-utils.c | 28 +++++++++++-----------
mkfs/main.c | 56 +++++++++++++++++++++----------------------
2 files changed, 42 insertions(+), 42 deletions(-)
diff --git a/common/device-utils.c b/common/device-utils.c
index d086e9ea2564..86942e0c7041 100644
--- a/common/device-utils.c
+++ b/common/device-utils.c
@@ -222,11 +222,11 @@ out:
* - reset zones
* - delete end of the device
*/
-int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
- u64 max_block_count, unsigned opflags)
+int btrfs_prepare_device(int fd, const char *file, u64 *byte_count_ret,
+ u64 max_byte_count, unsigned opflags)
{
struct btrfs_zoned_device_info *zinfo = NULL;
- u64 block_count;
+ u64 byte_count;
struct stat st;
int i, ret;
@@ -236,13 +236,13 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
return 1;
}
- block_count = device_get_partition_size_fd_stat(fd, &st);
- if (block_count == 0) {
+ byte_count = device_get_partition_size_fd_stat(fd, &st);
+ if (byte_count == 0) {
error("unable to determine size of %s", file);
return 1;
}
- if (max_block_count)
- block_count = min(block_count, max_block_count);
+ if (max_byte_count)
+ byte_count = min(byte_count, max_byte_count);
if (opflags & PREP_DEVICE_ZONED) {
ret = btrfs_get_zone_info(fd, file, &zinfo);
@@ -276,18 +276,18 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
if (discard_supported(file)) {
if (opflags & PREP_DEVICE_VERBOSE)
printf("Performing full device TRIM %s (%s) ...\n",
- file, pretty_size(block_count));
- device_discard_blocks(fd, 0, block_count);
+ file, pretty_size(byte_count));
+ device_discard_blocks(fd, 0, byte_count);
}
}
- ret = zero_dev_clamped(fd, zinfo, 0, ZERO_DEV_BYTES, block_count);
+ ret = zero_dev_clamped(fd, zinfo, 0, ZERO_DEV_BYTES, byte_count);
for (i = 0 ; !ret && i < BTRFS_SUPER_MIRROR_MAX; i++)
ret = zero_dev_clamped(fd, zinfo, btrfs_sb_offset(i),
- BTRFS_SUPER_INFO_SIZE, block_count);
+ BTRFS_SUPER_INFO_SIZE, byte_count);
if (!ret && (opflags & PREP_DEVICE_ZERO_END))
- ret = zero_dev_clamped(fd, zinfo, block_count - ZERO_DEV_BYTES,
- ZERO_DEV_BYTES, block_count);
+ ret = zero_dev_clamped(fd, zinfo, byte_count - ZERO_DEV_BYTES,
+ ZERO_DEV_BYTES, byte_count);
if (ret < 0) {
errno = -ret;
@@ -302,7 +302,7 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
}
free(zinfo);
- *block_count_ret = block_count;
+ *byte_count_ret = byte_count;
return 0;
err:
diff --git a/mkfs/main.c b/mkfs/main.c
index a467795d4428..950f76101058 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -80,8 +80,8 @@ static int opt_oflags = O_RDWR;
struct prepare_device_progress {
int fd;
char *file;
- u64 dev_block_count;
- u64 block_count;
+ u64 dev_byte_count;
+ u64 byte_count;
int ret;
};
@@ -1159,8 +1159,8 @@ static void *prepare_one_device(void *ctx)
}
prepare_ctx->ret = btrfs_prepare_device(prepare_ctx->fd,
prepare_ctx->file,
- &prepare_ctx->dev_block_count,
- prepare_ctx->block_count,
+ &prepare_ctx->dev_byte_count,
+ prepare_ctx->byte_count,
(bconf.verbose ? PREP_DEVICE_VERBOSE : 0) |
(opt_zero_end ? PREP_DEVICE_ZERO_END : 0) |
(opt_discard ? PREP_DEVICE_DISCARD : 0) |
@@ -1204,8 +1204,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
bool metadata_profile_set = false;
u64 data_profile = 0;
bool data_profile_set = false;
- u64 block_count = 0;
- u64 dev_block_count = 0;
+ u64 byte_count = 0;
+ u64 dev_byte_count = 0;
bool mixed = false;
char *label = NULL;
int nr_global_roots = sysconf(_SC_NPROCESSORS_ONLN);
@@ -1347,7 +1347,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
sectorsize = arg_strtou64_with_suffix(optarg);
break;
case 'b':
- block_count = arg_strtou64_with_suffix(optarg);
+ byte_count = arg_strtou64_with_suffix(optarg);
opt_zero_end = false;
break;
case 'v':
@@ -1623,34 +1623,34 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
* Block_count not specified, use file/device size first.
* Or we will always use source_dir_size calculated for mkfs.
*/
- if (!block_count)
- block_count = device_get_partition_size_fd_stat(fd, &statbuf);
+ if (!byte_count)
+ byte_count = device_get_partition_size_fd_stat(fd, &statbuf);
source_dir_size = btrfs_mkfs_size_dir(source_dir, sectorsize,
min_dev_size, metadata_profile, data_profile);
- if (block_count < source_dir_size) {
+ if (byte_count < source_dir_size) {
if (S_ISREG(statbuf.st_mode)) {
- block_count = source_dir_size;
+ byte_count = source_dir_size;
} else {
warning(
"the target device %llu (%s) is smaller than the calculated source directory size %llu (%s), mkfs may fail",
- block_count, pretty_size(block_count),
+ byte_count, pretty_size(byte_count),
source_dir_size, pretty_size(source_dir_size));
}
}
- ret = zero_output_file(fd, block_count);
+ ret = zero_output_file(fd, byte_count);
if (ret) {
error("unable to zero the output file");
close(fd);
goto error;
}
/* our "device" is the new image file */
- dev_block_count = block_count;
+ dev_byte_count = byte_count;
close(fd);
}
- /* Check device/block_count after the nodesize is determined */
- if (block_count && block_count < min_dev_size) {
+ /* Check device/byte_count after the nodesize is determined */
+ if (byte_count && byte_count < min_dev_size) {
error("size %llu is too small to make a usable filesystem",
- block_count);
+ byte_count);
error("minimum size for btrfs filesystem is %llu",
min_dev_size);
goto error;
@@ -1661,9 +1661,9 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
* 1 zone for a metadata block group
* 1 zone for a data block group
*/
- if (opt_zoned && block_count && block_count < 5 * zone_size(file)) {
+ if (opt_zoned && byte_count && byte_count < 5 * zone_size(file)) {
error("size %llu is too small to make a usable filesystem",
- block_count);
+ byte_count);
error("minimum size for a zoned btrfs filesystem is %llu",
min_dev_size);
goto error;
@@ -1741,8 +1741,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
/* Start threads */
for (i = 0; i < device_count; i++) {
prepare_ctx[i].file = argv[optind + i - 1];
- prepare_ctx[i].block_count = block_count;
- prepare_ctx[i].dev_block_count = block_count;
+ prepare_ctx[i].byte_count = byte_count;
+ prepare_ctx[i].dev_byte_count = byte_count;
ret = pthread_create(&t_prepare[i], NULL, prepare_one_device,
&prepare_ctx[i]);
if (ret) {
@@ -1763,16 +1763,16 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
goto error;
}
- dev_block_count = prepare_ctx[0].dev_block_count;
- if (block_count && block_count > dev_block_count) {
+ dev_byte_count = prepare_ctx[0].dev_byte_count;
+ if (byte_count && byte_count > dev_byte_count) {
error("%s is smaller than requested size, expected %llu, found %llu",
- file, block_count, dev_block_count);
+ file, byte_count, dev_byte_count);
goto error;
}
/* To create the first block group and chunk 0 in make_btrfs */
system_group_size = (opt_zoned ? zone_size(file) : BTRFS_MKFS_SYSTEM_GROUP_SIZE);
- if (dev_block_count < system_group_size) {
+ if (dev_byte_count < system_group_size) {
error("device is too small to make filesystem, must be at least %llu",
system_group_size);
goto error;
@@ -1794,7 +1794,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
mkfs_cfg.label = label;
memcpy(mkfs_cfg.fs_uuid, fs_uuid, sizeof(mkfs_cfg.fs_uuid));
memcpy(mkfs_cfg.dev_uuid, dev_uuid, sizeof(mkfs_cfg.dev_uuid));
- mkfs_cfg.num_bytes = dev_block_count;
+ mkfs_cfg.num_bytes = dev_byte_count;
mkfs_cfg.nodesize = nodesize;
mkfs_cfg.sectorsize = sectorsize;
mkfs_cfg.stripesize = stripesize;
@@ -1889,7 +1889,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
file);
continue;
}
- dev_block_count = prepare_ctx[i].dev_block_count;
+ dev_byte_count = prepare_ctx[i].dev_byte_count;
if (prepare_ctx[i].ret) {
errno = -prepare_ctx[i].ret;
@@ -1898,7 +1898,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
}
ret = btrfs_add_to_fsid(trans, root, prepare_ctx[i].fd,
- prepare_ctx[i].file, dev_block_count,
+ prepare_ctx[i].file, dev_byte_count,
sectorsize, sectorsize, sectorsize);
if (ret) {
error("unable to add %s to filesystem: %d",
--
2.45.1
* [PATCH v4 02/10] btrfs-progs: mkfs: remove duplicated device size check
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 01/10] btrfs-progs: rename block_count to byte_count Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 03/10] btrfs-progs: mkfs: unify zoned mode minimum size calc into btrfs_min_dev_size() Naohiro Aota
` (8 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
test_minimum_size() already checks if each device can host the initial
block groups. There is no need to check if the first device can host the
initial system chunk again.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
mkfs/main.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/mkfs/main.c b/mkfs/main.c
index 950f76101058..f6f67abf3b0e 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1189,7 +1189,6 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
struct prepare_device_progress *prepare_ctx = NULL;
struct mkfs_allocation allocation = { 0 };
struct btrfs_mkfs_config mkfs_cfg;
- u64 system_group_size;
/* Options */
bool force_overwrite = false;
struct btrfs_mkfs_features features = btrfs_mkfs_default_features;
@@ -1770,14 +1769,6 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
goto error;
}
- /* To create the first block group and chunk 0 in make_btrfs */
- system_group_size = (opt_zoned ? zone_size(file) : BTRFS_MKFS_SYSTEM_GROUP_SIZE);
- if (dev_byte_count < system_group_size) {
- error("device is too small to make filesystem, must be at least %llu",
- system_group_size);
- goto error;
- }
-
if (btrfs_bg_type_to_tolerated_failures(metadata_profile) <
btrfs_bg_type_to_tolerated_failures(data_profile))
warning("metadata has lower redundancy than data!\n");
--
2.45.1
* [PATCH v4 03/10] btrfs-progs: mkfs: unify zoned mode minimum size calc into btrfs_min_dev_size()
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 01/10] btrfs-progs: rename block_count to byte_count Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 02/10] btrfs-progs: mkfs: remove duplicated device size check Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 04/10] btrfs-progs: mkfs: fix minimum size calculation for zoned mode Naohiro Aota
` (7 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
We are going to implement a better minimum size calculation for the zoned
mode. Move the current logic to btrfs_min_dev_size() and unify the size
checking path.
Also, convert "int mixed" to "bool mixed" while at it.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
mkfs/common.c | 11 ++++++++++-
mkfs/common.h | 2 +-
mkfs/main.c | 22 +++++-----------------
3 files changed, 16 insertions(+), 19 deletions(-)
diff --git a/mkfs/common.c b/mkfs/common.c
index e61020002417..2550c2219c90 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -811,13 +811,22 @@ static u64 btrfs_min_global_blk_rsv_size(u32 nodesize)
return (u64)nodesize << 10;
}
-u64 btrfs_min_dev_size(u32 nodesize, int mixed, u64 meta_profile,
+u64 btrfs_min_dev_size(u32 nodesize, bool mixed, u64 zone_size, u64 meta_profile,
u64 data_profile)
{
u64 reserved = 0;
u64 meta_size;
u64 data_size;
+ /*
+ * 2 zones for the primary superblock
+ * 1 zone for the system block group
+ * 1 zone for a metadata block group
+ * 1 zone for a data block group
+ */
+ if (zone_size)
+ return 5 * zone_size;
+
if (mixed)
return 2 * (BTRFS_MKFS_SYSTEM_GROUP_SIZE +
btrfs_min_global_blk_rsv_size(nodesize));
diff --git a/mkfs/common.h b/mkfs/common.h
index d9183c997bb2..de0ff57beee8 100644
--- a/mkfs/common.h
+++ b/mkfs/common.h
@@ -105,7 +105,7 @@ struct btrfs_mkfs_config {
int make_btrfs(int fd, struct btrfs_mkfs_config *cfg);
int btrfs_make_root_dir(struct btrfs_trans_handle *trans,
struct btrfs_root *root, u64 objectid);
-u64 btrfs_min_dev_size(u32 nodesize, int mixed, u64 meta_profile,
+u64 btrfs_min_dev_size(u32 nodesize, bool mixed, u64 zone_size, u64 meta_profile,
u64 data_profile);
int test_minimum_size(const char *file, u64 min_dev_size);
int is_vol_small(const char *file);
diff --git a/mkfs/main.c b/mkfs/main.c
index f6f67abf3b0e..a437ecc40c7f 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1588,8 +1588,9 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
goto error;
}
- min_dev_size = btrfs_min_dev_size(nodesize, mixed, metadata_profile,
- data_profile);
+ min_dev_size = btrfs_min_dev_size(nodesize, mixed,
+ opt_zoned ? zone_size(file) : 0,
+ metadata_profile, data_profile);
/*
* Enlarge the destination file or create a new one, using the size
* calculated from source dir.
@@ -1650,21 +1651,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
if (byte_count && byte_count < min_dev_size) {
error("size %llu is too small to make a usable filesystem",
byte_count);
- error("minimum size for btrfs filesystem is %llu",
- min_dev_size);
- goto error;
- }
- /*
- * 2 zones for the primary superblock
- * 1 zone for the system block group
- * 1 zone for a metadata block group
- * 1 zone for a data block group
- */
- if (opt_zoned && byte_count && byte_count < 5 * zone_size(file)) {
- error("size %llu is too small to make a usable filesystem",
- byte_count);
- error("minimum size for a zoned btrfs filesystem is %llu",
- min_dev_size);
+ error("minimum size for a %sbtrfs filesystem is %llu",
+ opt_zoned ? "zoned mode " : "", min_dev_size);
goto error;
}
--
2.45.1
* [PATCH v4 04/10] btrfs-progs: mkfs: fix minimum size calculation for zoned mode
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (2 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 03/10] btrfs-progs: mkfs: unify zoned mode minimum size calc into btrfs_min_dev_size() Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 05/10] btrfs-progs: mkfs: align byte_count with sectorsize and zone size Naohiro Aota
` (6 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
Currently, we check whether a device is larger than 5 zones to determine
whether we can create a btrfs on it. Actually, we need more zones to create
DUP block groups, so mkfs fails with "ERROR: not enough free space to
allocate chunk". Implement proper support for non-SINGLE profiles.
Also, the current code does not ensure we can create the tree-log BG and the
data relocation BG, which are essential for real usage. Count them as
requirements too.
The calculation for a regular btrfs is also adjusted to use the dev_stripes
style.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
mkfs/common.c | 67 +++++++++++++++++++++++++++++++++++++--------------
1 file changed, 49 insertions(+), 18 deletions(-)
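
Editorial sketch (not part of the patch): the zone accounting described in
the commit message above, written as a standalone function so the totals are
easy to check by hand. The sketch_* names and the three-value profile enum
are assumptions made for illustration; the real code tests meta_profile and
data_profile against BTRFS_BLOCK_GROUP_PROFILE_MASK and
BTRFS_BLOCK_GROUP_DUP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the BTRFS_BLOCK_GROUP_* profile bits. */
enum sketch_profile {
	SKETCH_SINGLE,	/* no profile bit set */
	SKETCH_DUP,	/* two stripes on the same device */
	SKETCH_OTHER,	/* any other non-SINGLE profile, one stripe per device */
};

/* DUP places two stripes on one device; everything else places one. */
static uint64_t sketch_dev_stripes(enum sketch_profile profile)
{
	return profile == SKETCH_DUP ? 2 : 1;
}

/* Minimum per-device size in bytes for a zoned filesystem. */
uint64_t sketch_zoned_min_dev_size(uint64_t zone_size,
				   enum sketch_profile meta,
				   enum sketch_profile data)
{
	uint64_t reserved = 0;

	assert(zone_size > 0);

	/* 2 zones for the primary superblock. */
	reserved += 2 * zone_size;

	/* 1 zone each for the initial SINGLE system, metadata and data BGs. */
	reserved += 3 * zone_size;

	/*
	 * Non-SINGLE metadata: real system BG + real metadata BG + tree-log BG.
	 * SINGLE metadata: only the tree-log BG is added.
	 */
	reserved += (meta != SKETCH_SINGLE ? 3 : 1) *
		    sketch_dev_stripes(meta) * zone_size;

	/*
	 * Non-SINGLE data: real data BG + data relocation BG.
	 * SINGLE data: only the data relocation BG is added.
	 */
	reserved += (data != SKETCH_SINGLE ? 2 : 1) *
		    sketch_dev_stripes(data) * zone_size;

	return reserved;
}
```

Under these assumptions, SINGLE metadata with SINGLE data needs 7 zones,
DUP metadata with SINGLE data needs 12, and DUP for both needs 15, all
larger than the 5 zones checked before this patch.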
diff --git a/mkfs/common.c b/mkfs/common.c
index 2550c2219c90..1b09c8b1a673 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -817,15 +817,50 @@ u64 btrfs_min_dev_size(u32 nodesize, bool mixed, u64 zone_size, u64 meta_profile
u64 reserved = 0;
u64 meta_size;
u64 data_size;
+ u64 dev_stripes;
- /*
- * 2 zones for the primary superblock
- * 1 zone for the system block group
- * 1 zone for a metadata block group
- * 1 zone for a data block group
- */
- if (zone_size)
- return 5 * zone_size;
+ if (zone_size) {
+ /* 2 zones for the primary superblock. */
+ reserved += 2 * zone_size;
+
+ /*
+ * 1 zone each for the initial SINGLE system, SINGLE
+ * metadata, and SINGLE data block group
+ */
+ reserved += 3 * zone_size;
+
+ /*
+ * On non-SINGLE profile, we need to add real system and
+ * metadata block group. And, we also need to add a space
+ * for a tree-log block group.
+ *
+ * SINGLE profile can reuse the initial block groups and
+ * only need to add a tree-log block group
+ */
+ dev_stripes = (meta_profile & BTRFS_BLOCK_GROUP_DUP) ? 2 : 1;
+ if (meta_profile & BTRFS_BLOCK_GROUP_PROFILE_MASK)
+ meta_size = 3 * dev_stripes * zone_size;
+ else
+ meta_size = dev_stripes * zone_size;
+ reserved += meta_size;
+
+ /*
+ * On non-SINGLE profile, we need to add real data block
+ * group. And, we also need to add a space for a data
+ * relocation block group.
+ *
+ * SINGLE profile can reuse the initial block groups and
+ * only need to add a data relocation block group.
+ */
+ dev_stripes = (data_profile & BTRFS_BLOCK_GROUP_DUP) ? 2 : 1;
+ if (data_profile & BTRFS_BLOCK_GROUP_PROFILE_MASK)
+ data_size = 2 * dev_stripes * zone_size;
+ else
+ data_size = dev_stripes * zone_size;
+ reserved += data_size;
+
+ return reserved;
+ }
if (mixed)
return 2 * (BTRFS_MKFS_SYSTEM_GROUP_SIZE +
@@ -863,22 +898,18 @@ u64 btrfs_min_dev_size(u32 nodesize, bool mixed, u64 zone_size, u64 meta_profile
*
* And use the stripe size to calculate its physical used space.
*/
+ dev_stripes = (meta_profile & BTRFS_BLOCK_GROUP_DUP) ? 2 : 1;
if (meta_profile & BTRFS_BLOCK_GROUP_PROFILE_MASK)
- meta_size = SZ_8M + SZ_32M;
+ meta_size = dev_stripes * (SZ_8M + SZ_32M);
else
- meta_size = SZ_8M + SZ_8M;
- /* For DUP/metadata, 2 stripes on one disk */
- if (meta_profile & BTRFS_BLOCK_GROUP_DUP)
- meta_size *= 2;
+ meta_size = dev_stripes * (SZ_8M + SZ_8M);
reserved += meta_size;
+ dev_stripes = (data_profile & BTRFS_BLOCK_GROUP_DUP) ? 2 : 1;
if (data_profile & BTRFS_BLOCK_GROUP_PROFILE_MASK)
- data_size = SZ_64M;
+ data_size = dev_stripes * SZ_64M;
else
- data_size = SZ_8M;
- /* For DUP/data, 2 stripes on one disk */
- if (data_profile & BTRFS_BLOCK_GROUP_DUP)
- data_size *= 2;
+ data_size = dev_stripes * SZ_8M;
reserved += data_size;
return reserved;
--
2.45.1
* [PATCH v4 05/10] btrfs-progs: mkfs: align byte_count with sectorsize and zone size
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (3 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 04/10] btrfs-progs: mkfs: fix minimum size calculation for zoned mode Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:45 ` Qu Wenruo
2024-05-29 7:13 ` [PATCH v4 06/10] btrfs-progs: support byte length for zone resetting Naohiro Aota
` (5 subsequent siblings)
10 siblings, 1 reply; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota
While "byte_count" is eventually rounded down to sectorsize in make_btrfs()
or btrfs_add_to_fsid(), it is better to round it down first and then do the
size checks, to avoid confusion.
Also, on a zoned device, creating a btrfs whose size is not aligned to the
zone boundary can be confusing. Round it down further to the zone boundary.
The size calculation with a source directory is also tweaked to be aligned.
The value from device_get_partition_size_fd_stat() must be aligned down so
as not to exceed the device size. And btrfs_mkfs_size_dir() should return a
sectorsize-aligned size, so add a UASSERT for it.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
mkfs/main.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
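
Editorial sketch (not part of the patch): the two-step rounding described in
the commit message above, as a standalone function. The sketch_* names are
assumptions for illustration. The modulo-based helper below is a swapped-in
simplification that works for any non-zero alignment; it is not the
round_down() macro that btrfs-progs actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Round value down to a multiple of align (any non-zero align). */
static uint64_t sketch_round_down(uint64_t value, uint64_t align)
{
	assert(align != 0);
	return value - (value % align);
}

/*
 * Align a user-supplied "-b" byte count.
 * byte_count == 0 means "not specified" and is passed through untouched.
 * zone_size == 0 means the device is not used in zoned mode.
 */
uint64_t sketch_align_byte_count(uint64_t byte_count, uint64_t sectorsize,
				 uint64_t zone_size)
{
	if (byte_count == 0)
		return 0;

	/* First to the sector size, as a regular filesystem would need. */
	byte_count = sketch_round_down(byte_count, sectorsize);

	/* Then further down to the zone boundary in zoned mode. */
	if (zone_size)
		byte_count = sketch_round_down(byte_count, zone_size);

	return byte_count;
}
```

For example, with a 4096-byte sectorsize a request of 1000000 bytes becomes
999424, and with an additional (hypothetical) 262144-byte zone size it
becomes 786432. Because the result can only shrink, the minimum size check
should run on the aligned value.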
diff --git a/mkfs/main.c b/mkfs/main.c
index a437ecc40c7f..3446a5b1222f 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1591,6 +1591,12 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
min_dev_size = btrfs_min_dev_size(nodesize, mixed,
opt_zoned ? zone_size(file) : 0,
metadata_profile, data_profile);
+ if (byte_count) {
+ byte_count = round_down(byte_count, sectorsize);
+ if (opt_zoned)
+ byte_count = round_down(byte_count, zone_size(file));
+ }
+
/*
* Enlarge the destination file or create a new one, using the size
* calculated from source dir.
@@ -1624,9 +1630,11 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
* Or we will always use source_dir_size calculated for mkfs.
*/
if (!byte_count)
- byte_count = device_get_partition_size_fd_stat(fd, &statbuf);
+ byte_count = round_down(device_get_partition_size_fd_stat(fd, &statbuf),
+ sectorsize);
source_dir_size = btrfs_mkfs_size_dir(source_dir, sectorsize,
min_dev_size, metadata_profile, data_profile);
+ UASSERT(IS_ALIGNED(source_dir_size, sectorsize));
if (byte_count < source_dir_size) {
if (S_ISREG(statbuf.st_mode)) {
byte_count = source_dir_size;
--
2.45.1
* [PATCH v4 06/10] btrfs-progs: support byte length for zone resetting
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (4 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 05/10] btrfs-progs: mkfs: align byte_count with sectorsize and zone size Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-06-03 19:26 ` David Sterba
2024-05-29 7:13 ` [PATCH v4 07/10] btrfs-progs: test: add nullb setup functions Naohiro Aota
` (4 subsequent siblings)
10 siblings, 1 reply; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
Even with "mkfs.btrfs -b", mkfs.btrfs resets all the zones on the device.
Limit the reset target to the specified length.
Also, we need to check that there is no active zone outside of the FS
range. Having an active zone outside the FS reduces the number of zones btrfs
can write simultaneously. Technically, we could still scan all the device
zones, keep active zones outside the FS intact, and try to live with the
limited active zones. But that would make btrfs operations harder.
It is generally a bad idea to use "-b" for non-test usage on a device with
an active zone limit in the first place. You really need to take care that
the FS and the area outside the FS together do not go over the limit. That
means you'll never be able to use zones outside the FS anyway.
So, until there is a strong request for that, I don't think it's worthwhile
to do so.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
common/device-utils.c | 17 ++++++++++++-----
kernel-shared/zoned.c | 23 ++++++++++++++++++++++-
kernel-shared/zoned.h | 7 ++++---
3 files changed, 38 insertions(+), 9 deletions(-)
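
Editorial sketch (not part of the patch): the two-pass behavior described in
the commit message above, modeled on an in-memory zone array instead of a
real device. All sketch_* names and the struct layout are assumptions for
illustration; the real code works on struct btrfs_zoned_device_info, issues
zone resets or discards through the block layer, and uses the
BLK_ZONE_COND_* values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified zone conditions, loosely mirroring BLK_ZONE_COND_*. */
enum sketch_zone_cond {
	SKETCH_EMPTY,
	SKETCH_IMP_OPEN,
	SKETCH_EXP_OPEN,
	SKETCH_CLOSED,
	SKETCH_FULL,
};

struct sketch_zone {
	int conventional;		/* non-zero for a conventional zone */
	enum sketch_zone_cond cond;
	int cleared;			/* set once the zone was reset/discarded */
};

/*
 * Clear the zones covering the first byte_count bytes, then make sure no
 * sequential zone beyond that range is active (open or closed).
 * Returns 0 on success or EBUSY if an active zone lies outside the range.
 */
int sketch_reset_zones(struct sketch_zone *zones, size_t nr_zones,
		       uint64_t zone_size, uint64_t byte_count)
{
	size_t i;

	assert(zones != NULL);
	assert(zone_size != 0 && byte_count % zone_size == 0);

	/* Pass 1: reset or discard only the zones inside the FS range. */
	for (i = 0; i < nr_zones && byte_count > 0; i++) {
		if (!zones[i].conventional)
			zones[i].cond = SKETCH_EMPTY;
		zones[i].cleared = 1;
		byte_count -= zone_size;
	}

	/* Pass 2: zones outside the range stay intact but must be inactive. */
	for (; i < nr_zones; i++) {
		if (zones[i].conventional)
			continue;
		if (zones[i].cond == SKETCH_IMP_OPEN ||
		    zones[i].cond == SKETCH_EXP_OPEN ||
		    zones[i].cond == SKETCH_CLOSED)
			return EBUSY;
	}

	return 0;
}
```

As in the patch, a full or empty zone outside the range is left alone and
does not cause an error; only open or closed zones count as active. Note
that the in-range zones have already been cleared by the time EBUSY is
returned.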
diff --git a/common/device-utils.c b/common/device-utils.c
index 86942e0c7041..7df7d9ce39d8 100644
--- a/common/device-utils.c
+++ b/common/device-utils.c
@@ -254,16 +254,23 @@ int btrfs_prepare_device(int fd, const char *file, u64 *byte_count_ret,
if (!zinfo->emulated) {
if (opflags & PREP_DEVICE_VERBOSE)
- printf("Resetting device zones %s (%u zones) ...\n",
- file, zinfo->nr_zones);
+ printf("Resetting device zones %s (%llu zones) ...\n",
+ file, byte_count / zinfo->zone_size);
/*
* We cannot ignore zone reset errors for a zoned block
* device as this could result in the inability to write
* to non-empty sequential zones of the device.
*/
- if (btrfs_reset_all_zones(fd, zinfo)) {
- error("zoned: failed to reset device '%s' zones: %m",
- file);
+ ret = btrfs_reset_zones(fd, zinfo, byte_count);
+ if (ret) {
+ if (ret == EBUSY) {
+ error("zoned: device '%s' contains an active zone outside of the FS range",
+ file);
+ error("zoned: btrfs needs full control of active zones");
+ } else {
+ error("zoned: failed to reset device '%s' zones: %m",
+ file);
+ }
goto err;
}
}
diff --git a/kernel-shared/zoned.c b/kernel-shared/zoned.c
index fb1e1388804e..b4244966ca36 100644
--- a/kernel-shared/zoned.c
+++ b/kernel-shared/zoned.c
@@ -395,16 +395,24 @@ static int report_zones(int fd, const char *file,
* Discard blocks in the zones of a zoned block device. Process this with zone
* size granularity so that blocks in conventional zones are discarded using
* discard_range and blocks in sequential zones are reset though a zone reset.
+ *
+ * We need to ensure that zones outside of the FS is not active, so that
+ * the FS can use all the active zones. Return EBUSY if there is an active
+ * zone.
*/
-int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
+int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count)
{
unsigned int i;
int ret = 0;
ASSERT(zinfo);
+ ASSERT(IS_ALIGNED(byte_count, zinfo->zone_size));
/* Zone size granularity */
for (i = 0; i < zinfo->nr_zones; i++) {
+ if (byte_count == 0)
+ break;
+
if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL) {
ret = device_discard_blocks(fd,
zinfo->zones[i].start << SECTOR_SHIFT,
@@ -419,7 +427,20 @@ int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
if (ret)
return ret;
+
+ byte_count -= zinfo->zone_size;
}
+ for (; i < zinfo->nr_zones; i++) {
+ const enum blk_zone_cond cond = zinfo->zones[i].cond;
+
+ if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL)
+ continue;
+ if (cond == BLK_ZONE_COND_IMP_OPEN ||
+ cond == BLK_ZONE_COND_EXP_OPEN ||
+ cond == BLK_ZONE_COND_CLOSED)
+ return EBUSY;
+ }
+
return fsync(fd);
}
diff --git a/kernel-shared/zoned.h b/kernel-shared/zoned.h
index 6eba86d266bf..2bf24cbba62a 100644
--- a/kernel-shared/zoned.h
+++ b/kernel-shared/zoned.h
@@ -149,7 +149,7 @@ bool btrfs_redirty_extent_buffer_for_zoned(struct btrfs_fs_info *fs_info,
u64 start, u64 end);
int btrfs_reset_chunk_zones(struct btrfs_fs_info *fs_info, u64 devid,
u64 offset, u64 length);
-int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo);
+int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count);
int zero_zone_blocks(int fd, struct btrfs_zoned_device_info *zinfo, off_t start,
size_t len);
int btrfs_wipe_temporary_sb(struct btrfs_fs_devices *fs_devices);
@@ -203,8 +203,9 @@ static inline int btrfs_reset_chunk_zones(struct btrfs_fs_info *fs_info,
return 0;
}
-static inline int btrfs_reset_all_zones(int fd,
- struct btrfs_zoned_device_info *zinfo)
+static inline int btrfs_reset_zones(int fd,
+ struct btrfs_zoned_device_info *zinfo,
+ u64 byte_count)
{
return -EOPNOTSUPP;
}
--
2.45.1
* [PATCH v4 07/10] btrfs-progs: test: add nullb setup functions
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (5 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 06/10] btrfs-progs: support byte length for zone resetting Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 08/10] btrfs-progs: test: add test for zone resetting Naohiro Aota
` (3 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
Add functions to set up, create, and remove nullb devices.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
tests/common | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 63 insertions(+)
diff --git a/tests/common b/tests/common
index 1f880adead6d..ef9fcd32870a 100644
--- a/tests/common
+++ b/tests/common
@@ -882,6 +882,69 @@ cond_wait_for_loopdevs() {
fi
}
+# prepare environment for nullb devices, set up the following variables
+# - nullb_count -- number of desired devices
+# - nullb_size -- size of the devices
+# - nullb_zone_size -- zone size of the devices
+# - nullb_devs -- array containing paths to all devices (after prepare is called)
+#
+# $1: number of nullb devices to be set up
+# $2: size of the devices
+# $3: zone size of the devices
+setup_nullbdevs()
+{
+ if [ "$#" -lt 3 ]; then
+ _fail "setup_nullbdevs <number of device> <size> <zone size>"
+ fi
+
+ setup_root_helper
+ local nullb="${TEST_TOP}/nullb"
+
+ run_mayfail $SUDO_HELPER "${nullb}" setup
+ if [ $? != 0 ]; then
+ _not_run "cannot setup nullb environment for zoned devices"
+ fi
+
+ nullb_count="$1"
+ nullb_size="$2"
+ nullb_zone_size="$3"
+ declare -a nullb_devs
+}
+
+# create all nullb devices from a given nullb environment
+prepare_nullbdevs()
+{
+ setup_root_helper
+ local nullb="${TEST_TOP}/nullb"
+
+ # Record any other pre-existing devices in case creation fails
+ run_check $SUDO_HELPER "${nullb}" ls
+
+ for i in `seq ${nullb_count}`; do
+ # Last line has the name of the device node path
+ out=$(run_check_stdout $SUDO_HELPER "${nullb}" create -s "${nullb_size}" -z "${nullb_zone_size}")
+ if [ $? != 0 ]; then
+ _fail "cannot create nullb zoned device $i"
+ fi
+ dev=$(echo "${out}" | tail -n 1)
+ nullb_devs[$i]=${dev}
+ done
+
+ run_check $SUDO_HELPER "${nullb}" ls
+}
+
+# remove nullb devices
+cleanup_nullbdevs()
+{
+ setup_root_helper
+ local nullb="${TEST_TOP}/nullb"
+
+ for dev in "${nullb_devs[@]}"; do
+ name=$(basename "${dev}")
+ run_check $SUDO_HELPER "${nullb}" rm "${name}"
+ done
+}
+
init_env()
{
TEST_MNT="${TEST_MNT:-$TEST_TOP/mnt}"
--
2.45.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
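A note on variable scope in the helpers above, offered as a minimal sketch and not as a description of the author's intent: in bash, "declare" inside a function creates a function-local variable unless "-g" is given. The function and array names below are hypothetical and exist only for this illustration; they are not part of tests/common.

```bash
#!/bin/bash
# Illustration only: all names here are hypothetical, not from tests/common.

# declare without -g: the array is local and is gone when the function returns.
fill_local() {
	declare -a devs_local
	devs_local[1]="/dev/nullb0"
}

# declare -g: the array is created in the global scope.
fill_global() {
	declare -ga devs_global
	devs_global[1]="/dev/nullb0"
}

# Declare in one function, assign in another. The declare is local to
# declare_only, so the assignment in assign_later creates a new global array.
declare_only() {
	declare -a devs_split
}
assign_later() {
	devs_split[1]="/dev/nullb0"
}

fill_local
fill_global
declare_only
assign_later

echo "local:  ${#devs_local[@]} element(s) visible to the caller"
echo "global: ${#devs_global[@]} element(s) visible to the caller"
echo "split:  ${#devs_split[@]} element(s) visible to the caller"
```

The caller sees 0, 1 and 1 elements respectively. The third case is the shape of setup_nullbdevs() followed by prepare_nullbdevs(): the array ends up global because of the later assignment, not because of the declare.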
* [PATCH v4 08/10] btrfs-progs: test: add test for zone resetting
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (6 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 07/10] btrfs-progs: test: add nullb setup functions Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 09/10] btrfs-progs: test: use nullb helper and smaller zone size Naohiro Aota
` (2 subsequent siblings)
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
Add test for mkfs.btrfs's zone reset behavior to check if
- it resets all the zones without "-b" option
- it detects an active zone outside of the FS range
- it does not reset a zone outside of the range
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
tests/mkfs-tests/032-zoned-reset/test.sh | 43 ++++++++++++++++++++++++
1 file changed, 43 insertions(+)
create mode 100755 tests/mkfs-tests/032-zoned-reset/test.sh
diff --git a/tests/mkfs-tests/032-zoned-reset/test.sh b/tests/mkfs-tests/032-zoned-reset/test.sh
new file mode 100755
index 000000000000..2aedb14abb03
--- /dev/null
+++ b/tests/mkfs-tests/032-zoned-reset/test.sh
@@ -0,0 +1,43 @@
+#!/bin/bash
+# Verify mkfs.btrfs zone reset behavior on zoned devices, with and without -b
+
+source "$TEST_TOP/common" || exit
+
+check_global_prereq blkzone
+setup_root_helper
+# Create one 128M device with 4M zones, 32 of them
+setup_nullbdevs 1 128 4
+
+prepare_nullbdevs
+
+TEST_DEV="${nullb_devs[1]}"
+last_zone_sector=$(( 4 * 31 * 1024 * 1024 / 512 ))
+# Write some data to the last zone
+run_check $SUDO_HELPER dd if=/dev/urandom of="${TEST_DEV}" bs=1M count=4 seek=$(( 4 * 31 ))
+# Use single as it's supported on more kernels
+run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f -m single -d single "${TEST_DEV}"
+# Check if the last zone is empty
+run_check_stdout $SUDO_HELPER blkzone report -o ${last_zone_sector} -c 1 "${TEST_DEV}" | grep -Fq '(em)'
+if [ $? != 0 ]; then
+ _fail "last zone is not empty"
+fi
+
+# Write some data to the last zone
+run_check $SUDO_HELPER dd if=/dev/urandom of="${TEST_DEV}" bs=1M count=1 seek=$(( 4 * 31 ))
+# Create a FS excluding the last zone
+run_mayfail $SUDO_HELPER "$TOP/mkfs.btrfs" -f -b $(( 4 * 31 ))M -m single -d single "${TEST_DEV}"
+if [ $? == 0 ]; then
+ _fail "mkfs.btrfs should detect active zone outside of FS range"
+fi
+
+# Fill the last zone to finish it
+run_check $SUDO_HELPER dd if=/dev/urandom of="${TEST_DEV}" bs=1M count=3 seek=$(( 4 * 31 + 1 ))
+# Create a FS excluding the last zone
+run_mayfail $SUDO_HELPER "$TOP/mkfs.btrfs" -f -b $(( 4 * 31 ))M -m single -d single "${TEST_DEV}"
+# Check if the last zone is not empty
+run_check_stdout $SUDO_HELPER blkzone report -o ${last_zone_sector} -c 1 "${TEST_DEV}" | grep -Fq '(em)'
+if [ $? == 0 ]; then
+ _fail "last zone is empty"
+fi
+
+cleanup_nullbdevs
--
2.45.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
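The offsets used by the test above can be reproduced with plain shell arithmetic. This is a small sketch, assuming the nullb size and zone size arguments are in MiB (as the "bs=1M" dd calls in the test suggest) and that blkzone offsets are in 512-byte sectors. The variable names other than last_zone_sector are made up for the illustration.

```bash
#!/bin/bash
# Sizes in MiB, matching "setup_nullbdevs 1 128 4" in the test.
dev_size_mib=128
zone_size_mib=4
sector_size=512

# 128 MiB / 4 MiB = 32 zones, so the last zone has index 31.
nr_zones=$(( dev_size_mib / zone_size_mib ))
last_zone_index=$(( nr_zones - 1 ))

# Start of the last zone: this is the "seek" value for dd with bs=1M,
# and the "-b <N>M" value that excludes exactly the last zone.
last_zone_start_mib=$(( zone_size_mib * last_zone_index ))

# Same offset in 512-byte sectors, as passed to "blkzone report -o".
last_zone_sector=$(( last_zone_start_mib * 1024 * 1024 / sector_size ))

echo "zones=${nr_zones} last_zone_start=${last_zone_start_mib}MiB sector=${last_zone_sector}"
```

This prints "zones=32 last_zone_start=124MiB sector=253952", which matches the $(( 4 * 31 )) and $(( 4 * 31 * 1024 * 1024 / 512 )) expressions in the test.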
* [PATCH v4 09/10] btrfs-progs: test: use nullb helper and smaller zone size
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (7 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 08/10] btrfs-progs: test: add test for zone resetting Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 10/10] btrfs-progs: test: use nullb helpers in 031-zoned-bgt Naohiro Aota
2024-06-03 19:36 ` [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support David Sterba
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
With the change to the minimum number of zones, mkfs-tests/030-zoned-rst now
fails because the loopback device is 2GB and can contain only 8x 256MB zones.
Use the nullb helpers to choose a smaller zone size.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
tests/mkfs-tests/030-zoned-rst/test.sh | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/tests/mkfs-tests/030-zoned-rst/test.sh b/tests/mkfs-tests/030-zoned-rst/test.sh
index 2e048cf79f20..b1c696c96eb7 100755
--- a/tests/mkfs-tests/030-zoned-rst/test.sh
+++ b/tests/mkfs-tests/030-zoned-rst/test.sh
@@ -4,22 +4,22 @@
source "$TEST_TOP/common" || exit
setup_root_helper
-setup_loopdevs 4
-prepare_loopdevs
-TEST_DEV=${loopdevs[1]}
+setup_nullbdevs 4 128 4
+prepare_nullbdevs
+TEST_DEV=${nullb_devs[1]}
profiles="single dup raid1 raid1c3 raid1c4 raid10"
for dprofile in $profiles; do
for mprofile in $profiles; do
# It's sufficient to specify only 'zoned', the rst will be enabled
- run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f -O zoned -d "$dprofile" -m "$mprofile" "${loopdevs[@]}"
+ run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f -O zoned -d "$dprofile" -m "$mprofile" "${nullb_devs[@]}"
done
done
run_mustfail "unsupported profile raid56 created" \
- $SUDO_HELPER "$TOP/mkfs.btrfs" -f -O zoned -d raid5 -m raid5 "${loopdevs[@]}"
+ $SUDO_HELPER "$TOP/mkfs.btrfs" -f -O zoned -d raid5 -m raid5 "${nullb_devs[@]}"
run_mustfail "unsupported profile raid56 created" \
- $SUDO_HELPER "$TOP/mkfs.btrfs" -f -O zoned -d raid6 -m raid6 "${loopdevs[@]}"
+ $SUDO_HELPER "$TOP/mkfs.btrfs" -f -O zoned -d raid6 -m raid6 "${nullb_devs[@]}"
-cleanup_loopdevs
+cleanup_nullbdevs
--
2.45.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v4 10/10] btrfs-progs: test: use nullb helpers in 031-zoned-bgt
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (8 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 09/10] btrfs-progs: test: use nullb helper and smaller zone size Naohiro Aota
@ 2024-05-29 7:13 ` Naohiro Aota
2024-06-03 19:36 ` [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support David Sterba
10 siblings, 0 replies; 15+ messages in thread
From: Naohiro Aota @ 2024-05-29 7:13 UTC (permalink / raw)
To: linux-btrfs; +Cc: Naohiro Aota, Qu Wenruo
Rewrite 031-zoned-bgt with the nullb helpers.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
---
tests/mkfs-tests/031-zoned-bgt/test.sh | 30 +++++---------------------
1 file changed, 5 insertions(+), 25 deletions(-)
diff --git a/tests/mkfs-tests/031-zoned-bgt/test.sh b/tests/mkfs-tests/031-zoned-bgt/test.sh
index 91c107cd5a3b..e296c29b9238 100755
--- a/tests/mkfs-tests/031-zoned-bgt/test.sh
+++ b/tests/mkfs-tests/031-zoned-bgt/test.sh
@@ -4,37 +4,17 @@
source "$TEST_TOP/common" || exit
setup_root_helper
-prepare_test_dev
-
-nullb="$TEST_TOP/nullb"
# Create one 128M device with 4M zones, 32 of them
-size=128
-zone=4
-
-run_mayfail $SUDO_HELPER "$nullb" setup
-if [ $? != 0 ]; then
- _not_run "cannot setup nullb environment for zoned devices"
-fi
-
-# Record any other pre-existing devices in case creation fails
-run_check $SUDO_HELPER "$nullb" ls
-
-# Last line has the name of the device node path
-out=$(run_check_stdout $SUDO_HELPER "$nullb" create -s "$size" -z "$zone")
-if [ $? != 0 ]; then
- _fail "cannot create nullb zoned device $i"
-fi
-dev=$(echo "$out" | tail -n 1)
-name=$(basename "${dev}")
+setup_nullbdevs 1 128 4
-run_check $SUDO_HELPER "$nullb" ls
+prepare_nullbdevs
-TEST_DEV="${dev}"
+TEST_DEV="${nullb_devs[1]}"
# Use single as it's supported on more kernels
-run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -m single -d single -O block-group-tree "${dev}"
+run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -m single -d single -O block-group-tree "${TEST_DEV}"
run_check_mount_test_dev
run_check $SUDO_HELPER dd if=/dev/zero of="$TEST_MNT"/file bs=1M count=1
run_check $SUDO_HELPER "$TOP/btrfs" filesystem usage -T "$TEST_MNT"
run_check_umount_test_dev
-run_check $SUDO_HELPER "$nullb" rm "${name}"
+cleanup_nullbdevs
--
2.45.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v4 05/10] btrfs-progs: mkfs: align byte_count with sectorsize and zone size
2024-05-29 7:13 ` [PATCH v4 05/10] btrfs-progs: mkfs: align byte_count with sectorsize and zone size Naohiro Aota
@ 2024-05-29 7:45 ` Qu Wenruo
2024-05-30 17:26 ` David Sterba
0 siblings, 1 reply; 15+ messages in thread
From: Qu Wenruo @ 2024-05-29 7:45 UTC (permalink / raw)
To: Naohiro Aota, linux-btrfs, David Sterba
On 2024/5/29 16:43, Naohiro Aota wrote:
> While "byte_count" is eventually rounded down to sectorsize at make_btrfs()
> or btrfs_add_to_fs_id(), it would be better round it down first and do the
> size checks not to confuse the things.
>
> Also, on a zoned device, creating a btrfs whose size is not aligned to the
> zone boundary can be confusing. Round it down further to the zone boundary.
>
> The size calculation with a source directory is also tweaked to be aligned.
> device_get_partition_size_fd_stat() must be aligned down not to exceed the
> device size. And, btrfs_mkfs_size_dir() should have return sectorsize aligned
> size. So, add an UASSERT for it.
>
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
To David, since I have the write permission to btrfs-progs and reviewed
the series, can I push it to devel branch now?
Thanks,
Qu
> ---
> mkfs/main.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/mkfs/main.c b/mkfs/main.c
> index a437ecc40c7f..3446a5b1222f 100644
> --- a/mkfs/main.c
> +++ b/mkfs/main.c
> @@ -1591,6 +1591,12 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
> min_dev_size = btrfs_min_dev_size(nodesize, mixed,
> opt_zoned ? zone_size(file) : 0,
> metadata_profile, data_profile);
> + if (byte_count) {
> + byte_count = round_down(byte_count, sectorsize);
> + if (opt_zoned)
> + byte_count = round_down(byte_count, zone_size(file));
> + }
> +
> /*
> * Enlarge the destination file or create a new one, using the size
> * calculated from source dir.
> @@ -1624,9 +1630,11 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
> * Or we will always use source_dir_size calculated for mkfs.
> */
> if (!byte_count)
> - byte_count = device_get_partition_size_fd_stat(fd, &statbuf);
> + byte_count = round_down(device_get_partition_size_fd_stat(fd, &statbuf),
> + sectorsize);
> source_dir_size = btrfs_mkfs_size_dir(source_dir, sectorsize,
> min_dev_size, metadata_profile, data_profile);
> + UASSERT(IS_ALIGNED(source_dir_size, sectorsize));
> if (byte_count < source_dir_size) {
> if (S_ISREG(statbuf.st_mode)) {
> byte_count = source_dir_size;
^ permalink raw reply [flat|nested] 15+ messages in thread
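The two-step alignment discussed in this message can be tried out with shell arithmetic. This is only a sketch of the rounding order (sectorsize first, then zone size on zoned devices). The helper names are hypothetical, and the modulo form is a generic stand-in chosen for readability; it is not a copy of the round_down() helper used in mkfs/main.c.

```bash
#!/bin/bash
# Hypothetical helpers for illustration; not part of btrfs-progs.

# Round $1 down to a multiple of $2 (generic modulo form).
round_down() {
	echo $(( $1 - $1 % $2 ))
}

# Align a requested byte count the way the patch describes:
# first to sectorsize, then to the zone size if the device is zoned.
# A zone size of 0 stands for "not zoned" in this sketch.
align_byte_count() {
	local byte_count="$1" sectorsize="$2" zone_size="$3"

	byte_count=$(round_down "$byte_count" "$sectorsize")
	if [ "$zone_size" -gt 0 ]; then
		byte_count=$(round_down "$byte_count" "$zone_size")
	fi
	echo "$byte_count"
}

# 126 MiB + 1000 bytes requested, 4 KiB sectors, 4 MiB zones.
align_byte_count $(( 126 * 1024 * 1024 + 1000 )) 4096 $(( 4 * 1024 * 1024 ))
```

The call prints 130023424, that is 124 MiB: the stray 1000 bytes are dropped by the sectorsize step, and the half zone at 124..126 MiB is dropped by the zone step.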
* Re: [PATCH v4 05/10] btrfs-progs: mkfs: align byte_count with sectorsize and zone size
2024-05-29 7:45 ` Qu Wenruo
@ 2024-05-30 17:26 ` David Sterba
0 siblings, 0 replies; 15+ messages in thread
From: David Sterba @ 2024-05-30 17:26 UTC (permalink / raw)
To: Qu Wenruo; +Cc: Naohiro Aota, linux-btrfs, David Sterba
On Wed, May 29, 2024 at 05:15:37PM +0930, Qu Wenruo wrote:
>
>
> On 2024/5/29 16:43, Naohiro Aota wrote:
> > While "byte_count" is eventually rounded down to sectorsize at make_btrfs()
> > or btrfs_add_to_fs_id(), it would be better round it down first and do the
> > size checks not to confuse the things.
> >
> > Also, on a zoned device, creating a btrfs whose size is not aligned to the
> > zone boundary can be confusing. Round it down further to the zone boundary.
> >
> > The size calculation with a source directory is also tweaked to be aligned.
> > device_get_partition_size_fd_stat() must be aligned down not to exceed the
> > device size. And, btrfs_mkfs_size_dir() should have return sectorsize aligned
> > size. So, add an UASSERT for it.
> >
> > Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
>
> Reviewed-by: Qu Wenruo <wqu@suse.com>
>
> To David, since I have the write permission to btrfs-progs and reviewed
> the series, can I push it to devel branch now?
OK, go on. I'm done for now with changes to fix the LE/BE and unaligned
access. In the future it would be better if you create a pull request and
mark that you want to merge it yourself; I could miss it in replies
to patches in the middle of a series.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 06/10] btrfs-progs: support byte length for zone resetting
2024-05-29 7:13 ` [PATCH v4 06/10] btrfs-progs: support byte length for zone resetting Naohiro Aota
@ 2024-06-03 19:26 ` David Sterba
0 siblings, 0 replies; 15+ messages in thread
From: David Sterba @ 2024-06-03 19:26 UTC (permalink / raw)
To: Naohiro Aota; +Cc: linux-btrfs, Qu Wenruo
On Wed, May 29, 2024 at 04:13:21PM +0900, Naohiro Aota wrote:
> Even with "mkfs.btrfs -b", mkfs.btrfs resets all the zones on the device.
> Limit the reset target within the specified length.
>
> Also, we need to check that there is no active zone outside of the FS
> range. Having an active zone outside FS reduces the number of zones btrfs
> can write simultaneously. Technically, we can still scan all the device
> zones and keep active zones outside FS intact and try to live with the
> limited active zones. But, that will make btrfs operations harder.
>
> It is generally bad idea to use "-b" on a non-test usage on a device with
> active zone limit in the first place. You really need to take care that FS
> and outside the FS goes over the limit. That means you'll never be able to
> use zones outside the FS anyway.
>
> So, until there is a strong request for that, I don't think it's worthwhile
> to do so.
>
> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
> Reviewed-by: Qu Wenruo <wqu@suse.com>
> ---
> common/device-utils.c | 17 ++++++++++++-----
> kernel-shared/zoned.c | 23 ++++++++++++++++++++++-
> kernel-shared/zoned.h | 7 ++++---
> 3 files changed, 38 insertions(+), 9 deletions(-)
>
> diff --git a/common/device-utils.c b/common/device-utils.c
> index 86942e0c7041..7df7d9ce39d8 100644
> --- a/common/device-utils.c
> +++ b/common/device-utils.c
> @@ -254,16 +254,23 @@ int btrfs_prepare_device(int fd, const char *file, u64 *byte_count_ret,
>
> if (!zinfo->emulated) {
> if (opflags & PREP_DEVICE_VERBOSE)
> - printf("Resetting device zones %s (%u zones) ...\n",
> - file, zinfo->nr_zones);
> + printf("Resetting device zones %s (%llu zones) ...\n",
> + file, byte_count / zinfo->zone_size);
> /*
> * We cannot ignore zone reset errors for a zoned block
> * device as this could result in the inability to write
> * to non-empty sequential zones of the device.
> */
> - if (btrfs_reset_all_zones(fd, zinfo)) {
> - error("zoned: failed to reset device '%s' zones: %m",
> - file);
> + ret = btrfs_reset_zones(fd, zinfo, byte_count);
> + if (ret) {
> + if (ret == EBUSY) {
> + error("zoned: device '%s' contains an active zone outside of the FS range",
> + file);
> + error("zoned: btrfs needs full control of active zones");
> + } else {
> + error("zoned: failed to reset device '%s' zones: %m",
> + file);
> + }
> goto err;
> }
> }
> diff --git a/kernel-shared/zoned.c b/kernel-shared/zoned.c
> index fb1e1388804e..b4244966ca36 100644
> --- a/kernel-shared/zoned.c
> +++ b/kernel-shared/zoned.c
> @@ -395,16 +395,24 @@ static int report_zones(int fd, const char *file,
> * Discard blocks in the zones of a zoned block device. Process this with zone
> * size granularity so that blocks in conventional zones are discarded using
> * discard_range and blocks in sequential zones are reset though a zone reset.
> + *
> + * We need to ensure that zones outside of the FS is not active, so that
> + * the FS can use all the active zones. Return EBUSY if there is an active
> + * zone.
> */
> -int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
> +int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count)
> {
> unsigned int i;
> int ret = 0;
>
> ASSERT(zinfo);
> + ASSERT(IS_ALIGNED(byte_count, zinfo->zone_size));
>
> /* Zone size granularity */
> for (i = 0; i < zinfo->nr_zones; i++) {
> + if (byte_count == 0)
> + break;
> +
> if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL) {
> ret = device_discard_blocks(fd,
> zinfo->zones[i].start << SECTOR_SHIFT,
> @@ -419,7 +427,20 @@ int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
>
> if (ret)
> return ret;
> +
> + byte_count -= zinfo->zone_size;
> }
> + for (; i < zinfo->nr_zones; i++) {
> + const enum blk_zone_cond cond = zinfo->zones[i].cond;
> +
> + if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL)
> + continue;
> + if (cond == BLK_ZONE_COND_IMP_OPEN ||
> + cond == BLK_ZONE_COND_EXP_OPEN ||
> + cond == BLK_ZONE_COND_CLOSED)
> + return EBUSY;
Should this return -EBUSY? It should not matter for this case but by
convention it would be better to use only negative errnos. I found
another one that's in the same call chain that still returns plain
errno, discard_range(). This should be fixed, possibly separately, so
I'll keep your patch as is.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
` (9 preceding siblings ...)
2024-05-29 7:13 ` [PATCH v4 10/10] btrfs-progs: test: use nullb helpers in 031-zoned-bgt Naohiro Aota
@ 2024-06-03 19:36 ` David Sterba
10 siblings, 0 replies; 15+ messages in thread
From: David Sterba @ 2024-06-03 19:36 UTC (permalink / raw)
To: Naohiro Aota; +Cc: linux-btrfs
On Wed, May 29, 2024 at 04:13:15PM +0900, Naohiro Aota wrote:
> mkfs.btrfs -b <byte_count> on a zoned device has several issues listed
> below.
>
> - The FS size needs to be larger than minimal size that can host a btrfs,
> but its calculation does not consider non-SINGLE profile
> - The calculation also does not ensure tree-log BG and data relocation BG
> - It allows creating a FS not aligned to the zone boundary
> - It resets all device zones beyond the specified length
>
> This series fixes the issues with some cleanups.
>
> This one passed CI workflow here:
>
> Patches 1 to 3 are clean up patches, so they should not change the behavior.
>
> Patches 4 to 6 address the issues.
>
> Patches 7 to 10 add/modify the test cases. First, patch 7 adds nullb
> functions to use in later patches. Patch 8 adds a new test for
> zone resetting. And, patches 9 and 10 rewrites existing tests with the
> nullb helper.
>
> Changes:
> - v4:
> - Fix source directory size alignment.
> - v3: https://lore.kernel.org/linux-btrfs/dfd8887b-a2cb-425f-8705-0d6a94cefb9c@gmx.com/
> - Tweak minimum FS size calculation style.
> - Round down the specified byte_count towards sectorsize and zone
> size, instead of banning unaligned byte_count.
> - Add active zone description in the commit log of patch 6.
> - Add nullb test functions and use them in tests.
> - v2: https://lore.kernel.org/linux-btrfs/20240514182227.1197664-1-naohiro.aota@wdc.com/
> - fix function declaration on older distro (non-ZONED setup)
> - fix mkfs test failure
> - v1: https://lore.kernel.org/linux-btrfs/20240514005133.44786-1-naohiro.aota@wdc.com/
>
> Naohiro Aota (10):
> btrfs-progs: rename block_count to byte_count
> btrfs-progs: mkfs: remove duplicated device size check
> btrfs-progs: mkfs: unify zoned mode minimum size calc into
> btrfs_min_dev_size()
> btrfs-progs: mkfs: fix minimum size calculation for zoned mode
> btrfs-progs: mkfs: align byte_count with sectorsize and zone size
> btrfs-progs: support byte length for zone resetting
> btrfs-progs: test: add nullb setup functions
> btrfs-progs: test: add test for zone resetting
> btrfs-progs: test: use nullb helper and smaller zone size
> btrfs-progs: test: use nullb helpers in 031-zoned-bgt
Added to devel, thanks.
One thing that may be worth adding to the documentation is the behaviour
regarding active zones outside of the fs range. The error messages
are clear if this happens, but for troubleshooting it may be useful to say
what to do or check.
^ permalink raw reply [flat|nested] 15+ messages in thread
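As a starting point for the troubleshooting note suggested above, here is a hedged sketch of how one might look for active zones beyond the intended FS range. It filters "blkzone report" output for zone conditions 2, 3 and 4 (implicitly open, explicitly open, closed), which are the conditions patch 06 treats as active. The function name is hypothetical, and the sample report lines are written from memory of the util-linux output format, so treat the exact layout as an assumption.

```bash
#!/bin/bash
# Hypothetical helper: reads "blkzone report" output on stdin and prints
# only zones whose condition is IMP_OPEN (2), EXP_OPEN (3) or CLOSED (4).
# Returns non-zero when no such zone is found (grep's exit status).
filter_active_zones() {
	grep -E 'zcond: *(2|3|4)\('
}

# Illustrative sample input; the layout is an assumption, not captured output.
sample_report() {
	cat <<'EOF'
  start: 0x00003c000, len 0x002000, cap 0x002000, wptr 0x000000 reset:0 non-seq:0, zcond: 1(em) [type: 2(SEQ_WRITE_REQUIRED)]
  start: 0x00003e000, len 0x002000, cap 0x002000, wptr 0x000800 reset:0 non-seq:0, zcond: 2(oi) [type: 2(SEQ_WRITE_REQUIRED)]
EOF
}

if sample_report | filter_active_zones; then
	echo "active zone(s) found outside of the FS range"
else
	echo "no active zone outside of the FS range"
fi

# On a real device, restrict the report to the area past the FS, e.g.:
#   fs_bytes=$(( 4 * 31 * 1024 * 1024 ))
#   blkzone report -o $(( fs_bytes / 512 )) /dev/nullb0 | filter_active_zones
```

If such a zone is reported, it has to stop being active before mkfs.btrfs -b will accept the device, for example by resetting it with "blkzone reset" (which discards its data) or by filling it completely, as test 032 does with dd.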
end of thread, other threads:[~2024-06-03 19:37 UTC | newest]
Thread overview: 15+ messages
2024-05-29 7:13 [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 01/10] btrfs-progs: rename block_count to byte_count Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 02/10] btrfs-progs: mkfs: remove duplicated device size check Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 03/10] btrfs-progs: mkfs: unify zoned mode minimum size calc into btrfs_min_dev_size() Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 04/10] btrfs-progs: mkfs: fix minimum size calculation for zoned mode Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 05/10] btrfs-progs: mkfs: align byte_count with sectorsize and zone size Naohiro Aota
2024-05-29 7:45 ` Qu Wenruo
2024-05-30 17:26 ` David Sterba
2024-05-29 7:13 ` [PATCH v4 06/10] btrfs-progs: support byte length for zone resetting Naohiro Aota
2024-06-03 19:26 ` David Sterba
2024-05-29 7:13 ` [PATCH v4 07/10] btrfs-progs: test: add nullb setup functions Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 08/10] btrfs-progs: test: add test for zone resetting Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 09/10] btrfs-progs: test: use nullb helper and smaller zone size Naohiro Aota
2024-05-29 7:13 ` [PATCH v4 10/10] btrfs-progs: test: use nullb helpers in 031-zoned-bgt Naohiro Aota
2024-06-03 19:36 ` [PATCH v4 00/10] btrfs-progs: zoned: proper "mkfs.btrfs -b" support David Sterba