Linux Btrfs filesystem development
* [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support
@ 2024-05-14  0:51 Naohiro Aota
  2024-05-14  0:51 ` [PATCH 1/7] btrfs-progs: rename block_count to byte_count Naohiro Aota
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

mkfs.btrfs -b <byte_count> on a zoned device has the following issues:

- The FS size must be at least the minimum size that can host a btrfs,
  but that calculation does not consider non-SINGLE profiles
- The calculation also does not account for the tree-log BG and the data
  relocation BG
- It allows creating a FS that is not aligned to the zone boundary
- It resets all device zones, including those beyond the specified length

This series fixes these issues and includes some cleanups.

Patches 1 to 3 are cleanup patches, so they should not change behavior.

Patches 4 to 6 address the issues and the last patch adds a test case.

Naohiro Aota (7):
  btrfs-progs: rename block_count to byte_count
  btrfs-progs: mkfs: remove duplicated device size check
  btrfs-progs: mkfs: unify zoned mode minimum size calc into
    btrfs_min_dev_size()
  btrfs-progs: mkfs: fix minimum size calculation for zoned
  btrfs-progs: mkfs: check if byte_count is zone size aligned
  btrfs-progs: support byte length for zone resetting
  btrfs-progs: add test for zone resetting

 common/device-utils.c                    | 45 +++++++------
 kernel-shared/zoned.c                    | 23 ++++++-
 kernel-shared/zoned.h                    |  2 +-
 mkfs/common.c                            | 48 +++++++++++++-
 mkfs/common.h                            |  2 +-
 mkfs/main.c                              | 82 ++++++++++--------------
 tests/mkfs-tests/032-zoned-reset/test.sh | 62 ++++++++++++++++++
 7 files changed, 192 insertions(+), 72 deletions(-)
 create mode 100755 tests/mkfs-tests/032-zoned-reset/test.sh

--
2.45.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/7] btrfs-progs: rename block_count to byte_count
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
@ 2024-05-14  0:51 ` Naohiro Aota
  2024-05-14  0:51 ` [PATCH 2/7] btrfs-progs: mkfs: remove duplicated device size check Naohiro Aota
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

block_count and dev_block_count hold sizes in bytes, so comparing them
with e.g. "min_dev_size" is confusing. Rename them to represent the unit
better.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 common/device-utils.c | 28 +++++++++++-----------
 mkfs/main.c           | 56 +++++++++++++++++++++----------------------
 2 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/common/device-utils.c b/common/device-utils.c
index d086e9ea2564..86942e0c7041 100644
--- a/common/device-utils.c
+++ b/common/device-utils.c
@@ -222,11 +222,11 @@ out:
  * - reset zones
  * - delete end of the device
  */
-int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
-		u64 max_block_count, unsigned opflags)
+int btrfs_prepare_device(int fd, const char *file, u64 *byte_count_ret,
+		u64 max_byte_count, unsigned opflags)
 {
 	struct btrfs_zoned_device_info *zinfo = NULL;
-	u64 block_count;
+	u64 byte_count;
 	struct stat st;
 	int i, ret;
 
@@ -236,13 +236,13 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
 		return 1;
 	}
 
-	block_count = device_get_partition_size_fd_stat(fd, &st);
-	if (block_count == 0) {
+	byte_count = device_get_partition_size_fd_stat(fd, &st);
+	if (byte_count == 0) {
 		error("unable to determine size of %s", file);
 		return 1;
 	}
-	if (max_block_count)
-		block_count = min(block_count, max_block_count);
+	if (max_byte_count)
+		byte_count = min(byte_count, max_byte_count);
 
 	if (opflags & PREP_DEVICE_ZONED) {
 		ret = btrfs_get_zone_info(fd, file, &zinfo);
@@ -276,18 +276,18 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
 		if (discard_supported(file)) {
 			if (opflags & PREP_DEVICE_VERBOSE)
 				printf("Performing full device TRIM %s (%s) ...\n",
-						file, pretty_size(block_count));
-			device_discard_blocks(fd, 0, block_count);
+						file, pretty_size(byte_count));
+			device_discard_blocks(fd, 0, byte_count);
 		}
 	}
 
-	ret = zero_dev_clamped(fd, zinfo, 0, ZERO_DEV_BYTES, block_count);
+	ret = zero_dev_clamped(fd, zinfo, 0, ZERO_DEV_BYTES, byte_count);
 	for (i = 0 ; !ret && i < BTRFS_SUPER_MIRROR_MAX; i++)
 		ret = zero_dev_clamped(fd, zinfo, btrfs_sb_offset(i),
-				       BTRFS_SUPER_INFO_SIZE, block_count);
+				       BTRFS_SUPER_INFO_SIZE, byte_count);
 	if (!ret && (opflags & PREP_DEVICE_ZERO_END))
-		ret = zero_dev_clamped(fd, zinfo, block_count - ZERO_DEV_BYTES,
-				       ZERO_DEV_BYTES, block_count);
+		ret = zero_dev_clamped(fd, zinfo, byte_count - ZERO_DEV_BYTES,
+				       ZERO_DEV_BYTES, byte_count);
 
 	if (ret < 0) {
 		errno = -ret;
@@ -302,7 +302,7 @@ int btrfs_prepare_device(int fd, const char *file, u64 *block_count_ret,
 	}
 
 	free(zinfo);
-	*block_count_ret = block_count;
+	*byte_count_ret = byte_count;
 	return 0;
 
 err:
diff --git a/mkfs/main.c b/mkfs/main.c
index a467795d4428..950f76101058 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -80,8 +80,8 @@ static int opt_oflags = O_RDWR;
 struct prepare_device_progress {
 	int fd;
 	char *file;
-	u64 dev_block_count;
-	u64 block_count;
+	u64 dev_byte_count;
+	u64 byte_count;
 	int ret;
 };
 
@@ -1159,8 +1159,8 @@ static void *prepare_one_device(void *ctx)
 	}
 	prepare_ctx->ret = btrfs_prepare_device(prepare_ctx->fd,
 				prepare_ctx->file,
-				&prepare_ctx->dev_block_count,
-				prepare_ctx->block_count,
+				&prepare_ctx->dev_byte_count,
+				prepare_ctx->byte_count,
 				(bconf.verbose ? PREP_DEVICE_VERBOSE : 0) |
 				(opt_zero_end ? PREP_DEVICE_ZERO_END : 0) |
 				(opt_discard ? PREP_DEVICE_DISCARD : 0) |
@@ -1204,8 +1204,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	bool metadata_profile_set = false;
 	u64 data_profile = 0;
 	bool data_profile_set = false;
-	u64 block_count = 0;
-	u64 dev_block_count = 0;
+	u64 byte_count = 0;
+	u64 dev_byte_count = 0;
 	bool mixed = false;
 	char *label = NULL;
 	int nr_global_roots = sysconf(_SC_NPROCESSORS_ONLN);
@@ -1347,7 +1347,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 				sectorsize = arg_strtou64_with_suffix(optarg);
 				break;
 			case 'b':
-				block_count = arg_strtou64_with_suffix(optarg);
+				byte_count = arg_strtou64_with_suffix(optarg);
 				opt_zero_end = false;
 				break;
 			case 'v':
@@ -1623,34 +1623,34 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		 * Block_count not specified, use file/device size first.
 		 * Or we will always use source_dir_size calculated for mkfs.
 		 */
-		if (!block_count)
-			block_count = device_get_partition_size_fd_stat(fd, &statbuf);
+		if (!byte_count)
+			byte_count = device_get_partition_size_fd_stat(fd, &statbuf);
 		source_dir_size = btrfs_mkfs_size_dir(source_dir, sectorsize,
 				min_dev_size, metadata_profile, data_profile);
-		if (block_count < source_dir_size) {
+		if (byte_count < source_dir_size) {
 			if (S_ISREG(statbuf.st_mode)) {
-				block_count = source_dir_size;
+				byte_count = source_dir_size;
 			} else {
 				warning(
 "the target device %llu (%s) is smaller than the calculated source directory size %llu (%s), mkfs may fail",
-					block_count, pretty_size(block_count),
+					byte_count, pretty_size(byte_count),
 					source_dir_size, pretty_size(source_dir_size));
 			}
 		}
-		ret = zero_output_file(fd, block_count);
+		ret = zero_output_file(fd, byte_count);
 		if (ret) {
 			error("unable to zero the output file");
 			close(fd);
 			goto error;
 		}
 		/* our "device" is the new image file */
-		dev_block_count = block_count;
+		dev_byte_count = byte_count;
 		close(fd);
 	}
-	/* Check device/block_count after the nodesize is determined */
-	if (block_count && block_count < min_dev_size) {
+	/* Check device/byte_count after the nodesize is determined */
+	if (byte_count && byte_count < min_dev_size) {
 		error("size %llu is too small to make a usable filesystem",
-			block_count);
+			byte_count);
 		error("minimum size for btrfs filesystem is %llu",
 			min_dev_size);
 		goto error;
@@ -1661,9 +1661,9 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	 * 1 zone for a metadata block group
 	 * 1 zone for a data block group
 	 */
-	if (opt_zoned && block_count && block_count < 5 * zone_size(file)) {
+	if (opt_zoned && byte_count && byte_count < 5 * zone_size(file)) {
 		error("size %llu is too small to make a usable filesystem",
-			block_count);
+			byte_count);
 		error("minimum size for a zoned btrfs filesystem is %llu",
 			min_dev_size);
 		goto error;
@@ -1741,8 +1741,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	/* Start threads */
 	for (i = 0; i < device_count; i++) {
 		prepare_ctx[i].file = argv[optind + i - 1];
-		prepare_ctx[i].block_count = block_count;
-		prepare_ctx[i].dev_block_count = block_count;
+		prepare_ctx[i].byte_count = byte_count;
+		prepare_ctx[i].dev_byte_count = byte_count;
 		ret = pthread_create(&t_prepare[i], NULL, prepare_one_device,
 				     &prepare_ctx[i]);
 		if (ret) {
@@ -1763,16 +1763,16 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		goto error;
 	}
 
-	dev_block_count = prepare_ctx[0].dev_block_count;
-	if (block_count && block_count > dev_block_count) {
+	dev_byte_count = prepare_ctx[0].dev_byte_count;
+	if (byte_count && byte_count > dev_byte_count) {
 		error("%s is smaller than requested size, expected %llu, found %llu",
-		      file, block_count, dev_block_count);
+		      file, byte_count, dev_byte_count);
 		goto error;
 	}
 
 	/* To create the first block group and chunk 0 in make_btrfs */
 	system_group_size = (opt_zoned ? zone_size(file) : BTRFS_MKFS_SYSTEM_GROUP_SIZE);
-	if (dev_block_count < system_group_size) {
+	if (dev_byte_count < system_group_size) {
 		error("device is too small to make filesystem, must be at least %llu",
 				system_group_size);
 		goto error;
@@ -1794,7 +1794,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	mkfs_cfg.label = label;
 	memcpy(mkfs_cfg.fs_uuid, fs_uuid, sizeof(mkfs_cfg.fs_uuid));
 	memcpy(mkfs_cfg.dev_uuid, dev_uuid, sizeof(mkfs_cfg.dev_uuid));
-	mkfs_cfg.num_bytes = dev_block_count;
+	mkfs_cfg.num_bytes = dev_byte_count;
 	mkfs_cfg.nodesize = nodesize;
 	mkfs_cfg.sectorsize = sectorsize;
 	mkfs_cfg.stripesize = stripesize;
@@ -1889,7 +1889,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 				file);
 			continue;
 		}
-		dev_block_count = prepare_ctx[i].dev_block_count;
+		dev_byte_count = prepare_ctx[i].dev_byte_count;
 
 		if (prepare_ctx[i].ret) {
 			errno = -prepare_ctx[i].ret;
@@ -1898,7 +1898,7 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		}
 
 		ret = btrfs_add_to_fsid(trans, root, prepare_ctx[i].fd,
-					prepare_ctx[i].file, dev_block_count,
+					prepare_ctx[i].file, dev_byte_count,
 					sectorsize, sectorsize, sectorsize);
 		if (ret) {
 			error("unable to add %s to filesystem: %d",
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 2/7] btrfs-progs: mkfs: remove duplicated device size check
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
  2024-05-14  0:51 ` [PATCH 1/7] btrfs-progs: rename block_count to byte_count Naohiro Aota
@ 2024-05-14  0:51 ` Naohiro Aota
  2024-05-14  0:51 ` [PATCH 3/7] btrfs-progs: mkfs: unify zoned mode minimum size calc into btrfs_min_dev_size() Naohiro Aota
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

test_minimum_size() already checks if each device can host the initial
block groups. There is no need to check if the first device can host the
initial system chunk again.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 mkfs/main.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/mkfs/main.c b/mkfs/main.c
index 950f76101058..f6f67abf3b0e 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1189,7 +1189,6 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	struct prepare_device_progress *prepare_ctx = NULL;
 	struct mkfs_allocation allocation = { 0 };
 	struct btrfs_mkfs_config mkfs_cfg;
-	u64 system_group_size;
 	/* Options */
 	bool force_overwrite = false;
 	struct btrfs_mkfs_features features = btrfs_mkfs_default_features;
@@ -1770,14 +1769,6 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		goto error;
 	}
 
-	/* To create the first block group and chunk 0 in make_btrfs */
-	system_group_size = (opt_zoned ? zone_size(file) : BTRFS_MKFS_SYSTEM_GROUP_SIZE);
-	if (dev_byte_count < system_group_size) {
-		error("device is too small to make filesystem, must be at least %llu",
-				system_group_size);
-		goto error;
-	}
-
 	if (btrfs_bg_type_to_tolerated_failures(metadata_profile) <
 	    btrfs_bg_type_to_tolerated_failures(data_profile))
 		warning("metadata has lower redundancy than data!\n");
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/7] btrfs-progs: mkfs: unify zoned mode minimum size calc into btrfs_min_dev_size()
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
  2024-05-14  0:51 ` [PATCH 1/7] btrfs-progs: rename block_count to byte_count Naohiro Aota
  2024-05-14  0:51 ` [PATCH 2/7] btrfs-progs: mkfs: remove duplicated device size check Naohiro Aota
@ 2024-05-14  0:51 ` Naohiro Aota
  2024-05-14  0:51 ` [PATCH 4/7] btrfs-progs: mkfs: fix minimum size calculation for zoned Naohiro Aota
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

We are going to implement a better minimum size calculation for the zoned
mode. Move the current logic to btrfs_min_dev_size() and unify the size
checking path.

Also, convert "int mixed" to "bool mixed" while at it.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 mkfs/common.c | 11 ++++++++++-
 mkfs/common.h |  2 +-
 mkfs/main.c   | 22 +++++-----------------
 3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/mkfs/common.c b/mkfs/common.c
index 3c48a6c120e7..af54089654a0 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -811,13 +811,22 @@ static u64 btrfs_min_global_blk_rsv_size(u32 nodesize)
 	return (u64)nodesize << 10;
 }
 
-u64 btrfs_min_dev_size(u32 nodesize, int mixed, u64 meta_profile,
+u64 btrfs_min_dev_size(u32 nodesize, bool mixed, u64 zone_size, u64 meta_profile,
 		       u64 data_profile)
 {
 	u64 reserved = 0;
 	u64 meta_size;
 	u64 data_size;
 
+	/*
+	 * 2 zones for the primary superblock
+	 * 1 zone for the system block group
+	 * 1 zone for a metadata block group
+	 * 1 zone for a data block group
+	 */
+	if (zone_size)
+		return 5 * zone_size;
+
 	if (mixed)
 		return 2 * (BTRFS_MKFS_SYSTEM_GROUP_SIZE +
 			    btrfs_min_global_blk_rsv_size(nodesize));
diff --git a/mkfs/common.h b/mkfs/common.h
index d9183c997bb2..de0ff57beee8 100644
--- a/mkfs/common.h
+++ b/mkfs/common.h
@@ -105,7 +105,7 @@ struct btrfs_mkfs_config {
 int make_btrfs(int fd, struct btrfs_mkfs_config *cfg);
 int btrfs_make_root_dir(struct btrfs_trans_handle *trans,
 			struct btrfs_root *root, u64 objectid);
-u64 btrfs_min_dev_size(u32 nodesize, int mixed, u64 meta_profile,
+u64 btrfs_min_dev_size(u32 nodesize, bool mixed, u64 zone_size, u64 meta_profile,
 		       u64 data_profile);
 int test_minimum_size(const char *file, u64 min_dev_size);
 int is_vol_small(const char *file);
diff --git a/mkfs/main.c b/mkfs/main.c
index f6f67abf3b0e..a437ecc40c7f 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1588,8 +1588,9 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		goto error;
 	}
 
-	min_dev_size = btrfs_min_dev_size(nodesize, mixed, metadata_profile,
-					  data_profile);
+	min_dev_size = btrfs_min_dev_size(nodesize, mixed,
+					  opt_zoned ? zone_size(file) : 0,
+					  metadata_profile, data_profile);
 	/*
 	 * Enlarge the destination file or create a new one, using the size
 	 * calculated from source dir.
@@ -1650,21 +1651,8 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 	if (byte_count && byte_count < min_dev_size) {
 		error("size %llu is too small to make a usable filesystem",
 			byte_count);
-		error("minimum size for btrfs filesystem is %llu",
-			min_dev_size);
-		goto error;
-	}
-	/*
-	 * 2 zones for the primary superblock
-	 * 1 zone for the system block group
-	 * 1 zone for a metadata block group
-	 * 1 zone for a data block group
-	 */
-	if (opt_zoned && byte_count && byte_count < 5 * zone_size(file)) {
-		error("size %llu is too small to make a usable filesystem",
-			byte_count);
-		error("minimum size for a zoned btrfs filesystem is %llu",
-			min_dev_size);
+		error("minimum size for a %sbtrfs filesystem is %llu",
+		      opt_zoned ? "zoned mode " : "", min_dev_size);
 		goto error;
 	}
 
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/7] btrfs-progs: mkfs: fix minimum size calculation for zoned
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
                   ` (2 preceding siblings ...)
  2024-05-14  0:51 ` [PATCH 3/7] btrfs-progs: mkfs: unify zoned mode minimum size calc into btrfs_min_dev_size() Naohiro Aota
@ 2024-05-14  0:51 ` Naohiro Aota
  2024-05-14  0:51 ` [PATCH 5/7] btrfs-progs: mkfs: check if byte_count is zone size aligned Naohiro Aota
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

Currently, we check whether a device has at least 5 zones to decide if we
can create a btrfs on it. Actually, we need more zones to create DUP block
groups, so mkfs fails with "ERROR: not enough free space to allocate
chunk". Implement proper support for non-SINGLE profiles.

Also, the current code does not ensure that we can create the tree-log BG
and the data relocation BG, which are essential for real usage. Count them
as requirements too.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 mkfs/common.c | 53 +++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 45 insertions(+), 8 deletions(-)

diff --git a/mkfs/common.c b/mkfs/common.c
index af54089654a0..a5100b296f65 100644
--- a/mkfs/common.c
+++ b/mkfs/common.c
@@ -818,14 +818,51 @@ u64 btrfs_min_dev_size(u32 nodesize, bool mixed, u64 zone_size, u64 meta_profile
 	u64 meta_size;
 	u64 data_size;

-	/*
-	 * 2 zones for the primary superblock
-	 * 1 zone for the system block group
-	 * 1 zone for a metadata block group
-	 * 1 zone for a data block group
-	 */
-	if (zone_size)
-		return 5 * zone_size;
+	if (zone_size) {
+		/* 2 zones for the primary superblock. */
+		reserved += 2 * zone_size;
+
+		/*
+		 * 1 zone each for the initial system, metadata, and data block
+		 * group
+		 */
+		reserved += 3 * zone_size;
+
+		/*
+		 * non-SINGLE profile needs:
+		 * 1 zone for system block group
+		 * 1 zone for normal metadata block group
+		 * 1 zone for tree-log block group
+		 *
+		 * SINGLE profile only needs to add the tree-log block group
+		 */
+		if (meta_profile & BTRFS_BLOCK_GROUP_PROFILE_MASK)
+			meta_size = 3 * zone_size;
+		else
+			meta_size = zone_size;
+		/* DUP profile needs two zones for each block group. */
+		if (meta_profile & BTRFS_BLOCK_GROUP_DUP)
+			meta_size *= 2;
+		reserved += meta_size;
+
+		/*
+		 * non-SINGLE profile needs:
+		 * 1 zone for data block group
+		 * 1 zone for data relocation block group
+		 *
+		 * SINGLE profile only needs the data relocation block group
+		 */
+		if (data_profile & BTRFS_BLOCK_GROUP_PROFILE_MASK)
+			data_size = 2 * zone_size;
+		else
+			data_size = zone_size;
+		/* DUP profile needs two zones for each block group. */
+		if (data_profile & BTRFS_BLOCK_GROUP_DUP)
+			data_size *= 2;
+		reserved += data_size;
+
+		return reserved;
+	}

 	if (mixed)
 		return 2 * (BTRFS_MKFS_SYSTEM_GROUP_SIZE +
--
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread
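
As a rough illustration of the accounting in patch 4/7, the sketch below
counts the minimum number of zones per profile. It is not the btrfs-progs
code: the "profile" enum and the min_zones() / min_dev_bytes() helpers are
hypothetical stand-ins for the BTRFS_BLOCK_GROUP_* flags and
btrfs_min_dev_size(), and it only assumes the zone counts stated in the
patch comments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the btrfs block group profile flags. */
enum profile { PROFILE_SINGLE, PROFILE_DUP, PROFILE_OTHER };

/* Minimum number of zones, following the accounting described in patch 4/7. */
static uint64_t min_zones(enum profile meta, enum profile data)
{
	/* 2 zones for the primary superblock, 3 for the initial block groups. */
	uint64_t zones = 2 + 3;
	/* Non-SINGLE: system + metadata + tree-log; SINGLE: tree-log only. */
	uint64_t meta_zones = (meta == PROFILE_SINGLE) ? 1 : 3;
	/* Non-SINGLE: data + data relocation; SINGLE: data relocation only. */
	uint64_t data_zones = (data == PROFILE_SINGLE) ? 1 : 2;

	/* DUP needs two zones for each block group. */
	if (meta == PROFILE_DUP)
		meta_zones *= 2;
	if (data == PROFILE_DUP)
		data_zones *= 2;

	return zones + meta_zones + data_zones;
}

/* Minimum device size in bytes for a given zone size. */
static uint64_t min_dev_bytes(uint64_t zone_size, enum profile meta,
			      enum profile data)
{
	assert(zone_size != 0);
	return min_zones(meta, data) * zone_size;
}
```

Under these assumptions, SINGLE metadata and data need 7 zones (28MiB with
4MiB zones), while DUP metadata with SINGLE data needs 12 zones.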

* [PATCH 5/7] btrfs-progs: mkfs: check if byte_count is zone size aligned
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
                   ` (3 preceding siblings ...)
  2024-05-14  0:51 ` [PATCH 4/7] btrfs-progs: mkfs: fix minimum size calculation for zoned Naohiro Aota
@ 2024-05-14  0:51 ` Naohiro Aota
  2024-05-14  0:51 ` [PATCH 6/7] btrfs-progs: support byte length for zone resetting Naohiro Aota
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

Creating a btrfs whose size is not aligned to the zone boundary is
meaningless and allowing it can confuse users. Disallow creating it.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 mkfs/main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mkfs/main.c b/mkfs/main.c
index a437ecc40c7f..faf397848cc4 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -1655,6 +1655,11 @@ int BOX_MAIN(mkfs)(int argc, char **argv)
 		      opt_zoned ? "zoned mode " : "", min_dev_size);
 		goto error;
 	}
+	if (byte_count && opt_zoned && !IS_ALIGNED(byte_count, zone_size(file))) {
+		error("size %llu is not aligned to zone size %llu", byte_count,
+		      zone_size(file));
+		goto error;
+	}
 
 	for (i = saved_optind; i < saved_optind + device_count; i++) {
 		char *path;
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread
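
The alignment rule from patch 5/7 can be sketched as below. This is only an
illustration: size_is_zone_aligned() and round_down_to_zone() are
hypothetical helpers, not btrfs-progs functions, and the sketch uses a plain
modulo instead of the IS_ALIGNED() macro from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if byte_count covers a whole number of zones. */
static bool size_is_zone_aligned(uint64_t byte_count, uint64_t zone_size)
{
	assert(zone_size != 0);
	return byte_count % zone_size == 0;
}

/* Hypothetical helper: largest zone-aligned size not exceeding byte_count. */
static uint64_t round_down_to_zone(uint64_t byte_count, uint64_t zone_size)
{
	assert(zone_size != 0);
	return byte_count - (byte_count % zone_size);
}
```

With 4MiB zones, a 124MiB size passes the check while 125MiB does not; a
caller could use round_down_to_zone() to suggest the nearest valid value to
pass to "-b".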

* [PATCH 6/7] btrfs-progs: support byte length for zone resetting
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
                   ` (4 preceding siblings ...)
  2024-05-14  0:51 ` [PATCH 5/7] btrfs-progs: mkfs: check if byte_count is zone size aligned Naohiro Aota
@ 2024-05-14  0:51 ` Naohiro Aota
  2024-05-14  0:51 ` [PATCH 7/7] btrfs-progs: add test " Naohiro Aota
  2024-05-14 15:39 ` [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support David Sterba
  7 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

Even with "mkfs.btrfs -b", mkfs.btrfs resets all the zones on the device.
Limit the reset to the zones within the specified length.

Also, we need to check that there is no active zone outside of the FS
range. If there is one, btrfs cannot properly honor the active zone limit.

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 common/device-utils.c | 17 ++++++++++++-----
 kernel-shared/zoned.c | 23 ++++++++++++++++++++++-
 kernel-shared/zoned.h |  2 +-
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/common/device-utils.c b/common/device-utils.c
index 86942e0c7041..7df7d9ce39d8 100644
--- a/common/device-utils.c
+++ b/common/device-utils.c
@@ -254,16 +254,23 @@ int btrfs_prepare_device(int fd, const char *file, u64 *byte_count_ret,
 
 		if (!zinfo->emulated) {
 			if (opflags & PREP_DEVICE_VERBOSE)
-				printf("Resetting device zones %s (%u zones) ...\n",
-				       file, zinfo->nr_zones);
+				printf("Resetting device zones %s (%llu zones) ...\n",
+				       file, byte_count / zinfo->zone_size);
 			/*
 			 * We cannot ignore zone reset errors for a zoned block
 			 * device as this could result in the inability to write
 			 * to non-empty sequential zones of the device.
 			 */
-			if (btrfs_reset_all_zones(fd, zinfo)) {
-				error("zoned: failed to reset device '%s' zones: %m",
-				      file);
+			ret = btrfs_reset_zones(fd, zinfo, byte_count);
+			if (ret) {
+				if (ret == EBUSY) {
+					error("zoned: device '%s' contains an active zone outside of the FS range",
+					      file);
+					error("zoned: btrfs needs full control of active zones");
+				} else {
+					error("zoned: failed to reset device '%s' zones: %m",
+					      file);
+				}
 				goto err;
 			}
 		}
diff --git a/kernel-shared/zoned.c b/kernel-shared/zoned.c
index fb1e1388804e..b4244966ca36 100644
--- a/kernel-shared/zoned.c
+++ b/kernel-shared/zoned.c
@@ -395,16 +395,24 @@ static int report_zones(int fd, const char *file,
  * Discard blocks in the zones of a zoned block device. Process this with zone
  * size granularity so that blocks in conventional zones are discarded using
  * discard_range and blocks in sequential zones are reset though a zone reset.
+ *
+ * We need to ensure that zones outside of the FS are not active, so that
+ * the FS can use all the active zones. Return EBUSY if there is an active
+ * zone.
  */
-int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
+int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count)
 {
 	unsigned int i;
 	int ret = 0;
 
 	ASSERT(zinfo);
+	ASSERT(IS_ALIGNED(byte_count, zinfo->zone_size));
 
 	/* Zone size granularity */
 	for (i = 0; i < zinfo->nr_zones; i++) {
+		if (byte_count == 0)
+			break;
+
 		if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL) {
 			ret = device_discard_blocks(fd,
 					     zinfo->zones[i].start << SECTOR_SHIFT,
@@ -419,7 +427,20 @@ int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo)
 
 		if (ret)
 			return ret;
+
+		byte_count -= zinfo->zone_size;
 	}
+	for (; i < zinfo->nr_zones; i++) {
+		const enum blk_zone_cond cond = zinfo->zones[i].cond;
+
+		if (zinfo->zones[i].type == BLK_ZONE_TYPE_CONVENTIONAL)
+			continue;
+		if (cond == BLK_ZONE_COND_IMP_OPEN ||
+		    cond == BLK_ZONE_COND_EXP_OPEN ||
+		    cond == BLK_ZONE_COND_CLOSED)
+			return EBUSY;
+	}
+
 	return fsync(fd);
 }
 
diff --git a/kernel-shared/zoned.h b/kernel-shared/zoned.h
index 6eba86d266bf..104fb7b19490 100644
--- a/kernel-shared/zoned.h
+++ b/kernel-shared/zoned.h
@@ -149,7 +149,7 @@ bool btrfs_redirty_extent_buffer_for_zoned(struct btrfs_fs_info *fs_info,
 					   u64 start, u64 end);
 int btrfs_reset_chunk_zones(struct btrfs_fs_info *fs_info, u64 devid,
 			    u64 offset, u64 length);
-int btrfs_reset_all_zones(int fd, struct btrfs_zoned_device_info *zinfo);
+int btrfs_reset_zones(int fd, struct btrfs_zoned_device_info *zinfo, u64 byte_count);
 int zero_zone_blocks(int fd, struct btrfs_zoned_device_info *zinfo, off_t start,
 		     size_t len);
 int btrfs_wipe_temporary_sb(struct btrfs_fs_devices *fs_devices);
-- 
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread
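
The range logic in patch 6/7 can be pictured with the simplified sketch
below. It does not issue any ioctl or discard; it only decides how many
leading zones fall inside the FS range and whether any sequential zone
beyond it is still active. The "zone" struct, the "zone_cond" enum and
check_zones_outside_fs() are hypothetical; the real code works on
struct blk_zone and the BLK_ZONE_COND_* values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified zone conditions. */
enum zone_cond { ZONE_EMPTY, ZONE_IMP_OPEN, ZONE_EXP_OPEN, ZONE_CLOSED, ZONE_FULL };

/* Hypothetical, simplified zone descriptor. */
struct zone {
	enum zone_cond cond;
	int conventional;	/* non-zero for conventional zones */
};

/*
 * Store in *nr_reset the number of leading zones covered by byte_count.
 * Return EBUSY if a sequential zone beyond that range is open or closed
 * (that is, active), 0 otherwise.
 */
static int check_zones_outside_fs(const struct zone *zones, size_t nr_zones,
				  uint64_t zone_size, uint64_t byte_count,
				  size_t *nr_reset)
{
	uint64_t in_range;
	size_t i;

	assert(zone_size != 0 && byte_count % zone_size == 0);

	in_range = byte_count / zone_size;
	if (in_range > nr_zones)
		in_range = nr_zones;
	*nr_reset = (size_t)in_range;

	for (i = (size_t)in_range; i < nr_zones; i++) {
		if (zones[i].conventional)
			continue;
		if (zones[i].cond == ZONE_IMP_OPEN ||
		    zones[i].cond == ZONE_EXP_OPEN ||
		    zones[i].cond == ZONE_CLOSED)
			return EBUSY;
	}
	return 0;
}
```

A partially written zone past the FS range (implicitly open) makes the check
fail, while a full or empty one does not, which matches the two cases that
the test in patch 7/7 exercises.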

* [PATCH 7/7] btrfs-progs: add test for zone resetting
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
                   ` (5 preceding siblings ...)
  2024-05-14  0:51 ` [PATCH 6/7] btrfs-progs: support byte length for zone resetting Naohiro Aota
@ 2024-05-14  0:51 ` Naohiro Aota
  2024-05-14 15:39 ` [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support David Sterba
  7 siblings, 0 replies; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14  0:51 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Naohiro Aota

Add a test for mkfs.btrfs's zone reset behavior to check that:

- it resets all the zones without the "-b" option
- it detects an active zone outside of the FS range
- it does not reset a zone outside of the range

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
---
 tests/mkfs-tests/032-zoned-reset/test.sh | 62 ++++++++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100755 tests/mkfs-tests/032-zoned-reset/test.sh

diff --git a/tests/mkfs-tests/032-zoned-reset/test.sh b/tests/mkfs-tests/032-zoned-reset/test.sh
new file mode 100755
index 000000000000..6a599dd2874f
--- /dev/null
+++ b/tests/mkfs-tests/032-zoned-reset/test.sh
@@ -0,0 +1,62 @@
+#!/bin/bash
+# Verify mkfs.btrfs zone reset behavior with and without the -b option
+
+source "$TEST_TOP/common" || exit
+
+setup_root_helper
+prepare_test_dev
+
+nullb="$TEST_TOP/nullb"
+# Create one 128M device with 4M zones, 32 of them
+size=128
+zone=4
+
+run_mayfail $SUDO_HELPER "$nullb" setup
+if [ $? != 0 ]; then
+	_not_run "cannot setup nullb environment for zoned devices"
+fi
+
+# Record any other pre-existing devices in case creation fails
+run_check $SUDO_HELPER "$nullb" ls
+
+# Last line has the name of the device node path
+out=$(run_check_stdout $SUDO_HELPER "$nullb" create -s "$size" -z "$zone")
+if [ $? != 0 ]; then
+	_fail "cannot create nullb zoned device"
+fi
+dev=$(echo "$out" | tail -n 1)
+name=$(basename "${dev}")
+
+run_check $SUDO_HELPER "$nullb" ls
+
+TEST_DEV="${dev}"
+last_zone_sector=$(( 4 * 31 * 1024 * 1024 / 512 ))
+# Write some data to the last zone
+run_check $SUDO_HELPER dd if=/dev/urandom of="${dev}" bs=1M count=4 seek=$(( 4 * 31 ))
+# Use single as it's supported on more kernels
+run_check $SUDO_HELPER "$TOP/mkfs.btrfs" -f -m single -d single "${dev}"
+# Check if the last zone is empty
+$SUDO_HELPER blkzone report -o ${last_zone_sector} -c 1 "${dev}" | grep -Fq '(em)'
+if [ $? != 0 ]; then
+	_fail "last zone is not empty"
+fi
+
+# Write some data to the last zone
+run_check $SUDO_HELPER dd if=/dev/urandom of="${dev}" bs=1M count=1 seek=$(( 4 * 31 ))
+# Create a FS excluding the last zone
+run_mayfail $SUDO_HELPER "$TOP/mkfs.btrfs" -f -b $(( 4 * 31 ))M -m single -d single "${dev}"
+if [ $? == 0 ]; then
+	_fail "mkfs.btrfs should detect active zone outside of FS range"
+fi
+
+# Fill the last zone to finish it
+run_check $SUDO_HELPER dd if=/dev/urandom of="${dev}" bs=1M count=3 seek=$(( 4 * 31 + 1 ))
+# Create a FS excluding the last zone
+run_mayfail $SUDO_HELPER "$TOP/mkfs.btrfs" -f -b $(( 4 * 31 ))M -m single -d single "${dev}"
+# Check if the last zone is not empty
+$SUDO_HELPER blkzone report -o ${last_zone_sector} -c 1 "${dev}" | grep -Fq '(em)'
+if [ $? == 0 ]; then
+	_fail "last zone is empty"
+fi
+
+run_check $SUDO_HELPER "$nullb" rm "${name}"
--
2.45.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support
  2024-05-14  0:51 [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support Naohiro Aota
                   ` (6 preceding siblings ...)
  2024-05-14  0:51 ` [PATCH 7/7] btrfs-progs: add test " Naohiro Aota
@ 2024-05-14 15:39 ` David Sterba
  2024-05-14 17:14   ` Naohiro Aota
  7 siblings, 1 reply; 11+ messages in thread
From: David Sterba @ 2024-05-14 15:39 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: linux-btrfs

On Mon, May 13, 2024 at 06:51:26PM -0600, Naohiro Aota wrote:
> mkfs.btrfs -b <byte_count> on a zoned device has the following issues:
> 
> - The FS size must be at least the minimum size that can host a btrfs,
>   but that calculation does not consider non-SINGLE profiles
> - The calculation also does not account for the tree-log BG and the data
>   relocation BG
> - It allows creating a FS that is not aligned to the zone boundary
> - It resets all device zones, including those beyond the specified length
> 
> This series fixes these issues and includes some cleanups.
> 
> Patches 1 to 3 are cleanup patches, so they should not change behavior.
> 
> Patches 4 to 6 address the issues and the last patch adds a test case.
> 
> Naohiro Aota (7):
>   btrfs-progs: rename block_count to byte_count
>   btrfs-progs: mkfs: remove duplicated device size check
>   btrfs-progs: mkfs: unify zoned mode minimum size calc into
>     btrfs_min_dev_size()
>   btrfs-progs: mkfs: fix minimum size calculation for zoned
>   btrfs-progs: mkfs: check if byte_count is zone size aligned
>   btrfs-progs: support byte length for zone resetting
>   btrfs-progs: add test for zone resetting

I did a quick CI check and the mkfs tests fail. You can open a pull
request to get your changes tested (it can be just for testing purposes;
if you note that, I'll skip it until the final version).

https://github.com/kdave/btrfs-progs/actions/runs/9081685951

There are also some compatibility build tests on older distros:

https://github.com/kdave/btrfs-progs/actions/runs/9081685969

* Re: [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support
  2024-05-14 15:39 ` [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support David Sterba
@ 2024-05-14 17:14   ` Naohiro Aota
  2024-05-14 17:21     ` David Sterba
  0 siblings, 1 reply; 11+ messages in thread
From: Naohiro Aota @ 2024-05-14 17:14 UTC (permalink / raw)
  To: David Sterba; +Cc: linux-btrfs@vger.kernel.org

On Tue, May 14, 2024 at 05:39:38PM +0200, David Sterba wrote:
> On Mon, May 13, 2024 at 06:51:26PM -0600, Naohiro Aota wrote:
> > mkfs.btrfs -b <byte_count> on a zoned device has several issues listed
> > below.
> > 
> > - The FS size needs to be larger than minimal size that can host a btrfs,
> >   but its calculation does not consider non-SINGLE profile
> > - The calculation also does not ensure tree-log BG and data relocation BG
> > - It allows creating a FS not aligned to the zone boundary
> > - It resets all device zones beyond the specified length
> > 
> > This series fixes the issues with some cleanups.
> > 
> > Patches 1 to 3 are clean up patches, so they should not change the behavior.
> > 
> > Patches 4 to 6 address the issues and the last patch adds a test case.
> > 
> > Naohiro Aota (7):
> >   btrfs-progs: rename block_count to byte_count
> >   btrfs-progs: mkfs: remove duplicated device size check
> >   btrfs-progs: mkfs: unify zoned mode minimum size calc into
> >     btrfs_min_dev_size()
> >   btrfs-progs: mkfs: fix minimum size calculation for zoned
> >   btrfs-progs: mkfs: check if byte_count is zone size aligned
> >   btrfs-progs: support byte length for zone resetting
> >   btrfs-progs: add test for zone resetting
> 
> I did a quick CI check and the mkfs tests fail. You can open a pull
> request to get your changes tested (it can be just for testing
> purposes; if you note that, I'll skip it until the final version).
> 
> https://github.com/kdave/btrfs-progs/actions/runs/9081685951

Thank you. I just noticed that some workflows are running on my btrfs-progs
repository too.

I'm checking the fixed code with these workflows, just in case.

> There are also some compatibility build tests on older distros,
> 
> https://github.com/kdave/btrfs-progs/actions/runs/9081685969

* Re: [PATCH 0/7] btrfs-progs: zoned: proper "mkfs.btrfs -b" support
  2024-05-14 17:14   ` Naohiro Aota
@ 2024-05-14 17:21     ` David Sterba
  0 siblings, 0 replies; 11+ messages in thread
From: David Sterba @ 2024-05-14 17:21 UTC (permalink / raw)
  To: Naohiro Aota; +Cc: David Sterba, linux-btrfs@vger.kernel.org

On Tue, May 14, 2024 at 05:14:39PM +0000, Naohiro Aota wrote:
> On Tue, May 14, 2024 at 05:39:38PM +0200, David Sterba wrote:
> > On Mon, May 13, 2024 at 06:51:26PM -0600, Naohiro Aota wrote:
> > > mkfs.btrfs -b <byte_count> on a zoned device has several issues listed
> > > below.
> > > 
> > > - The FS size needs to be larger than minimal size that can host a btrfs,
> > >   but its calculation does not consider non-SINGLE profile
> > > - The calculation also does not ensure tree-log BG and data relocation BG
> > > - It allows creating a FS not aligned to the zone boundary
> > > - It resets all device zones beyond the specified length
> > > 
> > > This series fixes the issues with some cleanups.
> > > 
> > > Patches 1 to 3 are clean up patches, so they should not change the behavior.
> > > 
> > > Patches 4 to 6 address the issues and the last patch adds a test case.
> > > 
> > > Naohiro Aota (7):
> > >   btrfs-progs: rename block_count to byte_count
> > >   btrfs-progs: mkfs: remove duplicated device size check
> > >   btrfs-progs: mkfs: unify zoned mode minimum size calc into
> > >     btrfs_min_dev_size()
> > >   btrfs-progs: mkfs: fix minimum size calculation for zoned
> > >   btrfs-progs: mkfs: check if byte_count is zone size aligned
> > >   btrfs-progs: support byte length for zone resetting
> > >   btrfs-progs: add test for zone resetting
> > 
> > I did a quick CI check and the mkfs tests fail. You can open a pull
> > request to get your changes tested (it can be just for testing
> > purposes; if you note that, I'll skip it until the final version).
> > 
> > https://github.com/kdave/btrfs-progs/actions/runs/9081685951
> 
> Thank you. I just noticed some workflows are running on my btrfs-progs
> repository too.

Yes, the jobs are matched by branch name and will start if you have
Actions enabled in the repository.
